oneinfra is an open source project to manage and run multiple Kubernetes clusters across different public clouds, private clouds, and bare metal.
The primary pieces of a oneinfra installation are "hypervisors", cluster abstractions, and components. A hypervisor machine must be running a Container Runtime Interface (CRI) implementation. A cluster abstraction represents a Kubernetes cluster, including the control plane and its ingresses. The components - belonging to either the control plane or the control plane ingress - run on top of the hypervisors. The control plane components include the typical Kubernetes master node pieces such as etcd, the API server, and the scheduler, whereas the ingress components include HAProxy and a VPN endpoint. oneinfra can create clusters declaratively, each with its own Kubernetes version, allowing different versions to run at the same time. It is similar to an open source GKE or EKS.
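As a rough, hypothetical illustration of this declarative model, the Go sketch below defines simplified stand-ins for the hypervisor, cluster, and component abstractions and declares two clusters with different Kubernetes versions. The type and field names are invented for illustration and do not mirror oneinfra's actual API:

```go
package main

import "fmt"

// Hypothetical, simplified types modelling the three abstractions described
// above; illustrative only, not the project's actual API.

// Hypervisor is a CRI-enabled machine where components are placed.
type Hypervisor struct {
	Name   string
	Public bool // public hypervisors host ingress components, private ones host control planes
}

// Component is a single control plane or ingress piece (etcd, API server,
// scheduler, HAProxy, VPN endpoint) scheduled onto a hypervisor.
type Component struct {
	Name       string
	Hypervisor string
}

// Cluster is the declarative description of a managed Kubernetes cluster.
type Cluster struct {
	Name              string
	KubernetesVersion string
	Components        []Component
}

func main() {
	// Two clusters declared side by side with different Kubernetes versions,
	// reflecting the "declarative, multiple versions at once" model.
	clusters := []Cluster{
		{Name: "team-a", KubernetesVersion: "1.17.4"},
		{Name: "team-b", KubernetesVersion: "1.18.0"},
	}
	for _, c := range clusters {
		fmt.Printf("cluster %q wants Kubernetes %s\n", c.Name, c.KubernetesVersion)
	}
}
```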
InfoQ got in touch with Rafael Fernández López, software architect and author of oneinfra, to find out more about this project.
According to López, the main gap that oneinfra fills is "to provide a very simple system to set up, that allows you to create and destroy isolated Kubernetes control planes at will, without the need of creating dedicated infrastructure for them". oneinfra can use underlying infrastructure and machines from various cloud providers, including bare metal instances, to create the control plane instances. López explains some best practices around this:
You can use different cloud providers for creating your control plane instances. However, there are operational challenges when it comes to splitting a single control plane across different public clouds or service providers, and so the recommendation is to place all components for a control plane on the same service provider, but nothing stops you from being able to create different control planes on different service providers.
Image courtesy of https://github.com/oneinfra/oneinfra (used with permission)
A hypervisor in oneinfra parlance is a "physical or virtual machine where oneinfra will create the control plane components", and it must be running a CRI implementation. A hypervisor can be "public" and run the ingress components, or "private" and run the control plane components. A service wrapper over the CRI implementation is required on a hypervisor node so it can connect to oneinfra. López explains that this process will become easier in future versions:
It is part of the roadmap to ease the way you create new hypervisors -- something like a `oi hypervisor join` command will be added, akin to the current `oi node join` command. The latter talks to a managed cluster in order to join it, whereas the former will talk to the management cluster and join as an hypervisor.
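The article does not detail how that wrapper talks to the container runtime, but the requirement that a hypervisor run a CRI implementation can be sketched with the standard CRI gRPC API. The following Go example, which is not oneinfra code, simply probes a CRI endpoint and prints the runtime version; it assumes containerd's default socket path and the CRI v1 API:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
	// Assumed socket path for containerd; CRI-O and others expose different paths.
	const criSocket = "unix:///run/containerd/containerd.sock"

	conn, err := grpc.Dial(criSocket, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dialing CRI endpoint: %v", err)
	}
	defer conn.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Version is the simplest CRI RuntimeService call; a successful response
	// confirms a CRI implementation is up and speaking the expected API.
	client := runtimeapi.NewRuntimeServiceClient(conn)
	resp, err := client.Version(ctx, &runtimeapi.VersionRequest{})
	if err != nil {
		log.Fatalf("querying CRI runtime version: %v", err)
	}
	fmt.Printf("runtime %s %s (CRI API %s)\n", resp.RuntimeName, resp.RuntimeVersion, resp.RuntimeApiVersion)
}
```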
The system has a "reconciler" module, a set of controllers that drives the system toward the desired state. It schedules control plane components on hypervisors, creates components that are defined but missing, and removes components whose definitions the user has deleted. The reconciler also handles worker node join requests against managed clusters and ensures that RBAC rules are correctly set up.
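As a generic sketch of that reconciliation idea, not oneinfra's actual controllers, the following Go snippet diffs a declared set of control plane components against an observed set to decide what to create and what to delete; the component names are made up for the example:

```go
package main

import "fmt"

// componentSet is a simple membership set of component names.
type componentSet map[string]bool

// reconcile compares the declared (desired) components with what is actually
// running (observed): anything defined but missing is created, anything
// running but no longer declared is deleted.
func reconcile(desired, observed componentSet) (toCreate, toDelete []string) {
	for name := range desired {
		if !observed[name] {
			toCreate = append(toCreate, name) // defined but missing
		}
	}
	for name := range observed {
		if !desired[name] {
			toDelete = append(toDelete, name) // running but no longer declared
		}
	}
	return toCreate, toDelete
}

func main() {
	desired := componentSet{"etcd-0": true, "apiserver-0": true, "scheduler-0": true}
	observed := componentSet{"etcd-0": true, "old-apiserver": true}

	create, del := reconcile(desired, observed)
	fmt.Println("create:", create) // apiserver-0, scheduler-0 (map order may vary)
	fmt.Println("delete:", del)    // old-apiserver
}
```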
The current architecture keeps each control plane instance isolated, so Kubernetes master node software like etcd cannot be shared across clusters or replaced with another persistence layer. Performance aspects, such as benchmarking how many control planes can fit on one hypervisor, also need to be worked out, says López. Another future improvement is the ability for worker nodes to be on heterogeneous networks.
The oneinfra source code is available on GitHub.