Seeking Advice on Setting Up a High Availability K0s and K0smotron Lab #661

tdesaules · 2024-08-01T13:36:51Z

Hello community,
I want to create a small lab to test k0s and K0smotron. I will use several SBCs and aim to build the following setup:

Management Cluster

1 k0s management cluster with:
• 3 k0s nodes as “controller+worker”

Child Clusters

Then I will deploy k0smotron with 2 small child clusters:
• 4 k0s nodes as “worker”
• 2 load balancer nodes (to simulate a dedicated load balancer)

Architecture Questions

I have some architectural questions regarding the high availability of the control plane on the k0s management cluster and on the control plane of the child clusters managed via k0smotron.

On the management cluster, I do not have an external load balancer, so I wanted to try using the “Node-local load balancing” (NLLB) and “Control plane load balancing” (CPLB) features.

Is this resilient on “controller+worker” nodes?

Next, I wonder how this works for the control plane of the child clusters managed by k0smotron within the management cluster with the previous setup.

Should this work? I understand that the workloads of the management cluster hosting the k0s control planes of the child clusters should then behave like any other workload in such a configuration, and the control plane of the management cluster would be resilient regarding the child cluster control plane.

Then, how should I address the high availability of the control planes of the child clusters? I have looked at this documentation: k0smotron HA, where it is recommended to use Kine. In a recent Medium post, "k0smotron is growing up", I read:

External etcd Support for Control Plane HA
To enhance high availability (HA) for hosted control planes (running a k0s control plane in pods), k0smotron 1.0 will now deploy etcd in a separate pod (and statefulset) from the hosted control plane component. Previously, running a highly available hosted control plane (i.e., multiple containerized controllers, deployed to different fault domains) was challenging, due to potential split-brain scenarios as etcd (effectively part of each control plane) got scaled up and down. With the new update, etcd is managed independently of other HCP components (in a separate set of pods), letting it scale independently. etcd can also be snapshotted and restored, enabling robust full cluster upgrades and restoration of state.

I would like to use etcd initially and it seems to be supported now. I understand that it would then be possible to have a statefulset with the control planes of the child clusters, probably deployed in the same namespace. Correct?
How is the HA/load balancing part managed? I am concerned about having a problem without an external load balancer at my management cluster level.

I would appreciate any other recommendations you have.

jnummelin · 2024-08-09T09:07:15Z

Is this resilient on “controller+worker” nodes?

Yes, combining NLLB and CPLB on the mgmt cluster makes them HA from a connectivity point of view.
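
For anyone following along, a minimal sketch of what enabling both features in the k0s ClusterConfig might look like. The virtual IP 192.168.1.100/24 and the authPass value are placeholders I made up for illustration (they are not from this thread); pick an unused address on the controllers' subnet and check the field names against the k0s docs for your version.

```yaml
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  api:
    sans:
      # Placeholder VIP, so the API server certificate is valid for it
      - 192.168.1.100
  network:
    # NLLB: each worker gets a local proxy that balances across all controllers
    nodeLocalLoadBalancing:
      enabled: true
      type: EnvoyProxy
    # CPLB: Keepalived holds a virtual IP that floats between the controllers
    controlPlaneLoadBalancing:
      enabled: true
      type: Keepalived
      keepalived:
        vrrpInstances:
          - virtualIPs: ["192.168.1.100/24"]  # placeholder VIP with netmask
            authPass: Example1                # placeholder, 8 characters or fewer
        virtualServers:
          - ipAddress: "192.168.1.100"        # same placeholder VIP, no netmask
```

Roughly, NLLB covers traffic from the nodes to the API server, while the CPLB virtual IP gives clients outside the cluster a single stable address.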

Next, I wonder how this works for the control plane of the child clusters managed by k0smotron within the management cluster with the previous setup.

k0smotron child CPs are just workloads in the mgmt cluster. The child CPs do not really need any access to the mgmt cluster CP. Therefore the HA setup of the mgmt cluster does not really affect the child CPs directly. Of course, if the mgmt CP goes 💥 you will lose controllability for the child CPs. I always say that child CPs are really just pods: they behave exactly the same as any other workload, and thus all the same HA patterns apply.

I would like to use etcd initially and it seems to be supported now. I understand that it would then be possible to have a statefulset with the control planes of the child clusters, probably deployed in the same namespace. Correct?

The statefulset(s) of child CPs are deployed in the same namespace you create the K0smotronControlPlane objects in. The typical model is that each child cluster has its own namespace. This is especially handy when used with ClusterAPI, as all the CAPI objects live in the same namespace too.
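
As an illustration of the namespace-per-child-cluster model, here is a rough sketch using the standalone k0smotron Cluster resource (without ClusterAPI). The name child-a, the replica count, and the etcd volume size are hypothetical values chosen for the example, and the exact spec fields should be verified against the CRD reference for the k0smotron version you install.

```yaml
# One namespace per child cluster; "child-a" is a made-up name
apiVersion: v1
kind: Namespace
metadata:
  name: child-a
---
apiVersion: k0smotron.io/v1beta1
kind: Cluster
metadata:
  name: child-a
  namespace: child-a        # the control plane statefulset(s) land in this namespace
spec:
  replicas: 3               # assumed: three control plane pods for HA
  service:
    type: LoadBalancer      # exposes the child API server outside the mgmt cluster
  etcd:
    persistence:
      size: 1Gi             # assumed volume size for the separately managed etcd
```

With ClusterAPI the equivalent settings would go on the K0smotronControlPlane object instead, created in the same namespace as the other CAPI objects.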

How is the HA/load balancing part managed? I am concerned about having a problem without an external load balancer at my management cluster level.

In this case the LB question has two parts: one for the mgmt cluster and one for the child clusters. As we covered, LB for the mgmt cluster can be done using the CPLB feature of k0s. For child clusters, k0smotron creates a LoadBalancer-type service to expose the control plane outside the mgmt cluster. Thus in this case you'd need some operator in the mgmt cluster that can expose these services. Since you're planning to use SBCs, which is a kind of small on-premises DC, I'd suggest looking at things like MetalLB for that functionality.
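
If MetalLB is the choice, a minimal layer 2 setup could look something like the sketch below. It assumes MetalLB is already installed in the metallb-system namespace; the pool name, advertisement name, and address range are placeholders and need to be replaced with free addresses on the lab network.

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lab-pool                    # hypothetical name
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250   # placeholder range of unused LAN addresses
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lab-l2                      # hypothetical name
  namespace: metallb-system
spec:
  ipAddressPools:
    - lab-pool                      # announce addresses from the pool above
```

Each child control plane's LoadBalancer service would then receive an address from that pool, which the child cluster workers can use to reach their API server.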

tdesaules · 2024-08-09T09:32:20Z

Thank you for the response.

It’s pretty much what I was thinking.

Now, I need to experiment with it to get a better understanding.
I want to try using Cilium as the CNI and replace the kube-router and kube-proxy components. I’ll also take a look at the LoadBalancer aspect to avoid using MetalLB.
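
For reference, my understanding is that bringing your own CNI in k0s means switching the network provider to custom and disabling kube-proxy in the ClusterConfig, roughly as sketched here. Cilium itself would then be installed separately (for example with its Helm chart) with its kube-proxy replacement enabled; that part is not shown and depends on the Cilium version.

```yaml
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    provider: custom     # k0s does not deploy kube-router; a CNI must be installed manually
    kubeProxy:
      disabled: true     # leave service handling to the CNI's kube-proxy replacement
```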

I think you can close this for now.
I’ll add more information as I go along.

jnummelin added the "question" label (Further information is requested) on Aug 9, 2024
