K0smotron controller freezes for some time before continue reconciling #680

Closed
eromanova opened this issue Aug 13, 2024 · 1 comment · Fixed by #783
@eromanova

Environment

K0smotron 1.0.2
Managed cluster with k0s as the bootstrap provider and AWS as the infrastructure provider, using Cluster API (simple setup for testing: 1 control plane, 1 worker)

Issue definition

Sometimes the k0smotron controller hangs for a while (usually after an error) and does not reconcile resources when it should.
For example, the k0smotron controller failed to reconcile the dynamic config at 07:49, and the next log message appeared only at 07:59 (a 10-minute standstill). Meanwhile, the dynamic config became ready almost immediately after the first failure.

As a result, cluster deployment takes longer than it should. The k0smotron controller should retry earlier.

See logs:

2024-07-18T07:49:42Z  ERROR Failed to reconcile dynamic config  {"controller": "k0scontrollerconfig", "controllerGroup": "bootstrap.cluster.x-k8s.io", "controllerKind": "K0sControllerConfig", "K0sControllerConfig": {"name":"ekaz-dev-cp-0","namespace":"hmc-system"}, "namespace": "hmc-system", "name": "ekaz-dev-cp-0", "reconcileID": "e65fee16-0c4c-402b-967f-e62238688930", "K0sControllerConfig": {"name":"ekaz-dev-cp-0","namespace":"hmc-system"}, "kind": "Machine", "version": "17838", "name": "ekaz-dev-cp-0", "error": "failed to reconcile dynamic config, kubeconfig may not be available yet: failed to patch k0s config: failed to get API group resources: unable to retrieve the complete list of server APIs: k0s.k0sproject.io/v1beta1: client rate limiter Wait returned an error: context deadline exceeded - error from a previous attempt: EOF"}
github.com/k0sproject/k0smotron/internal/controller/bootstrap.(*ControlPlaneController).Reconcile
  /workspace/internal/controller/bootstrap/controlplane_bootstrap_controller.go:165
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
  /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
  /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
  /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
  /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227
2024-07-18T07:59:37Z  INFO  Reconciling K0sConfig {"controller": "k0sworkerconfig", "controllerGroup": "bootstrap.cluster.x-k8s.io", "controllerKind": "K0sWorkerConfig", "K0sWorkerConfig": {"name":"ekaz-dev-md-77z9b-6rpc2","namespace":"hmc-system"}, "namespace": "hmc-system", "name": "ekaz-dev-md-77z9b-6rpc2", "reconcileID": "3db68c05-836d-4dd2-b678-e0a4da4d4351", "k0sconfig": {"name":"ekaz-dev-md-77z9b-6rpc2","namespace":"hmc-system"}}
@Schnitzel
Contributor

Discussed in the internal k0smotron office hours:
this might be resolved in newer versions of k0smotron; let's test again and provide feedback.
