Describe the bug
When under load, the webhook pods restart, with the kubelet reporting:
Warning Unhealthy 23m (x2 over 26m) kubelet Readiness probe failed: Get "http://10.160.7.232:9440/readyz": dial tcp 10.160.7.232:9440: connect: connection refused
Warning Unhealthy 14m (x6 over 26m) kubelet Readiness probe failed: Get "http://10.160.7.232:9440/readyz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning BackOff 14m (x10 over 26m) kubelet Back-off restarting failed container manager in pod azure-wi-webhook-controller-manager-586dc676d-scgpn_kube-system(0e0b518e-530c-43cc-999b-303672b7628d)
There are no issues with only a few pods, or when pods are created with some waiting time in between.
Steps To Reproduce
Create 10 or more pods at the same time (a reproduction sketch follows below).
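For concreteness, a minimal client-go sketch of such a burst of pod creations. The pod names, namespace, replica count, and pause image are illustrative, and the azure.workload.identity/use label is an assumption about what routes the pods through the mutating webhook.

package main

import (
	"context"
	"fmt"
	"sync"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Create 20 pods concurrently so the webhook receives a burst of
	// admission requests at once.
	var wg sync.WaitGroup
	for i := 0; i < 20; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			pod := &corev1.Pod{
				ObjectMeta: metav1.ObjectMeta{
					Name: fmt.Sprintf("load-test-%d", i), // hypothetical name
					// Assumption: this label is what makes the azure-wi
					// webhook mutate the pod.
					Labels: map[string]string{"azure.workload.identity/use": "true"},
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "pause",
						Image: "registry.k8s.io/pause:3.9",
					}},
				},
			}
			if _, err := client.CoreV1().Pods("default").Create(context.Background(), pod, metav1.CreateOptions{}); err != nil {
				fmt.Printf("pod %d: %v\n", i, err)
			}
		}(i)
	}
	wg.Wait()
}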
Expected behavior
Webhook pods do not restart due to probe failures.
Logs
I was only able to get logs from a container once:
{"level":"debug","timestamp":"2023-10-29T10:58:28.992365Z","logger":"controller-runtime.webhook.webhooks","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/webhook/admission/http.go:96$admission.(*Webhook).ServeHTTP","message":"received request","webhook":"/mutate-v1-pod","UID":"c6e290a4-f532-425a-b0dd-455063c027c8","kind":"/v1, Kind=Pod","resource":{"group":"","version":"v1","resource":"pods"}}
{"level":"debug","timestamp":"2023-10-29T10:58:48.902144Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:60$healthz.(*Handler).serveAggregated","message":"healthz check failed","checker":"readyz","error":"webhook server is not reachable: context deadline exceeded"}
{"level":"info","timestamp":"2023-10-29T10:59:05.496663Z","caller":"/usr/local/go/src/log/log.go:194$log.(*Logger).Output","message":"http: TLS handshake error from 127.0.0.1:54936: EOF"}
{"level":"debug","timestamp":"2023-10-29T10:59:07.900988Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:60$healthz.(*Handler).serveAggregated","message":"healthz check failed","checker":"readyz","error":"webhook server is not reachable: context deadline exceeded"}
{"level":"info","timestamp":"2023-10-29T10:59:01.001668Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:128$healthz.writeStatusesAsText","message":"healthz check failed","statuses":[{}]}
{"level":"info","timestamp":"2023-10-29T10:59:26.902616Z","caller":"/usr/local/go/src/log/log.go:194$log.(*Logger).Output","message":"http: TLS handshake error from 127.0.0.1:54342: read tcp 127.0.0.1:9443->127.0.0.1:54342: i/o timeout"}
{"level":"debug","timestamp":"2023-10-29T10:59:27.003707Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:60$healthz.(*Handler).serveAggregated","message":"healthz check failed","checker":"readyz","error":"webhook server is not reachable: dial tcp :9443: i/o timeout"}
{"level":"info","timestamp":"2023-10-29T10:59:31.905952Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:128$healthz.writeStatusesAsText","message":"healthz check failed","statuses":[{}]}
{"level":"debug","timestamp":"2023-10-29T10:59:38.999699Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:60$healthz.(*Handler).serveAggregated","message":"healthz check failed","checker":"readyz","error":"webhook server is not reachable: dial tcp :9443: i/o timeout"}
{"level":"info","timestamp":"2023-10-29T10:59:47.198457Z","caller":"/usr/local/go/src/log/log.go:194$log.(*Logger).Output","message":"http: TLS handshake error from 127.0.0.1:44212: EOF"}
{"level":"info","timestamp":"2023-10-29T10:59:57.994511Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:128$healthz.writeStatusesAsText","message":"healthz check failed","statuses":[{}]}
{"level":"info","timestamp":"2023-10-29T10:59:49.392236Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:128$healthz.writeStatusesAsText","message":"healthz check failed","statuses":[{}]}
{"level":"debug","timestamp":"2023-10-29T11:00:11.911428Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:60$healthz.(*Handler).serveAggregated","message":"healthz check failed","checker":"readyz","error":"webhook server is not reachable: context deadline exceeded"}
{"level":"info","timestamp":"2023-10-29T11:00:16.896439Z","caller":"/usr/local/go/src/log/log.go:194$log.(*Logger).Output","message":"http: TLS handshake error from 127.0.0.1:49882: EOF"}
{"level":"debug","timestamp":"2023-10-29T11:00:29.811357Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:60$healthz.(*Handler).serveAggregated","message":"healthz check failed","checker":"readyz","error":"webhook server is not reachable: dial tcp :9443: i/o timeout"}
{"level":"info","timestamp":"2023-10-29T11:00:38.304028Z","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:581$manager.(*controllerManager).engageStopProcedure.func3","message":"Stopping and waiting for non leader election runnables"}
{"level":"info","timestamp":"2023-10-29T11:00:34.904461Z","logger":"controller-runtime.healthz","caller":"/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/healthz/healthz.go:128$healthz.writeStatusesAsText","message":"healthz check failed","statuses":[{}]}
Environment
Kubernetes version (use kubectl version): v1.27.3
Cloud provider or hardware configuration: AKS