Replies: 2 comments
-
You've checked all the certs that K3s manages, and they look good. The apiserver exposes a metric for certificate expiry across all clients that authenticate against it; unfortunately, that metric only indicates how much longer a cert is valid for, not which cert it is. I suspect that's what the alert is firing on, but I can't say for sure since the alert isn't part of K3s itself. You'd need to figure out which metric it's looking at, and then figure out which specific cert is triggering it. It could be a pod or something else.
-
Thanks for the help. There was another cert expiry on 27th Oct on the 4th node, and after restarting the k3s service on that node, the alerts cleared.
-
Environmental Info:
K3s Version: v1.25.4+k3s1
Node(s) CPU architecture, OS, and Version: Linux aokn-nlam-003 5.15.0-5.76.5.1.el9uek.x86_64 #2 SMP Fri Dec 9 18:37:36 PST 2022 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration: 5 servers (no agent is running on any of them)
Describe the bug:
Recently, Grafana started sending alerts saying "A client certificate used to authenticate to kubernetes apiserver (on 10250 port) is expiring in less than 14 days", reporting that a client cert on each of the 5 nodes is expiring soon, which I don't believe is true.
I've checked the certificates on all 5 nodes with the command below, and the result shows they expire next year.
If I understand correctly, /var/lib/rancher/k3s/server/tls/ and /var/lib/rancher/k3s/agent/ are the only paths where certificates are stored, right? It could be that the Grafana query used for detecting cert expiration is not working correctly.
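For anyone wanting to repeat this kind of check, here is a rough sketch (not the command originally used in this post) that lists the remaining validity of certificate files under the two paths mentioned above. It assumes the `openssl` CLI is installed and that the certificates are PEM files ending in `.crt`; the helper names `parse_not_after` and `days_left` are made up for illustration.

```python
import glob
import subprocess
from datetime import datetime, timezone

# Directories named in the question; the *.crt glob is an assumption.
CERT_DIRS = ["/var/lib/rancher/k3s/server/tls", "/var/lib/rancher/k3s/agent"]


def parse_not_after(line):
    """Parse an `openssl x509 -enddate` line, e.g. 'notAfter=Oct 27 10:00:00 2024 GMT'."""
    stamp = line.strip().split("=", 1)[1]
    parsed = datetime.strptime(stamp, "%b %d %H:%M:%S %Y %Z")
    return parsed.replace(tzinfo=timezone.utc)


def days_left(path, now=None):
    """Return whole days until the first certificate in `path` expires."""
    out = subprocess.run(
        ["openssl", "x509", "-enddate", "-noout", "-in", path],
        check=True, capture_output=True, text=True,
    ).stdout
    now = now or datetime.now(timezone.utc)
    return (parse_not_after(out) - now).days


if __name__ == "__main__":
    for cert_dir in CERT_DIRS:
        for path in sorted(glob.glob(f"{cert_dir}/**/*.crt", recursive=True)):
            print(f"{days_left(path):6d} days  {path}")
```

Note that `openssl x509` only reads the first certificate in a file, so a bundle containing a chain would need to be split before every entry is checked.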
Grafana query:
apiserver_client_certificate_expiration_seconds_count{job="kubelet"} > 0 and on (job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="kubelet"}[5m]))) < 1209600
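To make the query easier to reason about: 1209600 is 14 days in seconds, and `histogram_quantile(0.01, ...)` estimates the remaining lifetime below which 1% of recently observed client certificates fall. The sketch below is a simplified approximation of that calculation, not Prometheus's actual implementation, and the bucket bounds and counts are hypothetical. The real query feeds in per-second rates rather than raw counts, but the interpolation works the same way.

```python
import math

FOURTEEN_DAYS = 14 * 24 * 60 * 60  # 1209600 seconds, the alert threshold


def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile lies in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            # Linear interpolation inside the bucket that contains the rank.
            share = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * share
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical data: 1000 requests, 20 of them made with a cert that has
# less than 7 days left; every other request used a cert valid for months.
one_client_expiring = [(604800, 20), (2592000, 20), (31104000, 1000), (math.inf, 1000)]
all_healthy = [(604800, 0), (2592000, 0), (31104000, 1000), (math.inf, 1000)]

for name, data in [("one client expiring", one_client_expiring), ("all healthy", all_healthy)]:
    estimate = histogram_quantile(0.01, data)
    print(name, estimate, "ALERT" if estimate < FOURTEEN_DAYS else "ok")
```

Under these assumed numbers, a single client responsible for 2% of requests is enough to pull the 1% quantile below the threshold, even though every other certificate is healthy. The histogram carries no label identifying which client that was.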
Also, I restarted the k3s service on one of the servers 2 days ago. If the expiry alert were genuine, that node should have dropped out of the alert, but it is still showing there. Could you please advise whether I need to check any other certs for expiration? Also, how do you monitor cert expiry via Grafana?
Steps To Reproduce:
Expected behavior:
Actual behavior:
Additional context / logs: