Prometheus: ingress_controller_configuration_push_count should tell between intermittent errors and lock-ups #2484
Comments
I changed the milestone back to KIC v2.7.0 because the functional changes made it into this release.
The Grafana dashboard has been updated in the KIC repository (https://github.com/Kong/kubernetes-ingress-controller/blob/main/grafana.json), but I'm waiting for Grafana (the company) to respond to my question about problems with the dashboard's visibility in their catalog.
So, what's left before we can close this issue? We should close the KIC 2.7 milestone soon.
We're still missing a review on Kong/docs.konghq.com#4500. The dashboard JSON in our repository has been updated, so I'd consider its acceptance criteria complete. As for the dashboard on grafana.com, I think we can handle it separately; I've created #2991 to track it.
Problem Statement
@seh reported that they'd like to use the ingress_controller_configuration_push_count Prometheus metric to alert when a configuration lock-up (see #2195) happens, but today that metric does not distinguish transient errors (e.g. a network disconnect) from errors that require fixing a config conflict (e.g. conflicting consumers, #2324, #680).
Proposed Solution
Add a label to the ingress_controller_configuration_push_count metric that distinguishes failures that require fixing a conflict from failures that do not.
Additional information
Note that the existing metric ingress_controller_translation_count[success=true|false] may already answer this question.
Acceptance Criteria
ingress_controller_translation_count[success=true|false], where true counts successfully pushed configs and false counts failed pushes
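To make the proposed label concrete, here is a minimal sketch in plain Python (not KIC's actual Go code and not a real Prometheus client). It models the push counter as a series keyed by label values and shows how an extra label lets an alert fire only on failures that need a config fix. The label name `failure_reason`, the values `conflict` and `transient`, and both helper functions are assumptions made for illustration; the issue does not specify them.

```python
from collections import Counter

# Hypothetical label values; the issue does not name them.
REASON_NONE = ""
REASON_CONFLICT = "conflict"    # needs an operator to fix the config
REASON_TRANSIENT = "transient"  # e.g. a network disconnect, expected to recover

# Stand-in for the counter metric, keyed by (success, failure_reason) label values.
push_count = Counter()


def record_push(success: bool, failure_reason: str = REASON_NONE) -> None:
    """Increment the series that matches one configuration push attempt."""
    if success:
        failure_reason = REASON_NONE  # successful pushes carry no failure reason
    push_count[(str(success).lower(), failure_reason)] += 1


def needs_operator_fix(before: Counter, after: Counter) -> bool:
    """Return True if conflict failures grew between two snapshots."""
    key = ("false", REASON_CONFLICT)
    return after[key] > before[key]


snapshot = push_count.copy()

record_push(True)
record_push(False, REASON_TRANSIENT)
transient_only = needs_operator_fix(snapshot, push_count)

record_push(False, REASON_CONFLICT)
with_conflict = needs_operator_fix(snapshot, push_count)
```

With a single success label, both failed pushes above would land in the same series; the extra label is what lets an alert rule ignore the transient failure and react only to the conflict.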