Loki Alerts are fired for each principal application instead of only once #159
Comments
Multiple principals are not yet supported this way. Juju supports deploying multiple principals to the same machine (--to), but if we relate them to the same subordinate, or even if we deploy the same subordinate under two different names, the snaps would overwrite or uninstall each other, and the same goes for the grafana agent config file. Also, we're not interested in parallel installs, because we want only one instance of grafana agent. In the current juju model for subordinates, we would need separate charms doing "delta charming" on the shared grafana-agent.yaml, and we would also need to figure out from the config file whether to install or uninstall the snap.
Btw, does each principal come with its own rules? If not, then relating gagent to just one of the principals would be enough to get the node-exporter stuff.
I think we are misunderstanding, the principals are all on different machines.
Yes, they have the same rules. But they all have to be integrated with grafana-agent. E.g. we have three charm deployments A, B, C, and they are all integrated with grafana-agent X. If we only integrated A with grafana-agent X, then the logs and metrics for B and C would be missing.
Could you give us a screenshot of Alertmanager showing the alerts firing with the labels expanded, and a screenshot of Prometheus showing the results of querying that metric name?
Hi @lucabello, sure, here are the screenshots. The first three screenshots show the alert notifications in the Mattermost channel: You see that the alert is reported for the [...]. The log (which comes from Loki) is only issued once by the same grafana-agent, and the log contains the grafana-agent as [...]. So there is a drift between the value of [...]
Bug Description
We have the following alert rule: https://github.com/canonical/github-runner-operator/blob/b70a5353deb280738339f5878e8fa57c45c3cc78/src/loki_alert_rules/failure.rules#L4-L11 to detect runner crashes for a particular self-hosted runner deployment.
This rule gets topology labels injected by the grafana agent.
The Grafana agent is integrated with 3 main applications. Once an alert is triggered, it is duplicated to all 3 main applications (although it only applies to one). The alerts all contain juju_application=principal-charm instead of juju_application=grafana-agent (as it is displayed in grafana and loki).
It appears that the principal application name is used for the juju_application label in the fired alert instead of the subordinate one. Using the subordinate's name would lead to deduplication, because the same alert would be fired each time.
To Reproduce
Create a simple alert rule for a charm. Deploy the charm three times, and integrate it with the grafana agent, which in turn should be integrated with loki, and loki with the alertmanager. Note that the alert is triggered for each main application, not just once for the grafana agent.
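To illustrate why the juju_application value matters here, below is a minimal sketch that treats an alert's identity as its full label set, which is how alerts are told apart for deduplication. It is not the grafana-agent or Alertmanager implementation. The alert name RunnerCrashed and the principal names runner-a, runner-b, runner-c are hypothetical placeholders, not values from this deployment.

```python
def fingerprint(labels: dict[str, str]) -> frozenset:
    """Identity of an alert in this sketch: its complete label set."""
    return frozenset(labels.items())


def distinct_alerts(alerts: list[dict[str, str]]) -> set[frozenset]:
    """Collapse alerts whose label sets are identical."""
    return {fingerprint(alert) for alert in alerts}


# Hypothetical principal application names (assumption, for illustration only).
principals = ["runner-a", "runner-b", "runner-c"]

# Observed behaviour: the rule carries the principal's name, once per relation.
per_principal = [
    {"alertname": "RunnerCrashed", "juju_application": name}
    for name in principals
]

# Expected behaviour: the rule carries the subordinate's name every time.
per_subordinate = [
    {"alertname": "RunnerCrashed", "juju_application": "grafana-agent"}
    for _ in principals
]

print(len(distinct_alerts(per_principal)))    # 3 distinct alerts
print(len(distinct_alerts(per_subordinate)))  # 1 alert after deduplication
```

With per-principal labels the three label sets differ, so three notifications go out. With the subordinate's name they are identical and collapse into one.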
Environment
juju 3.1.8, grafana-agent rev 164
Relevant log output
Additional context
No response