Same alert rules are created for LXDs and metals #32

Open
facundofc opened this issue Nov 10, 2023 · 4 comments

@facundofc

Bug Description

If the same grafana-agent application is related to two principal applications coexisting on the same physical host, one deployed directly on the metal and one deployed in an LXD container, the same set of alert rules is created in Prometheus for both. For some (possibly all) rules this is undesirable, because the underlying metrics are exactly the same for the metal unit and for the LXD unit. One example of such a metric is node_cpu_seconds_total.

As a result, several alerts fire for the exact same underlying issue, for example an overloaded host.
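
To illustrate the duplication, here is a minimal Python sketch, not the actual Prometheus rule evaluation. The label names, label values, the `cpu_busy_ratio` field, the threshold, and the alert name `HostCpuOverloaded` are all hypothetical; the only thing taken from the report is that both series carry the same host-level values.

```python
# Hypothetical samples: the same host CPU load reported through two units
# of one grafana-agent application. All names and values are assumptions.
samples = [
    {"labels": {"juju_application": "ubuntu", "juju_unit": "ubuntu/0"},
     "cpu_busy_ratio": 0.97},
    {"labels": {"juju_application": "ubuntu-lxd", "juju_unit": "ubuntu-lxd/0"},
     "cpu_busy_ratio": 0.97},
]


def evaluate_rule(series, threshold=0.9):
    """Return one alert per series whose value exceeds the threshold."""
    return [
        {"alertname": "HostCpuOverloaded", **s["labels"]}
        for s in series
        if s["cpu_busy_ratio"] > threshold
    ]


alerts = evaluate_rule(samples)
for alert in alerts:
    print(alert)
```

Because the rule is evaluated once per series and the two series differ only in their labels, a single overloaded host yields two alerts.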

To Reproduce

juju deploy ubuntu
juju deploy ubuntu ubuntu-lxd --to lxd:0
juju deploy grafana-agent
juju relate ubuntu grafana-agent
juju relate ubuntu-lxd grafana-agent

Environment

This was observed in latest/edge, revision 16.

Relevant log output

n/a

Additional context

No response

@przemeklal
Member

One possible workaround is to silence them permanently, for example with these matchers:

job=~".*grafana-agent-container.*"
alertname!~"HostInterfaceMTUSize"

In the above example, only HostInterfaceMTUSize alerts will still fire for that job; everything else coming from the grafana-agent deployed as grafana-agent-container will be silenced.
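
The effect of these two matchers can be checked with a small Python sketch. It only approximates how silence matchers are evaluated: all matchers must match, regex matchers are anchored to the whole label value, and a missing label is treated as an empty string. Python's `re` syntax is used in place of the real regex engine, and the `job` values in the usage note below are hypothetical.

```python
import re

# The two matchers from the comment above, as (label, operator, pattern).
SILENCE_MATCHERS = [
    ("job", "=~", ".*grafana-agent-container.*"),
    ("alertname", "!~", "HostInterfaceMTUSize"),
]


def matches(labels, matcher):
    """Evaluate a single matcher against an alert's label set."""
    name, op, pattern = matcher
    value = labels.get(name, "")  # missing label behaves like an empty value
    if op == "=":
        return value == pattern
    if op == "!=":
        return value != pattern
    # Regex matchers must match the entire label value.
    hit = re.fullmatch(pattern, value) is not None
    if op == "=~":
        return hit
    if op == "!~":
        return not hit
    raise ValueError(f"unknown operator: {op}")


def is_silenced(labels, matchers=SILENCE_MATCHERS):
    """An alert is silenced only if every matcher matches its labels."""
    return all(matches(labels, m) for m in matchers)
```

For example, with a hypothetical job name `grafana-agent-container_node_exporter`, `is_silenced({"job": "grafana-agent-container_node_exporter", "alertname": "HostOutOfMemory"})` is true, while the same job with `alertname` set to `HostInterfaceMTUSize` is not silenced, and neither is any alert whose job does not contain `grafana-agent-container`.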

@lucabello
Contributor

lucabello commented Jan 4, 2024

@dstathis want to take a look? Is this related to recent work?

(sidenote: this was observed in rev16; we are currently at rev29)

@dstathis
Contributor

One possible way to solve this could be to add a config option that disables node_exporter metrics. It would require you to deploy two different grafana-agent applications, one for lxd and one for the guest machines. Would that work for you?

@przemeklal
Member

@dstathis This could work, as we usually deploy two or more different grafana-agent applications already (for example, so that hardware-observer is related only to the one running on physical machines).

4 participants