Skip to content

WIP [main] Implement alerting #769

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from
Draft

Conversation

jbiers
Copy link
Member

@jbiers jbiers commented May 22, 2025

Issue: #672

@jbiers jbiers requested a review from a team as a code owner May 22, 2025 19:56
namespace: {{ .Release.Namespace }}
spec:
groups:
- name: backup-restore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to support interval or limit?

## Define custom alerting rules here.
## The "BackupFailed" alert is included by default when .Values.monitoring.alerts.enabled is set to true and rancher-monitoring is installed.
alertingRules:
- alert: BackupFailed
Copy link
Member

@mallardduck mallardduck May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

Only thing I'm wondering about is, "are there any alerts that BRO should provide by default that cannot be disabled". In other words, are there any features of this new monitoring/alerting area of BRO that (when enabled) are required? I guess this thought goes just as much for these alerting rules as it does for any metrics collection things too - since I could imagine this factor relates more to dashboards than alerts TBH. 🤔


## Define custom alerting rules here.
## The "BackupFailed" alert is included by default when .Values.monitoring.alerts.enabled is set to true and rancher-monitoring is installed.
alertingRules:
Copy link
Contributor

@alexandreLamarre alexandreLamarre May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also suggest putting this default alert rule definition in helm template in something like : templates/default-alerting-rules.yaml behind a template flag like .Values.useDefaultAlerts or something and the current way it works could be behind something like .Values.additionalRules

Edit: if we do put the default alerting rules behind that flag, make sure to pass in some form : .defaultAlerts.AdditionalLabels to the alerting rule, since AlertManager uses label matchers to send stuff to specific remote integrations, so users will want to pass in their own labels

@jbiers jbiers changed the title [main] Implement alerting WIP [main] Implement alerting May 26, 2025
@jbiers jbiers marked this pull request as draft May 26, 2025 19:17
@jbiers jbiers force-pushed the bro-alerting branch 4 times, most recently from cbc9ab3 to 309f039 Compare May 26, 2025 19:53
Copy link
Contributor

@alexandreLamarre alexandreLamarre May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbiers jbiers force-pushed the bro-alerting branch 2 times, most recently from cd097df to 4294661 Compare May 28, 2025 15:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants