Pre commit fixes
Signed-off-by: Lalith Kota <[email protected]>
lalithkota committed Dec 16, 2024
1 parent 7eef49a commit 53adf66
Showing 9 changed files with 13 additions and 18 deletions.
14 changes: 6 additions & 8 deletions alerting/README.md
@@ -9,28 +9,26 @@ Prometheus, Grafana, and alert manager tools are needed to set up alert notifica

## Slack alerts notification

-- Creating slack incoming webhook [here](https://api.slack.com/messaging/webhooks),
-- update ``slack_api_url`` and ``channel`` in ``alertmanager.yaml``
+- Creating slack incoming webhook [here](https://api.slack.com/messaging/webhooks),
+- update ``slack_api_url`` and ``channel`` in ``alertmanager.yaml``
- run ``./install.sh `` to patch alertmanager.

## Creating custom alerts

-The monitoring package provided by rancher has various default alerting rules, most of the time the default rules are enough. Sample custom alerts are provided under ``custom-alerts``. Modify the same and apply using ``kubectl``
+The monitoring package provided by rancher has various default alerting rules, most of the time the default rules are enough. Sample custom alerts are provided under ``custom-alerts``. Modify the same and apply using ``kubectl``

## Silence/Mute alerts

-- Go to alertmanager under monitoring tab in rancher ui
-- Click on the alert -> silence, add appropriate silence duration, creator, and comment
+- Go to alertmanager under monitoring tab in rancher ui
+- Click on the alert -> silence, add appropriate silence duration, creator, and comment

![](_img/mute-alerts.png)

## Add cluster name to the alert

-When having multiple clusters, you can add a cluster name to be presented as part of alert information. Here our cluster name is soil.
+When having multiple clusters, you can add a cluster name to be presented as part of alert information. Here our cluster name is soil.

![](_img/sample-notification.png)

- Add cluster name in, ``patch-cluster-name.yaml``
- Run ``install.sh``


2 changes: 1 addition & 1 deletion alerting/alertmanager.yaml
@@ -27,7 +27,7 @@ route:
alertname: KubePersistentVolumeFilling-greater-than-90%
receiver: 'slack'
- match:
-alertname: AggregatedAPIDown
+alertname: AggregatedAPIDown
receiver: 'null'
- match:
alertname: KubeClientErrors
1 change: 0 additions & 1 deletion alerting/custom-alerts/kubernetes-node-not-ready.yaml
@@ -19,4 +19,3 @@ spec:
for: 5m
labels:
severity: critical

2 changes: 1 addition & 1 deletion alerting/custom-alerts/node-avarage-cpu-load-high.yaml
@@ -10,7 +10,7 @@ spec:
- alert: NodeAverageCPULoadHigh
annotations:
description: >-
-Node {{ $labels.instance }} has a 5-minute load average
+Node {{ $labels.instance }} has a 5-minute load average
higher than the number of CPUs for more than 5 minutes.
runbook_url: >-
https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodeaveragecpuloadhigh
3 changes: 1 addition & 2 deletions alerting/custom-alerts/node-disk-usage-high.yaml
@@ -10,7 +10,7 @@ spec:
- alert: NodeStorageUsageHigh
annotations:
description: >-
-Node {{ $labels.instance }} storage usage is above 90%.
+Node {{ $labels.instance }} storage usage is above 90%.
Current usage: {{ $value | humanizePercentage }}.
runbook_url: >-
https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodestorageusagehigh
@@ -22,4 +22,3 @@ spec:
for: 30s
labels:
severity: critical

1 change: 0 additions & 1 deletion alerting/custom-alerts/node-memory-usage-high.yaml
@@ -20,4 +20,3 @@ spec:
for: 300s
labels:
severity: critical

2 changes: 1 addition & 1 deletion alerting/custom-alerts/persistent-volume-usage-high.yaml
@@ -13,7 +13,7 @@ spec:
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefillingup
summary: PersistentVolume filled up to 90% of the storage.
expr: >-
-kubelet_volume_stats_used_bytes{job="kubelet",metrics_path="/metrics",namespace=~".*"} /
+kubelet_volume_stats_used_bytes{job="kubelet",metrics_path="/metrics",namespace=~".*"} /
kubelet_volume_stats_capacity_bytes{job="kubelet",metrics_path="/metrics",namespace=~".*"} > 0.90
for: 300s
labels:
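
Reviewer note: the `expr` in this hunk compares used bytes divided by capacity bytes against 0.90, and `for: 300s` means the comparison has to stay true for five minutes before the alert fires. A minimal sketch of the same threshold arithmetic outside Prometheus follows; the function name `pv_alert_condition` and the sample byte counts are hypothetical and not part of this repository.

```shell
#!/bin/bash
# Sketch only: mirrors the "used / capacity > 0.90" comparison from the rule.
# pv_alert_condition is a hypothetical helper, not part of the alerting scripts.
pv_alert_condition() {
  # $1: used bytes, $2: capacity bytes
  awk -v used="$1" -v cap="$2" -v threshold="0.90" 'BEGIN {
    if (cap <= 0) { print "unknown"; exit }   # avoid division by zero
    print ((used / cap > threshold) ? "above" : "below")
  }'
}

pv_alert_condition 95 100   # prints "above"
pv_alert_condition 50 100   # prints "below"
```

An "above" result corresponds only to the rule's condition being true at one instant; the real alert additionally requires that state to persist for the `for` duration.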
2 changes: 1 addition & 1 deletion alerting/delete.sh
@@ -24,4 +24,4 @@ set -o errexit ## set -e : exit the script if any statement returns a non-true
set -o nounset ## set -u : exit the script if you try to use an uninitialised variable
set -o errtrace # trace ERR through 'time command' and other functions
set -o pipefail # trace ERR through pipes
-installing_alerting # calling function
+installing_alerting # calling function
4 changes: 2 additions & 2 deletions alerting/install.sh
@@ -1,5 +1,5 @@
#!/bin/bash
-# Patch notification alerts
+# Patch notification alerts

NS=cattle-monitoring-system

@@ -21,4 +21,4 @@ set -o errexit ## set -e : exit the script if any statement returns a non-true
set -o nounset ## set -u : exit the script if you try to use an uninitialised variable
set -o errtrace # trace ERR through 'time command' and other functions
set -o pipefail # trace ERR through pipes
-installing_alerting # calling function
+installing_alerting # calling function
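
Reviewer note: both `install.sh` and `delete.sh` enable `set -o pipefail` before calling `installing_alerting`. The effect of that option can be seen in isolation with the rough sketch below; `pipeline_status` is a hypothetical helper written for illustration and does not appear in either script.

```shell
#!/bin/bash
# Sketch only: shows how `set -o pipefail` changes a pipeline's exit status.
# pipeline_status is a hypothetical helper, not part of install.sh or delete.sh.
pipeline_status() {
  # $1: "on" enables pipefail, anything else disables it
  (
    if [ "$1" = "on" ]; then
      set -o pipefail
    else
      set +o pipefail
    fi
    false | true   # first stage fails, last stage succeeds
    echo "$?"      # exit status of the whole pipeline
  )
}

without_pipefail=$(pipeline_status off)
with_pipefail=$(pipeline_status on)
echo "without pipefail: $without_pipefail"   # prints "without pipefail: 0"
echo "with pipefail: $with_pipefail"         # prints "with pipefail: 1"
```

Combined with `errexit`, a non-zero pipeline status like the second case would stop the script at the failing pipeline instead of continuing silently.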
