Configures Alertmanager for heartbeat in standalone mode only #63

Merged
merged 36 commits into from
Mar 6, 2024
b077c2c
[WIP] Adds heartbeat option to grafana cloud operator
John2020-cyber Feb 9, 2024
8c7012f
update
John2020-cyber Feb 9, 2024
46f378d
update
John2020-cyber Feb 9, 2024
f1fab6a
update
John2020-cyber Feb 9, 2024
9d15d84
update
John2020-cyber Feb 9, 2024
3832c24
update
John2020-cyber Feb 9, 2024
f1354f6
Adds prometheus rule for standalone mode
John2020-cyber Feb 12, 2024
3ed3cf0
Update
John2020-cyber Feb 12, 2024
28611dd
update
John2020-cyber Feb 12, 2024
4c033bf
update
John2020-cyber Feb 12, 2024
633fc95
update
John2020-cyber Feb 12, 2024
2bdfa78
update
John2020-cyber Feb 13, 2024
f145600
Fix
John2020-cyber Feb 13, 2024
8a56c00
update
John2020-cyber Feb 13, 2024
51470e5
Update syncset and standalone PromRules
John2020-cyber Feb 13, 2024
3737a33
Update
John2020-cyber Feb 13, 2024
41ef8eb
Update
John2020-cyber Feb 14, 2024
9b96339
Update ns for PromRule
John2020-cyber Feb 14, 2024
2692844
update SS
John2020-cyber Feb 14, 2024
82d6272
Bug fix for Syncset
John2020-cyber Feb 19, 2024
03a01c7
Update name and namespace for standalone
John2020-cyber Feb 19, 2024
3653421
Fix resource name
John2020-cyber Feb 19, 2024
e65abad
Fix register failure for syncset
John2020-cyber Feb 20, 2024
4953ca4
Fix for mapped integration
John2020-cyber Feb 20, 2024
cb72f67
Have one sycnset. Updates name for syncset
John2020-cyber Feb 20, 2024
e6ea85a
Use templates where we can
John2020-cyber Feb 20, 2024
b42e9c6
Update readme to support addition of new Heartbeat feature
John2020-cyber Mar 5, 2024
b121b75
Fix
John2020-cyber Mar 5, 2024
3146aaf
Update
John2020-cyber Mar 5, 2024
2c8aff3
Update for standalone mode only
John2020-cyber Mar 5, 2024
9638b59
update
John2020-cyber Mar 5, 2024
d4350f0
Update for spell fix
John2020-cyber Mar 5, 2024
818fd06
Update readme
John2020-cyber Mar 5, 2024
8d8e4d2
Spell fix
John2020-cyber Mar 5, 2024
8413ee9
Updates and fixes
John2020-cyber Mar 6, 2024
f716408
Update readme
John2020-cyber Mar 6, 2024
7 changes: 4 additions & 3 deletions README.md
@@ -109,6 +109,7 @@ The operator's workflow can be described in two different architectural models:
 ModSecret[Include: modify_alertmanager_secret]
 Reencode[Re-encode Alertmanager Content]
 PatchSecret[Patch alertmanager-main Secret]
+AddPrometheusRule[Add PrometheusRule]
 UpdateCR[Update CR Status to ConfigUpdated]
 Init --> GetClusterName
 GetClusterName --> CheckIntegration
@@ -129,8 +130,8 @@ The operator's workflow can be described in two different architectural models:
 GO -->|Return: Endpoint| ConfigureSlack
 ConfigureSlack --> ModSecret
 ModSecret --> PatchSecret
-PatchSecret --> UpdateCR
-CheckIntegration -- Integration exists --> UpdateCR
+PatchSecret --> AddPrometheusRule
+AddPrometheusRule --> UpdateCR
 ```

 *Operator Workflow in Standalone Cluster:*
@@ -142,7 +143,7 @@ The operator's workflow can be described in two different architectural models:

 *In-Cluster Configuration Management:*
 The operator directly applies configuration changes within the cluster, bypassing the need for `Syncsets`.
-It ensures the Alertmanager's alert forwarding settings are correctly configured for seamless communication with Grafana On Call.
+It ensures the Alertmanager's alert forwarding settings are correctly configured for seamless communication with Grafana On Call. Additionally, it adds an option for the Grafana OnCall heartbeat, which monitors the monitoring system itself. It also creates a `PrometheusRule` that uses `vector(1)` as the heartbeat generator.

 *Local Secret Management:*
 Managing the `alertmanager-main-generated` secret locally, the operator updates its configurations.
@@ -157,6 +157,12 @@ spec:
   - patch
   - update
   - watch
+- apiGroups:
+  - monitoring.coreos.com
+  resources:
+  - prometheusrules
+  verbs:
+  - '*'
 - apiGroups:
   - ""
   resources:
3 changes: 3 additions & 0 deletions charts/grafana-oncall/templates/role.yaml
@@ -76,6 +76,9 @@ rules:
 - apiGroups: ["slack.stakater.com"]
   resources: ["channels"]
   verbs: ["get", "list", "watch"]
+- apiGroups: ["monitoring.coreos.com"]
+  resources: ["prometheusrules"]
+  verbs: ["*"]
 - apiGroups: ["hive.openshift.io"]
   resources: ["syncsets"]
   verbs: ["*"]
3 changes: 3 additions & 0 deletions config/rbac/role.yaml
@@ -78,6 +78,9 @@ rules:
 - apiGroups: ["slack.stakater.com"]
   resources: ["channels"]
   verbs: ["create", "get", "list", "patch", "update", "watch"]
+- apiGroups: ["monitoring.coreos.com"]
+  resources: ["prometheusrules"]
+  verbs: ["*"]
 - apiGroups: [""]
   resources: ["secrets"]
   verbs: ["*"]
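
Both role files grant every verb on `prometheusrules`. If the operator only ever creates and updates the heartbeat rule, a narrower grant may be sufficient. The fragment below is a sketch under that assumption: the verb list is borrowed from the `channels` rule in `config/rbac/role.yaml` and has not been tested against this operator.

```yaml
# Hypothetical least-privilege variant of the rule added in this PR.
# Assumes the operator never needs to delete PrometheusRule objects.
- apiGroups: ["monitoring.coreos.com"]
  resources: ["prometheusrules"]
  verbs: ["create", "get", "list", "patch", "update", "watch"]
```

If a later change removes the rule when the heartbeat option is disabled, `delete` would have to be added back.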
22 changes: 22 additions & 0 deletions roles/grafana_cloud_operator/tasks/grafana_oncall_standalone.yml
@@ -103,6 +103,28 @@
         alertmanager.yaml: "{{ encoded_alertmanager_secret_content }}"
   when: not integration_exists_for_cluster

+- name: Add prometheus rule for cluster
+  kubernetes.core.k8s:
+    state: present
+    namespace: "openshift-monitoring"
+    definition:
+      apiVersion: monitoring.coreos.com/v1
+      kind: PrometheusRule
+      metadata:
+        name: "heartbeat-grafana-oncall"
+        namespace: "openshift-monitoring"
+      spec:
+        groups:
+          - name: meta
+            rules:
+              - alert: heartbeat
+                annotations:
+                  description: This is a heartbeat alert for Grafana OnCall
+                  summary: Heartbeat for Grafana OnCall
+                expr: vector(1)
+                labels:
+                  severity: none

 - name: Update CR status to ConfigUpdated
   kubernetes.core.k8s:
     state: present
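
The `Add prometheus rule for cluster` task creates a rule whose `vector(1)` expression always returns a sample, so the `heartbeat` alert fires continuously. A follow-up check along the lines below could confirm that the object actually exists before the CR status is updated. This is a minimal sketch and not part of the PR; the registered variable name `heartbeat_rule` and the failure message are illustrative.

```yaml
# Illustrative verification tasks (not part of this PR).
- name: Read back the heartbeat PrometheusRule
  kubernetes.core.k8s_info:
    api_version: monitoring.coreos.com/v1
    kind: PrometheusRule
    name: heartbeat-grafana-oncall
    namespace: openshift-monitoring
  register: heartbeat_rule  # hypothetical variable name

- name: Fail early if the heartbeat rule is missing
  ansible.builtin.assert:
    that:
      - heartbeat_rule.resources | length == 1
    fail_msg: "PrometheusRule heartbeat-grafana-oncall not found in openshift-monitoring"
```

`k8s_info` returns matching objects in the `resources` list, so a length of one means the rule was created in the expected namespace.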
28 changes: 24 additions & 4 deletions roles/grafana_cloud_operator/tasks/modify_alertmanager_secret.yml
@@ -4,10 +4,21 @@
 {{ (fetched_alertmanager_secret.resources[0].data['alertmanager.yaml'] | b64decode | from_yaml) | combine({
   'receivers': [
     {
-      'name': receiver_name,
-      'webhook_configs': [{
-        'url': receiver_url
-      }]
+      "name": receiver_name,
+      "webhook_configs": [
+        {
+          "url": receiver_url
+        }
+      ]
+    },
+    {
+      "name": "grafana-oncall-heartbeat",
+      "webhook_configs": [
+        {
+          "url": receiver_url + "heartbeat/",
+          "send_resolved": false
+        }
+      ]
     }
   ],
   'route': {
@@ -18,6 +29,15 @@
         'match': {
           'severity': 'info | warning | critical'
         }
+      },
+      {
+        "match": {
+          "alertname": "heartbeat"
+        },
+        "receiver": "grafana-oncall-heartbeat",
+        "group_wait": "0s",
+        "group_interval": "1m",
+        "repeat_interval": "50s"
       }
     ]
   }
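
For orientation, the fragment below sketches roughly what the decoded `alertmanager.yaml` might look like after this template runs. Because `combine` is not recursive by default, the `receivers` and `route` keys of the fetched secret are replaced wholesale rather than merged. `<receiver_name>` and `<receiver_url>` are placeholders for values the diff does not show, the parts of `route` hidden between the two hunks are left elided, and the name of the nested list (`routes`) is an assumption based on the Alertmanager route schema.

```yaml
# Sketch of the rendered config; placeholders stand in for values not shown in the diff.
receivers:
  - name: "<receiver_name>"
    webhook_configs:
      - url: "<receiver_url>"
  - name: grafana-oncall-heartbeat
    webhook_configs:
      - url: "<receiver_url>heartbeat/"   # plain string concatenation
        send_resolved: false
route:
  # ... route settings hidden by the diff ...
  routes:   # assumed key name
    - match:   # other keys of this entry are hidden by the diff
        severity: info | warning | critical
    - match:
        alertname: heartbeat
      receiver: grafana-oncall-heartbeat
      group_wait: 0s
      group_interval: 1m
      repeat_interval: 50s
```

Since the heartbeat URL is built by appending `heartbeat/` directly, it only forms a valid path when `receiver_url` already ends with a slash.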