[RFC-0008] Custom Event Metadata from Annotations
Signed-off-by: Matheus Pimenta <[email protected]>
matheuscscp committed Dec 25, 2024
1 parent 8b1d9a1 commit 311fa14
Showing 5 changed files with 277 additions and 74 deletions.
92 changes: 80 additions & 12 deletions docs/spec/v1beta3/alerts.md
@@ -28,9 +28,13 @@ metadata:
name: slack
namespace: flux-system
spec:
summary: "Cluster addons impacted in us-east-2"
summary: Cluster addons impacted
providerRef:
name: slack-bot
eventMetadata:
env: prod
cluster: my-cluster
region: us-east-2
eventSeverity: error
eventSources:
- kind: GitRepository
@@ -51,7 +55,7 @@ In the above example:
all GitRepositories and Kustomizations in the `flux-system` namespace.
- When an event with severity `error` is received, the controller posts
a message on Slack channel from `.spec.channel`,
containing the `summary` text and the reconciliation error.
containing the `summary` text, metadata and the reconciliation error.

You can run this example by saving the manifests into `slack-alerts.yaml`.

@@ -78,10 +82,14 @@ An Alert also needs a

### Summary

`.spec.summary` is an optional field to specify a short description of the
impact and affected cluster.
`.spec.summary` is an optional field to specify a short description of the impact.

The summary max length can't be greater than 255 characters.

The summary max length can't be greater than 255 characters.
**Warning:** Support for `.spec.summary` has been deprecated and will be removed in
Alert API v1 GA. If you have any Alerts using this field, the controller will log a
deprecation warning. Please use [object annotations](#event-metadata-from-object-annotations)
for defining alert summary instead.

### Provider reference

@@ -146,10 +154,11 @@ preventing tenants from subscribing to another tenant's events.
### Event metadata

`.spec.eventMetadata` is an optional field for adding metadata to events dispatched by
the controller. This can be used for enhancing the context of the event. If a field
would override one already present on the original event as generated by the emitter,
then the override doesn't happen, i.e. the original value is preserved, and an info
log is printed.
the controller. This can be used for enhancing the context of the event, e.g. with
cluster-level information.

For all the event metadata sources and their precedence order, please refer to
[Event metadata from object annotations](#event-metadata-from-object-annotations).

#### Example

@@ -168,9 +177,68 @@ spec:
inclusionList:
- ".*succeeded.*"
eventMetadata:
app.kubernetes.io/env: "production"
app.kubernetes.io/cluster: "my-cluster"
app.kubernetes.io/region: "us-east-1"
env: production
cluster: my-cluster
region: us-east-1
```

### Event metadata from object annotations

Event metadata has four sources. They are listed below in order of precedence,
from lowest to highest:

1. User-defined metadata on Flux objects, set with the `event.toolkit.fluxcd.io/`
prefix in the keys of the object's `.metadata.annotations`.
2. User-defined metadata on the Alert object, set with [`.spec.eventMetadata`](#event-metadata).
3. User-defined summary on the Alert object, set with [`.spec.summary`](#summary) (deprecated, see docs).
4. Controller-defined metadata, set with the `<controller group>.toolkit.fluxcd.io/`
prefix in the metadata keys of the event payload.

If there are any metadata key conflicts between the sources, the higher
precedence source will override the lower precedence source, and a warning
log and Kubernetes event will be emitted.

#### Example

```yaml
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
name: <name>
spec:
eventSources:
- kind: HelmRelease
name: '*'
eventMetadata:
env: production
cluster: my-cluster
region: us-east-1
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: my-webapp
annotations:
event.toolkit.fluxcd.io/summary: "my-webapp impacted. Playbook: <URL to playbook>"
event.toolkit.fluxcd.io/deploymentID: e076e315-5a48-41c3-81c8-8d8bdee7d74d
spec:
... # fields omitted for brevity
```

In the above example, the event payload dispatched by the controller will look like this
(most fields omitted for highlighting the metadata):

```json
{
"metadata": {
"env": "production",
"cluster": "my-cluster",
"region": "us-east-1",
"summary": "my-webapp impacted. Playbook: <URL to playbook>",
"deploymentID": "e076e315-5a48-41c3-81c8-8d8bdee7d74d"
}
}
```
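
The precedence rules and the example above can be illustrated with a small, self-contained Go sketch. It is not the controller's implementation: `mergeMetadata` is a hypothetical helper that works on plain maps, and the sample keys and values in `main` are made up for illustration. It applies the four sources from lowest to highest precedence and reports which keys were set by more than one source.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// mergeMetadata is a hypothetical helper (not part of the controller).
// It applies the four metadata sources from lowest to highest precedence
// and returns the merged metadata plus the sorted list of keys that were
// set by more than one source.
func mergeMetadata(eventMeta, alertMeta map[string]string, summary, objectGroup string) (map[string]string, []string) {
	merged := make(map[string]string)
	seen := make(map[string]int)

	set := func(k, v string) {
		merged[k] = v // later (higher precedence) sources overwrite earlier ones
		seen[k]++
	}

	// 1) Object annotations forwarded with the event.toolkit.fluxcd.io/ prefix.
	const eventPrefix = "event.toolkit.fluxcd.io/"
	for k, v := range eventMeta {
		if strings.HasPrefix(k, eventPrefix) {
			set(strings.TrimPrefix(k, eventPrefix), v)
		}
	}

	// 2) Alert .spec.eventMetadata, keys used as they are.
	for k, v := range alertMeta {
		set(k, v)
	}

	// 3) Alert .spec.summary (deprecated), stored under the "summary" key.
	if summary != "" {
		set("summary", summary)
	}

	// 4) Controller-defined metadata prefixed with the object's API group.
	objectPrefix := objectGroup + "/"
	for k, v := range eventMeta {
		if strings.HasPrefix(k, objectPrefix) {
			set(strings.TrimPrefix(k, objectPrefix), v)
		}
	}

	var conflicts []string
	for k, n := range seen {
		if n > 1 {
			conflicts = append(conflicts, k)
		}
	}
	sort.Strings(conflicts)
	return merged, conflicts
}

func main() {
	// Sample data, made up for this sketch.
	eventMeta := map[string]string{
		"event.toolkit.fluxcd.io/summary": "my-webapp impacted",
		"event.toolkit.fluxcd.io/env":     "staging",
		"helm.toolkit.fluxcd.io/revision": "1.2.3",
	}
	alertMeta := map[string]string{"env": "production", "cluster": "my-cluster"}

	merged, conflicts := mergeMetadata(eventMeta, alertMeta, "", "helm.toolkit.fluxcd.io")
	fmt.Println(merged["env"], conflicts)
}
```

In this sketch the `env` annotation on the object loses to the Alert's `.spec.eventMetadata.env`, and `env` is reported as a conflicting key, which corresponds to the case where the controller would emit a warning.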

### Event severity
101 changes: 82 additions & 19 deletions internal/server/event_handlers.go
@@ -24,6 +24,7 @@ import (
"net/http"
"net/url"
"regexp"
"slices"
"strings"
"time"

@@ -256,7 +257,7 @@ func (s *EventServer) getNotificationParams(ctx context.Context, event *eventv1.
}

notification := *event.DeepCopy()
s.enhanceEventWithAlertMetadata(ctx, &notification, alert)
s.combineEventMetadata(ctx, &notification, alert)

return sender, &notification, token, provider.GetTimeout(), nil
}
@@ -418,30 +419,90 @@ func (s *EventServer) eventMatchesAlertSource(ctx context.Context, event *eventv
return sel.Matches(labels.Set(obj.GetLabels()))
}

// enhanceEventWithAlertMetadata enhances the event with Alert metadata.
func (s *EventServer) enhanceEventWithAlertMetadata(ctx context.Context, event *eventv1.Event, alert *apiv1beta3.Alert) {
meta := event.Metadata
if meta == nil {
meta = make(map[string]string)
// combineEventMetadata combines all the sources of metadata for the event
// according to the precedence order defined in RFC 0008. From lowest to
// highest precedence, the sources are:
//
// 1) Event metadata keys prefixed with the Event API Group stripped of the prefix.
//
// 2) Alert .spec.eventMetadata with the keys as they are.
//
// 3) Alert .spec.summary with the key "summary".
//
// 4) Event metadata keys prefixed with the involved object's API Group stripped of the prefix.
//
// At the end of the process key conflicts are detected and a single
// info-level log is emitted to warn users about all the conflicts,
// but only if at least one conflict is found.
func (s *EventServer) combineEventMetadata(ctx context.Context, event *eventv1.Event, alert *apiv1beta3.Alert) {
const (
sourceEventGroup = "involved object annotations"
sourceAlertEventMetadata = "Alert object .spec.eventMetadata"
sourceAlertSummary = "Alert object .spec.summary"
sourceObjectGroup = "involved object controller metadata"

summaryKey = "summary"
)

l := log.FromContext(ctx)
metadata := make(map[string]string)
metadataSources := make(map[string][]string)

// 1) Event metadata keys prefixed with the Event API Group stripped of the prefix.
eventGroupPrefix := "event.toolkit.fluxcd.io/" // TODO: use constant from github.com/fluxcd/pkg/apis/event when available
for k, v := range event.Metadata {
if strings.HasPrefix(k, eventGroupPrefix) {
key := strings.TrimPrefix(k, eventGroupPrefix)
metadata[key] = v
metadataSources[key] = append(metadataSources[key], sourceEventGroup)
}
}

for key, value := range alert.Spec.EventMetadata {
if _, alreadyPresent := meta[key]; !alreadyPresent {
meta[key] = value
} else {
log.FromContext(ctx).
Info("metadata key found in the existing set of metadata", "key", key)
s.Eventf(alert, corev1.EventTypeWarning, "MetadataAppendFailed",
"metadata key found in the existing set of metadata for '%s' in %s", key, involvedObjectString(event.InvolvedObject))
}
// 2) Alert .spec.eventMetadata with the keys as they are.
for k, v := range alert.Spec.EventMetadata {
metadata[k] = v
metadataSources[k] = append(metadataSources[k], sourceAlertEventMetadata)
}

// 3) Alert .spec.summary with the key "summary".
if alert.Spec.Summary != "" {
meta["summary"] = alert.Spec.Summary
metadata[summaryKey] = alert.Spec.Summary
metadataSources[summaryKey] = append(metadataSources[summaryKey], sourceAlertSummary)
l.Info("warning: specifying alert summary via '.spec.summary' is deprecated, please use '.spec.eventMetadata.summary' instead")
}

// 4) Event metadata keys prefixed with the involved object's API Group stripped of the prefix.
objectGroupPrefix := event.InvolvedObject.GroupVersionKind().Group + "/"
for k, v := range event.Metadata {
if strings.HasPrefix(k, objectGroupPrefix) {
key := strings.TrimPrefix(k, objectGroupPrefix)
metadata[key] = v
metadataSources[key] = append(metadataSources[key], sourceObjectGroup)
}
}

// Detect key conflicts and emit warnings if any.
type keyConflict struct {
Key string `json:"key"`
Sources []string `json:"sources"`
}
var conflictingKeys []*keyConflict
conflictEventAnnotations := make(map[string]string)
for key, sources := range metadataSources {
if len(sources) > 1 {
conflictingKeys = append(conflictingKeys, &keyConflict{key, sources})
conflictEventAnnotations[key] = strings.Join(sources, ", ")
}
}
if len(conflictingKeys) > 0 {
const msg = "metadata key conflicts detected (please refer to the Alert API docs and Flux RFC 0008 for more information)"
slices.SortFunc(conflictingKeys, func(a, b *keyConflict) int { return strings.Compare(a.Key, b.Key) })
l.Info("warning: "+msg, "conflictingKeys", conflictingKeys)
s.AnnotatedEventf(alert, conflictEventAnnotations, corev1.EventTypeWarning, "MetadataAppendFailed", "%s", msg)
}

if len(meta) > 0 {
event.Metadata = meta
if len(metadata) > 0 {
event.Metadata = metadata
}
}

@@ -450,7 +511,9 @@ func excludeInternalMetadata(event *eventv1.Event) {
if len(event.Metadata) == 0 {
return
}
excludeList := []string{eventv1.MetaTokenKey}
objectGroup := event.InvolvedObject.GetObjectKind().GroupVersionKind().Group
tokenKey := fmt.Sprintf("%s/%s", objectGroup, eventv1.MetaTokenKey)
excludeList := []string{tokenKey}
for _, key := range excludeList {
delete(event.Metadata, key)
}
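
The effect of the change in this hunk can be sketched outside the controller. The snippet below is only an illustration under stated assumptions: `excludeToken` is a hypothetical stand-in for `excludeInternalMetadata`, it operates on a plain map, and `metaTokenKey` is assumed to have the value `"token"` in place of `eventv1.MetaTokenKey`. It shows that only the key prefixed with the involved object's API group is removed.

```go
package main

import "fmt"

// metaTokenKey stands in for eventv1.MetaTokenKey.
// The value "token" is an assumption made for this sketch.
const metaTokenKey = "token"

// excludeToken is a hypothetical helper that removes the internal token
// entry, keyed as "<object API group>/<token key>", from the metadata.
func excludeToken(metadata map[string]string, objectGroup string) {
	delete(metadata, fmt.Sprintf("%s/%s", objectGroup, metaTokenKey))
}

func main() {
	// Sample data, made up for this sketch.
	metadata := map[string]string{
		"kustomize.toolkit.fluxcd.io/token":    "abc123",
		"kustomize.toolkit.fluxcd.io/revision": "main@sha1:deadbeef",
		"token":                                "user-defined value",
	}

	excludeToken(metadata, "kustomize.toolkit.fluxcd.io")

	// The group-prefixed token is gone; the unprefixed "token" key is kept.
	fmt.Println(len(metadata), metadata["token"])
}
```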