Alerts
Argus evaluates alerts on metric data and notifies users when trigger thresholds are exceeded. Alerts are scheduled and executed per the CRON entry specified when the alert was created. An alert is associated with at least one trigger and one notification. You can associate a trigger with more than one notification, and a notification with more than one trigger.
You can access the Alert interface via the link at the top of the Argus interface. See Alerting Examples for instructions on creating alerts.
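The relationships described above (an alert with at least one trigger and one notification, and a many-to-many link between triggers and notifications) can be pictured as a small data model. This is a minimal sketch, not Argus's actual schema: the class names, field names, and the `link` helper are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # identity comparison avoids recursing through the cyclic links
class Notification:
    name: str
    triggers: List["Trigger"] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Trigger:
    name: str
    notifications: List[Notification] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Alert:
    name: str
    cron_entry: str
    expression: str
    triggers: List[Trigger] = field(default_factory=list)
    notifications: List[Notification] = field(default_factory=list)

    def is_complete(self) -> bool:
        """An alert needs at least one trigger and one notification."""
        return bool(self.triggers) and bool(self.notifications)


def link(trigger: Trigger, notification: Notification) -> None:
    """Associate a trigger and a notification in both directions (hypothetical helper)."""
    if notification not in trigger.notifications:
        trigger.notifications.append(notification)
    if trigger not in notification.triggers:
        notification.triggers.append(trigger)


# One trigger fanning out to two notifications, and one notification shared by two triggers.
high, low = Trigger("high"), Trigger("low")
email, audit = Notification("email"), Notification("audit")
link(high, email)
link(high, audit)
link(low, email)
alert = Alert("kpi-alert", "* * * * *", "-1h:argus.core:alert.evaluation.kpi:avg",
              triggers=[high, low], notifications=[email, audit])
```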
An alert is a configuration, made up of metadata, triggering conditions, and notification types, that informs you when something interesting happens. The following components define the alert's identity and behavior.
Component | Description |
---|---|
CRON entry | Time and frequency of the alert (the alert's schedule), based on the Quartz CronTrigger format |
Enabled | Enabled for evaluation |
Expression | Metric expression used to retrieve the data with which to evaluate trigger conditions |
Missing data notification | Notify owner if metric expression doesn't contain data |
Name | Alert name. Variable interpolation is available. |
Owner | Owner of the alert |
Shared | Boolean that indicates whether the alert is visible to other users |
A trigger defines a threshold value as a condition. When the trigger condition is met, a notification is sent. An alert can have multiple triggers, and you can associate a trigger with more than one notification.
Component | Description |
---|---|
ID | ID of the trigger |
Inertia | The timespan for which the condition must be met before the trigger is fired. At least 2 qualifying data points must occur within the inertia window to fire the trigger. |
Name | Trigger name. Variable interpolation is available. |
Primary threshold | Operand for the comparison conditions |
Secondary threshold | Operand for the BETWEEN and NOT_BETWEEN operators. BETWEEN is inclusive of both operand values, while NOT_BETWEEN is exclusive of those values. |
Type | Trigger comparison operator. Choose "no data" to trigger on missing data and be able to use configured notifications (normally missing-data alerts go only to the alert owner). The threshold fields are not used with the "no data" type. |
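
The threshold and inertia rules in the table can be sketched as follows. This is a simplified reading of the descriptions above, not Argus's evaluation code. `GREATER_THAN`, `BETWEEN`, and `NOT_BETWEEN` appear in this document; `LESS_THAN` and both function names are assumptions. The sketch also assumes that NOT_BETWEEN fires only for values strictly outside the two thresholds, and that inertia means the condition held on consecutive data points spanning at least the inertia timespan.

```python
from typing import Callable, Iterable, Optional, Tuple


def condition_met(op: str, value: float, primary: float,
                  secondary: Optional[float] = None) -> bool:
    """Return True when `value` satisfies the comparison `op` (operator names partly assumed)."""
    if op == "GREATER_THAN":
        return value > primary
    if op == "LESS_THAN":
        return value < primary
    if op in ("BETWEEN", "NOT_BETWEEN"):
        if secondary is None:
            raise ValueError(f"{op} needs a secondary threshold")
        low, high = sorted((primary, secondary))
        inside = low <= value <= high  # BETWEEN includes both threshold values
        return inside if op == "BETWEEN" else not inside
    raise ValueError(f"unsupported operator: {op}")


def first_fire_time(points: Iterable[Tuple[int, float]],
                    predicate: Callable[[float], bool],
                    inertia_ms: int) -> Optional[int]:
    """Return the first timestamp at which `predicate` has held on consecutive
    points spanning at least `inertia_ms`, or None. `points` must be time-ordered."""
    run_start: Optional[int] = None
    for timestamp, value in points:
        if not predicate(value):
            run_start = None          # the run is broken; start over
        elif run_start is None:
            run_start = timestamp     # first qualifying point of a new run
        elif timestamp - run_start >= inertia_ms:
            return timestamp          # second or later point, window satisfied
    return None


# Data points one minute apart; the trigger is "> 60" with a two-minute inertia.
samples = [(0, 70), (60_000, 80), (120_000, 50),
           (180_000, 90), (240_000, 95), (300_000, 99)]
fired_at = first_fire_time(samples, lambda v: condition_met("GREATER_THAN", v, 60), 120_000)
```

In this run the dip at 120,000 ms resets the window, so the sketch reports the trigger firing at 300,000 ms rather than at 120,000 ms.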
A notification defines how you are notified when a trigger fires. You can associate a notification with more than one trigger.
Component | Description |
---|---|
Snooze (Cooldown period) | Timespan during which no further notifications are sent. The cooldown starts immediately after a notification is sent, and notifications resume once it expires. |
Triggers | The triggers to associate with the notification. |
Metrics to Annotate | Metric identifiers provided as a comma-separated list of metric expressions. When a notification is sent for an alert, an annotation is created on each specified metric. You can retrieve annotations via a RESTful endpoint or view them on a dashboard. Specify each metric expression as <scope>:<metricName>:<aggregator>. The aggregator field defaults to "avg". |
Name | Notification name. Variable interpolation is available. |
Type | Notification type (see below) |
Custom Text | Free-text field. Variable interpolation is available. |
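
Two of the fields above have behavior that is easier to see in code: the cooldown period and the "Metrics to Annotate" format. The following is a minimal sketch under stated assumptions, not Argus's implementation. The names `Cooldown` and `parse_metrics_to_annotate` are hypothetical, the cooldown is taken to be in milliseconds, and metric identifiers are assumed to contain no tag filters.

```python
from typing import List, Optional, Tuple


def parse_metrics_to_annotate(field_value: str) -> List[Tuple[str, ...]]:
    """Split 'scope:metric[:aggregator], ...' into (scope, metric, aggregator) tuples."""
    targets = []
    for item in field_value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(":")
        if len(parts) == 2:
            parts.append("avg")  # "avg" is the documented default aggregator
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected <scope>:<metricName>:<aggregator>, got {item!r}")
        targets.append(tuple(parts))
    return targets


class Cooldown:
    """Suppress notifications until the cooldown that follows the last send expires."""

    def __init__(self, cooldown_ms: int) -> None:
        self.cooldown_ms = cooldown_ms
        self._expires_at: Optional[int] = None

    def try_send(self, now_ms: int) -> bool:
        if self._expires_at is not None and now_ms < self._expires_at:
            return False                              # still cooling down
        self._expires_at = now_ms + self.cooldown_ms  # cooldown starts at send time
        return True


targets = parse_metrics_to_annotate("argus.core:alert.evaluation.kpi, system:cpu.load:max")
gate = Cooldown(cooldown_ms=60_000)
decisions = [gate.try_send(t) for t in (0, 30_000, 60_000, 60_001)]
```

With a 60-second cooldown, the sends at 0 ms and 60,000 ms go through, while those at 30,000 ms and 60,001 ms are suppressed.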
Argus supports the following notifiers:
- Audit Notifier: This notifier writes the notification to the Argus database. The Subscriptions field is left blank.
- Email Notifier: This notifier sends an email to the addresses listed in the Subscriptions field. The Subscriptions field contains a comma-separated list of email addresses.
- Salesforce Chatter Notifier: This notifier sends an alert to one or more Chatter groups, as specified in the Subscriptions field. The Subscriptions field contains a comma-separated list of Chatter group IDs.
- Callback Notifier: This notifier sends the templated notification to an HTTP endpoint.
When an alert is triggered, a notification is delivered (according to the selected notifier) and also logged in the alert History tab. The notification shows the metric expression associated with the alert, along with a link to the triggering event. Even if the original metric expression used relative time, the logged metric expression contains the actual timespan for the triggering event.
Argus uses FreeMarker v2.3.28 internally as the templating engine. Interpolated variables include the following kinds:
- `${scope}`: The expression's scope
- `${metric}`: The expression's metric
- `${tag.<tag_name>}`: Any tag used in the expression. If `<tag_name>` contains a dash, escape the dash with a preceding backslash so that it is not interpreted as a minus sign. Example: `${tag.foo\-yoyo}` (for the "foo-yoyo" tag)
- `${device}`: Any device tags used in the expression (deprecated; use `${tag.device}` instead)
- `${alert.name}`: Name of the alert
- `${alert.expression}`: The expression associated with the alert
- `${alert.cronEntry}`: The cron entry associated with the alert
- `${alert.enabled}`: Boolean value that indicates whether the alert is enabled
- `${trigger.name}`: Name of the trigger
- `${trigger.type}`: Type of the trigger (for example: >, <, nodata)
- `${trigger.threshold}`: Primary threshold of the trigger
- `${trigger.secondaryThreshold}`: Secondary threshold of the trigger (if it exists)
- `${trigger.inertia}`: Inertia of the trigger
- `${triggerValue}`: Value at which the trigger fired
- `${triggerTimestamp}`: Timestamp at which the trigger fired. Default format: `MMM d, yyyy h:mm:ss a`
- `${notification.name}`: Name of the notification
- `${notification.cooldownPeriod}`: Cooldown period of the notification, in milliseconds
- `${notification.SRActionable}`: Boolean that states whether the SR should be actionable
- `${notification.severityLevel}`: Severity level of the trigger
For example, if your expression is
-1h:argus.core:alert.evaluation.kpi.*{host=bar, tagA=*}:min
the trigger name can be written as
trigger-${scope}-${metric}-${tag.host}-${tag.tagA}
If the trigger fires on a metric called “alert.evaluation.kpi.foo”, and tagA has the value “baz”, notifications will show the trigger name as
trigger-argus.core-alert.evaluation.kpi.foo-bar-baz
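The substitution in this example can be imitated with a few lines of Python. This is only a stand-in to show how dotted names such as `tag.host` resolve against the values of the triggering metric. It is not FreeMarker, it handles plain `${...}` placeholders only, and the `interpolate` function and the context layout are hypothetical.

```python
import re
from typing import Any, Mapping

_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def interpolate(template: str, context: Mapping[str, Any]) -> str:
    """Replace each ${a.b} placeholder by walking `context` along the dotted path."""
    def resolve(match: "re.Match[str]") -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError for an unknown variable
        return str(value)
    return _PLACEHOLDER.sub(resolve, template)


# Values taken from the metric that fired in the example above.
context = {
    "scope": "argus.core",
    "metric": "alert.evaluation.kpi.foo",
    "tag": {"host": "bar", "tagA": "baz"},
}
name = interpolate("trigger-${scope}-${metric}-${tag.host}-${tag.tagA}", context)
print(name)  # trigger-argus.core-alert.evaluation.kpi.foo-bar-baz
```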
You can create a template like the following and use it in your Notification tab's Custom Text fields to produce more uniform and informative alert messages:
Alert Name = ${alert.name?upper_case},
Alert Expression = ${alert.expression},
Alert cronEntry = ${alert.cronEntry},
Alert enabled = ${alert.enabled?then('alert enabled', 'alert not enabled')},
Alert Expression = ${alert.expression},
Trigger Name = ${trigger.name},
Trigger type = ${trigger.type},
Trigger threshold = ${trigger.threshold},
Trigger secondaryThreshold = ${trigger.secondaryThreshold},
Trigger Inertia = ${trigger.inertia},
Trigger Value = ${triggerValue},
Trigger Timestamp = ${triggerTimestamp?datetime?iso('GMT')},
Notification Name = ${notification.name?cap_first},
Notification cooldownPeriod = ${notification.cooldownPeriod},
Notification SRActionable = ${notification.SRActionable?then('SR Actionable','Not SR Actionable')},
Notification severityLevel = ${notification.severityLevel}
The template will produce alert messages with information similar to the following:
Alert Name = ALERT-123-argus.core,
Alert Expression = -1h:argus.core:alert.evaluation.kpi{host=*}:avg,
Alert cronEntry = * * * * *,
Alert enabled = alert enabled,
Alert Expression = -1h:argus.core:alert.evaluation.kpi{host=*}:avg,
Trigger Name = trigger-1234-argus.core,
Trigger type = GREATER_THAN,
Trigger threshold = 60,
Trigger secondaryThreshold = 0,
Trigger Inertia = 10,000,
Trigger Value = 240,000,
Trigger Timestamp = 2018-10-05T21:15:00Z,
Notification Name = New-notification-1531947321887,
Notification cooldownPeriod = 10,
Notification SRActionable = Not SR Actionable,
Notification severityLevel = 2
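For readers less familiar with FreeMarker, the built-ins used in the template (`?upper_case`, `?cap_first`, `?then`, `?datetime?iso('GMT')`) and the grouped number output can be approximated in Python. The functions below are hypothetical stand-ins, not FreeMarker itself. The sketch assumes that the timestamp arrives in the default `MMM d, yyyy h:mm:ss a` format, that it is already in GMT, and that the thousands separators in values such as `10,000` come from locale-based number formatting.

```python
from datetime import datetime, timezone


def upper_case(text: str) -> str:
    """Approximates ?upper_case."""
    return text.upper()


def cap_first(text: str) -> str:
    """Approximates ?cap_first: only the first character changes."""
    return text[:1].upper() + text[1:]


def then(flag: bool, when_true: str, when_false: str) -> str:
    """Approximates ?then(a, b) on a boolean."""
    return when_true if flag else when_false


def grouped(number: int) -> str:
    """Approximates the grouped number rendering, e.g. 10000 -> '10,000'."""
    return f"{number:,}"


def iso_gmt(text: str) -> str:
    """Approximates ?datetime?iso('GMT') for input that is assumed to be in GMT."""
    parsed = datetime.strptime(text, "%b %d, %Y %I:%M:%S %p").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


print(cap_first("new-notification-1531947321887"))  # New-notification-1531947321887
print(then(True, "alert enabled", "alert not enabled"))  # alert enabled
print(grouped(240000))  # 240,000
print(iso_gmt("Oct 5, 2018 9:15:00 PM"))  # 2018-10-05T21:15:00Z
```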
Note the conditional expressions in the template ("Alert enabled" and "Notification SRActionable"). FreeMarker conditional statements are fully supported and can be used to generate richer alert messages. For example, this template
<#if trigger.threshold <= 4> Primary Threshold is less than 4 </#if>,
<#if (trigger.secondaryThreshold == 7.1)> Secondary Threshold is 7.1 </#if>,
<#if trigger.inertia == 5 && (trigger.threshold > 5)> Inertia is 5, Primary Threshold more than 5 <#elseif (trigger.threshold > 5)>Primary Threshold more than 5 <#elseif trigger.inertia == 5> Inertia is 5 </#if>,
<#if trigger.name?matches('trigger_name') && triggerValue < 2.0> Trigger name matches and trigger value is < 2 </#if>,
<#if triggerValue?round == 2> Trigger fired, rounded value is 2 </#if>,
<#assign dt = triggerTimestamp?datetime> Trigger fired date-time: ${dt?iso('GMT')},
Time before 2.5 hrs of firing: ${dt?iso('GMT-02:30')}
could result in the following
Primary Threshold is less than 4 ,
Secondary Threshold is 7.1 ,
Inertia is 5 ,
Trigger name matches and trigger value is < 2 ,
Trigger fired, rounded value is 2 ,
Trigger fired date-time: 2014-12-11T17:40:00Z,
Time before 2.5 hrs of firing: 2014-12-11T15:10:00-02:30
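Three details of this example are easy to misread: the `<#elseif>` chain emits only the first branch that matches, `?round` rounds halves upward, and `?iso('GMT-02:30')` renders the same instant in a zone 2.5 hours behind GMT rather than subtracting time from it. The Python below mirrors those pieces as a rough check. It is a sketch, not FreeMarker, and the function names and the sample inputs (a threshold of 3 and an inertia of 5) are assumptions chosen to reproduce the output above.

```python
import math
from datetime import datetime, timedelta, timezone


def inertia_branch(inertia: float, threshold: float) -> str:
    """Mirror of the <#if>/<#elseif> chain: only the first matching branch is emitted."""
    if inertia == 5 and threshold > 5:
        return "Inertia is 5, Primary Threshold more than 5"
    elif threshold > 5:
        return "Primary Threshold more than 5"
    elif inertia == 5:
        return "Inertia is 5"
    return ""


def round_half_up(value: float) -> int:
    """Round to the nearest integer with halves going up (Python's round() differs at .5)."""
    return math.floor(value + 0.5)


def iso_in_zone(moment: datetime, offset: timedelta) -> str:
    """Render the same instant with a fixed UTC offset."""
    return moment.astimezone(timezone(offset)).isoformat()


fired = datetime(2014, 12, 11, 17, 40, tzinfo=timezone.utc)
print(inertia_branch(inertia=5, threshold=3))  # Inertia is 5
print(iso_in_zone(fired, timedelta(hours=-2, minutes=-30)))  # 2014-12-11T15:10:00-02:30
```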
See the FreeMarker Template Language Reference for complete documentation on FreeMarker directives, built-ins, and more.