This monitoring section is monitoring tool/platform agnostic. The structure of textual definition for each alert is:
- A description of a relevant metric to monitor
- The rule(s) to trigger alert(s) on that metric
- Operator's action if/when the alert is triggered
For the purposes of the monitoring section contained in this repo, implementing the rules in the specific monitoring tool/platform is an operator's responsibility.