Identify and implement business critical metrics / KPIs, define an action plan and configure alerting rules #83

Open
7 tasks
tobiscr opened this issue Dec 27, 2023 · 0 comments

Comments

tobiscr commented Dec 27, 2023

Description

With #28 in place, we can make the Compass Manager more transparent and simplify our operations by establishing meaningful metrics and alerting rules.

The goal of this task is to identify which metrics / KPIs are business relevant and what their critical thresholds are. We also have to define an action plan for the case that such a threshold is reached, describing the action required to bring our business back on track. Finally, alerting rules have to be configured that inform us as soon as one of the thresholds is reached.

AC:

  • Identify technical and business-critical metrics / KPIs that give a clear indication of the quality and health of the Compass Manager. Get in touch with the SREs to identify missing alerts and critical metrics.
    • Define the reason why each metric is relevant and what it represents.
    • Define the thresholds (minimum, maximum, etc.) that indicate a service degradation or health issue of the Compass Manager. If a metric has no threshold, verify whether measuring it is still helpful for us.
    • Specify the action required when a threshold is reached to bring the Compass Manager back into a productive and healthy state.
    • Present the results to the team to collect feedback from colleagues.
  • Implement the identified business metrics in the Compass Manager
  • Configure alerting rules which inform the team as soon as one of the thresholds is reached
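
One possible way to capture the outcome of the criteria above is a small metric catalog that records, per metric, the reason, the thresholds, and the required action. The following is only a sketch under assumptions: the metric names, threshold values, and actions are hypothetical placeholders and are not taken from the Compass Manager, and the real implementation would most likely live in the monitoring stack rather than in a script like this.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MetricSpec:
    """One catalog entry: what is measured, why, its limits, and the response."""
    name: str                           # hypothetical metric name
    reason: str                         # why the metric is relevant
    action: str                         # what to do once a threshold is reached
    min_value: Optional[float] = None   # lower threshold, None if not defined
    max_value: Optional[float] = None   # upper threshold, None if not defined

    def has_threshold(self) -> bool:
        return self.min_value is not None or self.max_value is not None

    def breached(self, value: float) -> bool:
        # Reaching a threshold already counts, so both comparisons are inclusive.
        if self.min_value is not None and value <= self.min_value:
            return True
        if self.max_value is not None and value >= self.max_value:
            return True
        return False


def evaluate(catalog: List[MetricSpec], samples: Dict[str, float]) -> List[Tuple[str, str]]:
    """Return (metric name, action) pairs for every metric whose threshold is reached."""
    alerts = []
    for spec in catalog:
        if spec.name in samples and spec.breached(samples[spec.name]):
            alerts.append((spec.name, spec.action))
    return alerts


# All entries below are illustrative assumptions, not agreed values.
CATALOG = [
    MetricSpec(
        name="compass_registration_failure_ratio",
        reason="Share of failed registrations; shows whether the core flow works.",
        action="Check the connectivity to the upstream service and recent deployments.",
        max_value=0.05,
    ),
    MetricSpec(
        name="compass_reconcile_duration_seconds",
        reason="Long reconciliations delay the availability of new resources.",
        action="Inspect the controller logs and the queue length.",
        max_value=30.0,
    ),
    MetricSpec(
        name="compass_registered_runtimes_total",
        reason="Overall volume; no threshold yet, so its usefulness needs review.",
        action="None defined.",
    ),
]

if __name__ == "__main__":
    current = {
        "compass_registration_failure_ratio": 0.10,
        "compass_reconcile_duration_seconds": 12.0,
    }
    for metric, action in evaluate(CATALOG, current):
        print(f"ALERT {metric}: {action}")
    for spec in CATALOG:
        if not spec.has_threshold():
            print(f"REVIEW {spec.name}: no threshold defined")
```

Keeping the reason and the action next to each threshold means an alert can carry its own remediation hint, and entries without a threshold are easy to list for the review mentioned above.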

Reasons

Improve operational quality and simplify on-call shifts by establishing proper metric/KPI measurement and alerting.

Attachments

tobiscr changed the title from "Setup business critical metrics and alerting" to "Setup business critical metrics, define an action plan and configure alerting rules" on Dec 27, 2023
tobiscr changed the title from "Setup business critical metrics, define an action plan and configure alerting rules" to "Identify and implement business critical metrics / KPIs, define an action plan and configure alerting rules" on Dec 27, 2023