Draft: kaas logging monitoring tracing #656
…g/Tracing" Signed-off-by: Toni Finger <[email protected]>
SCS KaaS infrastructure monitoring SHOULD be used as a diagnostic tool to alert operators and end users to system-related issues by analyzing metrics.
Therefore, it includes the collection and visualization of the corresponding metrics.
Optionally, an alerting mechanism COULD also be standardized.
IMO, we also need alerting to be mandatory when we say:
> SCS KaaS infrastructure monitoring SHOULD be used as a diagnostic tool to alert operators and end users...
This concept SHALL define monitoring and logging in a federated structure.
Therefore, a monitoring and logging stack MUST be deployed on each k8s cluster.
A central monitoring can then fetch data from the clusters individual monitoring stacks.
I would suggest rephrasing this sentence to something like this:
> A central monitoring system can then fetch data from the individual clusters' monitoring stacks to Grafana to visualize the collected metrics.
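One possible way to realize "a central monitoring system fetches data from the individual clusters' monitoring stacks" is Prometheus federation, where a central server scrapes each cluster's `/federate` endpoint with one or more `match[]` selectors. The PR does not prescribe federation, so the following is only a minimal sketch under that assumption; the cluster hostnames and the selector are hypothetical placeholders.

```python
from urllib.parse import urlencode


def federate_url(base_url: str, selectors: list[str]) -> str:
    """Build a Prometheus /federate URL for one cluster.

    Each selector is passed as a separate `match[]` query parameter,
    which is how the federation endpoint expects series selectors.
    """
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


# Hypothetical per-cluster Prometheus endpoints (not taken from the PR).
CLUSTERS = {
    "cluster-a": "https://prometheus.cluster-a.example.invalid",
    "cluster-b": "https://prometheus.cluster-b.example.invalid/",
}

# Hypothetical selector: federate only series from the kubelet job.
SELECTORS = ['{job="kubelet"}']

FEDERATE_TARGETS = {
    name: federate_url(url, SELECTORS) for name, url in CLUSTERS.items()
}
```

A central Prometheus (or any scraper) could then pull from the URLs in `FEDERATE_TARGETS`, and Grafana would visualize the central data. Whether federation, remote write, or another mechanism fits best is left open by the text under review.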
#### Kubernetes Metric Server

Kubernetes provides a source for container resource metrics.
The main purpose of this source is to be used for Kubernetes' built-in auto-scaling [kubernetes-metrics-server][kubernetes-metrics-server-repo].
`kubernetes-metrics-server-repo` is missing at the end of the file, where the other links are defined.
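Missing link definitions like this one can be caught mechanically. The following is a rough sketch, not an existing linter: it only handles full reference links of the form `[text][label]` and definitions of the form `[label]: target`, and the sample URL is a placeholder.

```python
import re

# Full reference links: [text][label]
REF_USE = re.compile(r"\[[^\]]+\]\[([^\]]+)\]")
# Link definitions at the start of a line: [label]: target
REF_DEF = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:[ \t]*\S+", re.MULTILINE)


def undefined_link_labels(markdown: str) -> list[str]:
    """Return labels that are used in [text][label] links but never defined."""
    used = {label.lower() for label in REF_USE.findall(markdown)}
    defined = {label.lower() for label in REF_DEF.findall(markdown)}
    return sorted(used - defined)


# Sample text modelled on the lines under review; the URL is a placeholder.
SAMPLE = """\
... auto-scaling [kubernetes-metrics-server][kubernetes-metrics-server-repo].
... have a [prometheus-operator][prometheus-operator] deployed ...

[prometheus-operator]: https://example.invalid/prometheus-operator
"""

MISSING = undefined_link_labels(SAMPLE)
```

For the sample above, `MISSING` contains only `kubernetes-metrics-server-repo`. Labels are compared case-insensitively, as Markdown does; collapsed (`[label][]`) and shortcut (`[label]`) references are deliberately ignored here.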
#### Prometheus Operator

One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus
- One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus
+ One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus.
Therefore, every k8s cluster COULD have a [prometheus-operator][prometheus-operator] deployed to all control plane clusters as an optional default.
The operator SHOULD at least be rolled out to all control plane nodes.
Can you please explain what is meant by these two sentences? Is it something like: the prometheus-operator should/could be deployed on each k8s cluster, and prometheus-node-exporter should at least be rolled out on the control plane nodes?
Co-authored-by: Michal Gubricky <[email protected]> Signed-off-by: tonifinger <[email protected]>
Add a decision record to configure monitoring for the SCS KaaS layer.
Addresses: SovereignCloudStack/issues#418