
Draft: kaas logging monitoring tracing #656

Draft: wants to merge 5 commits into base: main

tonifinger

Add a decision record to configure monitoring for the SCS KaaS layer.

Addresses: SovereignCloudStack/issues#418

tonifinger added the SCS-VP10 (Related to tender lot SCS-VP10) label on Jul 10, 2024
tonifinger linked an issue on Jul 10, 2024 that may be closed by this pull request (2 tasks)
tonifinger self-assigned this on Jul 10, 2024
Standards/scs-0219-v1-k8s-monitoring-logging-tracing.md (outdated)

SCS KaaS infrastructure monitoring SHOULD be used as a diagnostic tool to alert operators and end users to system-related issues by analyzing metrics.
Therefore, it includes the collection and visualization of the corresponding metrics.
Optionally, an alerting mechanism COULD also be standardized.

IMO, we also need alerting to be mandatory when we say:

SCS KaaS infrastructure monitoring SHOULD be used as a diagnostic tool to alert operators and end users...


This concept SHALL define monitoring and logging in a federated structure.
Therefore, a monitoring and logging stack MUST be deployed on each k8s cluster.
A central monitoring can then fetch data from the clusters individual monitoring stacks.

I would suggest rephrasing this sentence to something like this:

A central monitoring system can then fetch data from the individual clusters' monitoring stacks to Grafana to visualize the collected metrics.
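
To make the federated setup concrete, here is a minimal sketch of how a central Prometheus could pull from each cluster's own Prometheus via the `/federate` endpoint. It assumes Prometheus federation is the fetch mechanism, which the draft does not state. The job name, target hostnames, and match selectors are hypothetical placeholders.

```python
import json


def federation_scrape_config(cluster_targets, match_selectors):
    """Build one Prometheus scrape job that pulls from per-cluster /federate endpoints."""
    return {
        "job_name": "kaas-federation",  # hypothetical job name
        "honor_labels": True,  # keep the labels set by the source clusters
        "metrics_path": "/federate",
        # Each selector limits which series the central instance pulls.
        "params": {"match[]": list(match_selectors)},
        "static_configs": [{"targets": list(cluster_targets)}],
    }


if __name__ == "__main__":
    job = federation_scrape_config(
        # Hypothetical per-cluster Prometheus endpoints.
        ["prometheus.cluster-a.example:9090", "prometheus.cluster-b.example:9090"],
        # Hypothetical selectors; the standard would have to name the real ones.
        ['{job="kubelet"}', '{job="node-exporter"}'],
    )
    print(json.dumps({"scrape_configs": [job]}, indent=2))
```

Grafana would then use the central instance as its data source. Whether federation or another mechanism (for example remote write) is intended is something the decision record should probably spell out.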

#### Kubernetes Metric Server

Kubernetes provides a source for container resource metrics.
The main purpose of this source is to be used for Kubernetes' built-in auto-scaling [kubernetes-metrics-server][kubernetes-metrics-server-repo].

kubernetes-metrics-server-repo is missing at the end of the file where other links are defined.
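
A check like the following could catch this kind of dangling reference before review. It is only a rough sketch that scans a Markdown string for `[text][label]` usages without a matching `[label]: target` definition; it does not implement the full CommonMark rules, and the function name and sample URL are made up for illustration.

```python
import re

# [text][label] usages; an empty label ([text][]) falls back to the text itself.
REF_USE = re.compile(r"\[([^\]]+)\]\[([^\]]*)\]")
# "[label]: target" definitions at the start of a line (up to 3 leading spaces).
REF_DEF = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:\s*\S+", re.MULTILINE)


def undefined_link_labels(markdown_text):
    """Return reference-link labels that are used but never defined, sorted."""
    defined = {m.group(1).strip().lower() for m in REF_DEF.finditer(markdown_text)}
    used = set()
    for m in REF_USE.finditer(markdown_text):
        text, label = m.group(1), m.group(2)
        used.add((label or text).strip().lower())
    return sorted(used - defined)


if __name__ == "__main__":
    # Hypothetical excerpt; the URL is a placeholder, not the real link target.
    sample = (
        "Used for auto-scaling [kubernetes-metrics-server][kubernetes-metrics-server-repo].\n"
        "Deploy the [prometheus-operator][prometheus-operator].\n"
        "\n"
        "[prometheus-operator]: https://example.invalid/prometheus-operator\n"
    )
    print(undefined_link_labels(sample))
```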


#### Prometheus Operator

One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus

Suggested change
One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus
One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus.

Comment on lines +64 to +65
Therefore, every k8s cluster COULD have a [prometheus-operator][prometheus-operator] deployed to all control plane clusters as an optional default.
The operator SHOULD at least be rolled out to all control plane nodes.

Can you please explain what is meant by these two sentences? Do you mean that the Prometheus operator should/could be deployed in each k8s cluster, and that prometheus-node-exporter should at least be rolled out on the control plane nodes?

Labels
SCS-VP10 Related to tender lot SCS-VP10
Projects
Status: Doing
Development

Successfully merging this pull request may close these issues.

[Standardization] KaaS Logging/Monitoring/Tracing
2 participants