Draft: kaas logging monitoring tracing #656

tonifinger · 2024-07-10T09:25:37Z

Add a decision record to configure monitoring for the SCS KaaS layer.

Addresses: SovereignCloudStack/issues#418

…g/Tracing" Signed-off-by: Toni Finger <[email protected]>

Signed-off-by: Toni Finger <[email protected]>

michal-gubricky · 2024-10-31T12:12:15Z

Standards/scs-0219-v1-k8s-monitoring-logging-tracing.md

+
+SCS KaaS infrastructure monitoring SHOULD be used as a diagnostic tool to alert operators and end users to system-related issues by analyzing metrics.
+Therefore, it includes the collection and visualization of the corresponding metrics.
+Optionally, an alerting mechanism COULD also be standardized.


IMO, alerting also needs to be mandatory if we say:

SCS KaaS infrastructure monitoring SHOULD be used as a diagnostic tool to alert operators and end users...

michal-gubricky · 2024-10-31T12:25:30Z

Standards/scs-0219-v1-k8s-monitoring-logging-tracing.md

+
+This concept SHALL define monitoring and logging in a federated structure.
+Therefore, a monitoring and logging stack MUST be deployed on each k8s cluster. 
+A central monitoring can then fetch data from the clusters individual monitoring stacks.


I would suggest rephrasing this sentence to something like this:

A central monitoring system can then fetch data from the individual clusters' monitoring stacks and visualize the collected metrics in Grafana.
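
For illustration only, here is a minimal sketch of how a central system might address the per-cluster stacks, assuming each cluster runs a Prometheus server that exposes the standard `/federate` endpoint. The cluster names, hostnames, and the `node-exporter` job selector are hypothetical and do not come from the draft standard; the draft does not prescribe this mechanism.

```python
from urllib.parse import urlencode


def federate_url(base_url, selectors):
    """Build a Prometheus /federate URL with one match[] parameter per selector."""
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


# Hypothetical per-cluster Prometheus endpoints (assumption, not from the draft).
clusters = {
    "cluster-a": "http://prometheus.cluster-a.example:9090",
    "cluster-b": "http://prometheus.cluster-b.example:9090/",
}

# One federation URL per cluster; a central scraper or script would fetch these.
urls = {
    name: federate_url(base, ['{job="node-exporter"}'])
    for name, base in clusters.items()
}

for name, url in urls.items():
    print(name, url)
```

The sketch only builds the URLs; how the central system scrapes them and which series it selects would still need to be decided in the standard.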

michal-gubricky · 2024-10-31T12:29:44Z

Standards/scs-0219-v1-k8s-monitoring-logging-tracing.md

+#### Kubernetes Metric Server 
+
+Kubernetes provides a source for container resource metrics. 
+The main purpose of this source is to be used for Kubernetes' built-in auto-scaling [kubernetes-metrics-server][kubernetes-metrics-server-repo].


The link reference [kubernetes-metrics-server-repo] is not defined at the end of the file, where the other link references are defined.
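
A check like this could catch such cases automatically. The following is a rough sketch, assuming the document only uses the full `[text][label]` reference style and `[label]: target` definitions; the sample text and the placeholder URL are made up for illustration and are not the actual file content.

```python
import re

# Usages of the form [text][label]; the collapsed form [text][] is not handled here.
REF_USE = re.compile(r"\[[^\]]+\]\[([^\]]+)\]")
# Definitions of the form "[label]: target" at the start of a line.
REF_DEF = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:\s*\S+", re.MULTILINE)


def undefined_references(markdown):
    """Return the sorted labels that are used but never defined (case-insensitive)."""
    used = {label.lower() for label in REF_USE.findall(markdown)}
    defined = {label.lower() for label in REF_DEF.findall(markdown)}
    return sorted(used - defined)


# Hypothetical excerpt; the URL is a placeholder, not the real link target.
sample = """\
... auto-scaling [kubernetes-metrics-server][kubernetes-metrics-server-repo].
... a [prometheus-operator][prometheus-operator] deployed ...

[prometheus-operator]: https://example.invalid/prometheus-operator
"""

print(undefined_references(sample))
```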

michal-gubricky · 2024-10-31T12:34:23Z

Standards/scs-0219-v1-k8s-monitoring-logging-tracing.md

+
+#### Prometheus Operator
+
+One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus


Suggested change

-One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus
+One of the most commonly used monitoring tools in connection with Kubernetes is Prometheus.

michal-gubricky · 2024-10-31T13:18:05Z

Standards/scs-0219-v1-k8s-monitoring-logging-tracing.md

+Therefore, every k8s cluster CLOUD have a [prometheus-operator][prometheus-operator] deployed to all control plane clusters as an optional default.
+The operator SHOULD at least be rolled out to all control plane nodes.


Can you please explain what is meant by these two sentences? Is it something like this: the Prometheus operator should/could be deployed on each k8s cluster, and prometheus-node-exporter should at least be rolled out on the control plane nodes?

tonifinger added 4 commits March 27, 2024 10:15

Inital development of decision record on "KubernetesLogging/Monitorin…

116bd82

…g/Tracing" Signed-off-by: Toni Finger <[email protected]>

Added more detail regarding "KubernetesLogging/Monitoring/Tracing"

838f32a

Signed-off-by: Toni Finger <[email protected]>

Add proposal to standardize the use of the Kubernetes metrics server

7cde0eb

Signed-off-by: Toni Finger <[email protected]>

Adding additional information

eb64ae2

Signed-off-by: Toni Finger <[email protected]>

tonifinger added the SCS-VP10 Related to tender lot SCS-VP10 label Jul 10, 2024

tonifinger linked an issue Jul 10, 2024 that may be closed by this pull request

[Standardization] KaaS Logging/Monitoring/Tracing SovereignCloudStack/issues#418

Open

2 tasks

tonifinger self-assigned this Jul 10, 2024

anjastrunk requested a review from piobig2871 September 4, 2024 11:57

michal-gubricky self-requested a review October 8, 2024 13:59

michal-gubricky reviewed Oct 31, 2024

Update Standards/scs-0219-v1-k8s-monitoring-logging-tracing.md

cd53e9d

Co-authored-by: Michal Gubricky <[email protected]> Signed-off-by: tonifinger <[email protected]>
