Metric API proposal to cover typical Kubernetes metrics #1001

Closed
a-thaler opened this issue Apr 22, 2024 · 2 comments
Labels: area/metrics (MetricPipeline), kind/decision (Marks a decision document)

@a-thaler
Collaborator

a-thaler commented Apr 22, 2024

Description
As outlined in #972, typical metrics should be easy to collect so that typical Kubernetes workload monitoring becomes possible. The metrics should be based on the kubeletstatsreceiver and k8sclusterreceiver only.
A concrete API proposal is needed for how users enable metric collection. The proposal should consider which metrics are usually enabled together because they are used in combination. Namespace selection should apply only to metrics that are namespace-specific.
The input name should set the right expectations.

Criteria

  • Concrete API proposal for selecting the typical metrics, outlining which group of metrics is enabled by which input
  • Users should be able to limit metrics to certain namespaces
  • Users should be able to enable larger groups of metrics selectively

Ideas

  input:
    cluster:
      enabled: true
      namespaces:
        include:
        - myNamespace
    host:
      enabled: true
    runtime:
      enabled: true
      namespaces:
        include:
        - myNamespace

Reasons

Attachments

Release Notes


a-thaler added the area/metrics and kind/decision labels on Apr 22, 2024
a-thaler changed the title from "Metric API PoC to cover typical Kubernetes metrics" to "Metric API proposal to cover typical Kubernetes metrics" on Apr 22, 2024
chrkl self-assigned this on Jun 4, 2024
@chrkl
Contributor

chrkl commented Jun 7, 2024

Our discussions showed that it is hard to split the metrics from k8sclusterreceiver and kubeletstatsreceiver into the three sections cluster, host, and runtime. Instead, we propose a single input section that lets the user choose which resources to include in the metric output:

input:
  runtime:
    enabled: true
    resources:
      pod:
        enabled: true
      container:
        enabled: true
      node:
        enabled: true
      volume:
        enabled: true
      daemonset:
        enabled: false
      deployment:
        enabled: false
      statefulset:
        enabled: false
      quota:
        enabled: false
      job:
        enabled: false
      hpa:
        enabled: false
    namespaces:
      include:
      - myNamespace
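
To show where this fragment might sit, here is a minimal sketch of a complete MetricPipeline resource. Only the `spec.input.runtime` part reflects the proposal; the `apiVersion`, `kind`, resource name, and the `output` section are assumptions added for illustration, and the endpoint is a placeholder. The proposal does not state what happens to resources that are left out, so the sketch lists only a subset and makes no claim about defaults.

```yaml
# Sketch only: apiVersion, kind, metadata, and output are assumptions;
# only spec.input.runtime follows the proposal in this comment.
apiVersion: telemetry.kyma-project.io/v1alpha1
kind: MetricPipeline
metadata:
  name: workload-metrics          # hypothetical name
spec:
  input:
    runtime:
      enabled: true
      resources:
        pod:
          enabled: true           # pod-level usage and phase metrics
        container:
          enabled: true           # container requests, limits, and usage
        node:
          enabled: false          # node metrics switched off explicitly
        volume:
          enabled: false
      namespaces:
        include:
        - myNamespace             # limit collection to this namespace
  output:
    otlp:
      endpoint:
        value: http://otel-backend.example:4317   # placeholder endpoint
```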

The individual resources should include the following metrics:

pod

k8s.pod.cpu.time (kubeletstats)
k8s.pod.cpu.utilization (kubeletstats)
k8s.pod.filesystem.available (kubeletstats)
k8s.pod.filesystem.capacity (kubeletstats)
k8s.pod.filesystem.usage (kubeletstats)
k8s.pod.memory.available (kubeletstats)
k8s.pod.memory.usage (kubeletstats)
k8s.pod.network.errors (kubeletstats)
k8s.pod.network.io (kubeletstats)
k8s.pod.cpu.usage (kubeletstats)
k8s.pod.phase (k8scluster)

container

k8s.container.cpu_limit (k8scluster)
k8s.container.cpu_request (k8scluster)
k8s.container.ephemeralstorage_limit (k8scluster)
k8s.container.ephemeralstorage_request (k8scluster)
k8s.container.memory_limit (k8scluster)
k8s.container.memory_request (k8scluster)
k8s.container.restarts (k8scluster)
container.cpu.utilization (kubeletstats)
container.cpu.usage (kubeletstats)
container.cpu.time (kubeletstats)
container.filesystem.available (kubeletstats)
container.filesystem.capacity (kubeletstats)
container.filesystem.usage (kubeletstats)
container.memory.usage (kubeletstats)

node

k8s.node.cpu.utilization (kubeletstats)
k8s.node.cpu.usage (kubeletstats)
k8s.node.filesystem.available (kubeletstats)
k8s.node.filesystem.capacity (kubeletstats)
k8s.node.filesystem.usage (kubeletstats)
k8s.node.memory.available (kubeletstats)
k8s.node.network.errors (kubeletstats)
k8s.node.network.io (kubeletstats)

volume

k8s.volume.available (kubeletstats)
k8s.volume.capacity (kubeletstats)

daemonset

k8s.daemonset.current_scheduled_nodes (k8scluster)
k8s.daemonset.desired_scheduled_nodes (k8scluster)
k8s.daemonset.misscheduled_nodes (k8scluster)
k8s.daemonset.ready_nodes (k8scluster)

deployment

k8s.deployment.available (k8scluster)
k8s.deployment.desired (k8scluster)

statefulset

k8s.statefulset.current_pods (k8scluster)
k8s.statefulset.desired_pods (k8scluster)
k8s.statefulset.ready_pods (k8scluster)
k8s.statefulset.updated_pods (k8scluster)

quota

k8s.resource_quota.hard_limit (k8scluster)
k8s.resource_quota.used (k8scluster)

job

k8s.cronjob.active_jobs (k8scluster)
k8s.job.active_pods (k8scluster)
k8s.job.desired_successful_pods (k8scluster)
k8s.job.failed_pods (k8scluster)
k8s.job.max_parallel_pods (k8scluster)
k8s.job.successful_pods (k8scluster)

hpa

k8s.hpa.current_replicas (k8scluster)
k8s.hpa.desired_replicas (k8scluster)
k8s.hpa.max_replicas (k8scluster)
k8s.hpa.min_replicas (k8scluster)

Metrics marked in bold are already part of the runtime input (release 1.17).
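
As an illustration of how the resource toggles could map onto the two receivers, the following OpenTelemetry Collector fragment is a rough sketch. It assumes that kubelet-backed resources are selected through the `metric_groups` option of the kubeletstats receiver and that disabled cluster-level resources are switched off metric by metric in the k8s_cluster receiver. How the configuration is actually generated is not described in this issue, and the `K8S_NODE_NAME` environment variable is an assumption.

```yaml
# Sketch only: one possible translation of the proposed toggles,
# not the configuration the telemetry module actually produces.
receivers:
  kubeletstats:
    auth_type: serviceAccount
    endpoint: "https://${env:K8S_NODE_NAME}:10250"   # assumed env var
    metric_groups:               # one group per enabled kubelet-backed resource
      - pod
      - container
      - node
      - volume
  k8s_cluster:
    metrics:
      # daemonset.enabled is false in the proposal, so its metrics are disabled
      k8s.daemonset.current_scheduled_nodes:
        enabled: false
      k8s.daemonset.desired_scheduled_nodes:
        enabled: false
      k8s.daemonset.misscheduled_nodes:
        enabled: false
      k8s.daemonset.ready_nodes:
        enabled: false
```

The same per-metric pattern would repeat for every other resource that is toggled off (deployment, statefulset, quota, job, hpa).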

@chrkl
Contributor

chrkl commented Jun 10, 2024

The proposal shown above will be implemented as a MetricPipeline input in a follow-up.
