Environment:
Kubernetes version (use kubectl version): 1.28.12
CNI Version: 1.16.3
OS (e.g: cat /etc/os-release): Bottlerocket 1.21.0
Kernel (e.g. uname -a): x86_64 GNU/Linux
What happened:
Some of the Prometheus metrics exported by the VPC CNI plugin are defined with inaccurate metric types. For example:
amazon-vpc-cni-k8s/utils/prometheusmetrics/prometheusmetrics.go, line 64 in 27ce136
This metric (`awscni_add_ip_req_count`) is exported as a gauge but it has cumulative incremental values. In fact, it seems that it's used as a counter in:
amazon-vpc-cni-k8s/pkg/ipamd/rpc_handler.go, line 70 in 27ce136
It seems that `awscni_del_ip_req_count` is correctly exported as a counter.
I probably don't have enough context on this to make a judgement call. However, I think there are probably more Gauges that are operating as Counters.
Attach logs
N/A
What you expected to happen:
I'd expect metrics to follow the semantic conventions defined in https://prometheus.io/docs/concepts/metric_types/
How to reproduce it (as minimally and precisely as possible):
Using Prometheus exporters.
Anything else we need to know?:
This may not be a critical issue if systems use Prometheus as the backend. However, it becomes a problem when Prometheus metrics are transformed into other representations. For example, OpenTelemetry Collectors will read this as a Gauge, and that gives the aggregation a different meaning (e.g. one can change the temporality of counters from cumulative to delta or vice versa).