More issues with Thanos Querier #93

Open
naved001 opened this issue Dec 10, 2024 · 0 comments

@naved001
Collaborator

The issue this time was that the metrics didn't include the node for a scheduled pod.

The pod is mma-ai-7489bc5b98-ccf64 in the namespace nerc-demo-5b7ce1.

If you run the query kube_pod_resource_request{unit="cores"} unless on(pod, namespace) kube_pod_status_unschedulable for December 9, 2024 against the Thanos Querier endpoint, the result for this pod doesn't include the node name. A missing node name isn't a problem for CPU metrics, but it is a problem for GPU metrics. I then ran the same query against the Prometheus endpoint, which correctly returned the associated node.
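For anyone trying to reproduce this, here is a rough Python sketch of the comparison. It assumes both endpoints expose the standard Prometheus HTTP API (/api/v1/query_range) and that a bearer token is enough for authentication. The base URLs in the usage comment and the helper names (build_range_url, fetch_series, pods_missing_node) are hypothetical and not part of any existing tooling in this repo.

```python
import json
import urllib.parse
import urllib.request

# The query from this issue, unchanged.
QUERY = (
    'kube_pod_resource_request{unit="cores"} '
    "unless on(pod, namespace) kube_pod_status_unschedulable"
)


def build_range_url(base_url, query, start, end, step="15m"):
    """Build a /api/v1/query_range URL for a Prometheus-compatible endpoint."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def fetch_series(url, token=None):
    """Run the range query and return the list of series in the response."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["data"]["result"]


def pods_missing_node(series):
    """Return sorted (namespace, pod) pairs whose series lack a non-empty node label."""
    missing = set()
    for entry in series:
        labels = entry["metric"]
        if not labels.get("node"):
            missing.add((labels.get("namespace", ""), labels.get("pod", "")))
    return sorted(missing)


# Example usage (hypothetical URLs, adjust to the real endpoints):
# start, end = "2024-12-09T00:00:00Z", "2024-12-10T00:00:00Z"
# thanos = fetch_series(build_range_url("https://thanos.example", QUERY, start, end))
# prom = fetch_series(build_range_url("https://prometheus.example", QUERY, start, end))
# print("missing in Thanos:", pods_missing_node(thanos))
# print("missing in Prometheus:", pods_missing_node(prom))
```

If the two printed lists differ for the same time window, that would confirm the node label is being lost somewhere on the Thanos side rather than at the source.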

As a workaround, I was thinking I could do something like:

kube_pod_resource_request{unit="cores", node!=""} unless on(pod, namespace) kube_pod_status_unschedulable

but this just explicitly ignores pods without a node name and is not ideal.
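One possible alternative to dropping those series, sketched here only as an idea: keep the Thanos results and fill in the missing node label from a second result set (for example the same query run against Prometheus), matching on (namespace, pod). The function name backfill_nodes is hypothetical, the input is assumed to be the list of series from a query_range response, and the sketch assumes each (namespace, pod) pair maps to a single node within the queried window.

```python
def backfill_nodes(primary, reference):
    """Copy the node label from reference series onto primary series that lack it.

    Series are matched on (namespace, pod). Returns (patched, unresolved), where
    unresolved lists the (namespace, pod) pairs that have no node in either input.
    """
    # Index the reference series that do carry a node label.
    node_by_pod = {}
    for entry in reference:
        labels = entry["metric"]
        if labels.get("node"):
            node_by_pod[(labels.get("namespace"), labels.get("pod"))] = labels["node"]

    patched, unresolved = [], []
    for entry in primary:
        labels = dict(entry["metric"])  # copy so the input is not mutated
        key = (labels.get("namespace"), labels.get("pod"))
        if not labels.get("node"):
            if key in node_by_pod:
                labels["node"] = node_by_pod[key]
            else:
                unresolved.append(key)
        patched.append({**entry, "metric": labels})
    return patched, unresolved
```

Anything left in unresolved could be reported instead of silently ignored, which is the main drawback of the node!="" filter.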

This is the third issue I've run into when gathering data from Thanos that didn't affect Prometheus.
