tracking: refactor metrics with LabelGuarded #14838

Open
5 of 6 tasks
fuyufjh opened this issue Jan 29, 2024 · 6 comments

@fuyufjh
Member

fuyufjh commented Jan 29, 2024

Background

LabelGuardedMetricVec was introduced in #13080. It enhances MetricVec to ensure that a set of labels is correctly removed from the Prometheus client once it is dropped. This is useful for metrics associated with an object that can be dropped, such as streaming jobs, fragments, actors, batch tasks, etc.

When a set of labels is dropped, it is recorded in the uncollected_removed_labels set. Once the metrics have been collected, the metrics for those labels are finally removed.
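The drop-then-collect-then-remove flow described above can be sketched roughly as follows. This is a minimal, std-only illustration and not the actual implementation from #13080: `GuardedCounterVec`, `GuardedCounter`, `Inner`, and the `collect` method are hypothetical names, and only the `uncollected_removed_labels` field name comes from the description above. It also assumes at most one live guard per label set.

```rust
use std::collections::{BTreeMap, HashSet};
use std::sync::{Arc, Mutex};

/// Label values identifying one series, e.g. ["actor-1"].
type Labels = Vec<String>;

#[derive(Default)]
struct Inner {
    /// Current counter value per label set.
    values: BTreeMap<Labels, u64>,
    /// Label sets whose guard was dropped but whose final value
    /// has not been collected yet.
    uncollected_removed_labels: HashSet<Labels>,
}

/// Hypothetical stand-in for a label-guarded counter vec.
#[derive(Clone, Default)]
struct GuardedCounterVec {
    inner: Arc<Mutex<Inner>>,
}

/// Handle for one label set; dropping it schedules the series for removal.
struct GuardedCounter {
    labels: Labels,
    inner: Arc<Mutex<Inner>>,
}

impl GuardedCounterVec {
    fn with_label_values(&self, labels: &[&str]) -> GuardedCounter {
        let labels: Labels = labels.iter().map(|s| s.to_string()).collect();
        let mut inner = self.inner.lock().unwrap();
        inner.values.entry(labels.clone()).or_insert(0);
        // A re-created series must not be removed by a stale drop record.
        inner.uncollected_removed_labels.remove(&labels);
        drop(inner);
        GuardedCounter { labels, inner: Arc::clone(&self.inner) }
    }

    /// Snapshot every series, then remove the ones whose guards were dropped,
    /// so the final value is reported exactly once more.
    fn collect(&self) -> Vec<(Labels, u64)> {
        let mut inner = self.inner.lock().unwrap();
        let snapshot: Vec<(Labels, u64)> =
            inner.values.iter().map(|(k, v)| (k.clone(), *v)).collect();
        let removed: Vec<Labels> = inner.uncollected_removed_labels.drain().collect();
        for labels in removed {
            inner.values.remove(&labels);
        }
        snapshot
    }
}

impl GuardedCounter {
    fn inc(&self) {
        let mut inner = self.inner.lock().unwrap();
        *inner.values.entry(self.labels.clone()).or_insert(0) += 1;
    }
}

impl Drop for GuardedCounter {
    fn drop(&mut self) {
        // Only record the label set here; the actual removal is deferred to `collect`.
        if let Ok(mut inner) = self.inner.lock() {
            inner.uncollected_removed_labels.insert(self.labels.clone());
        }
    }
}
```

In this sketch, the first `collect` after a guard is dropped still reports the series' last value, and later calls no longer include it. Deferring the removal this way means the final value is not lost between the drop and the next scrape.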

To-dos

Technically, all usages of plain MetricVec for a droppable object (streaming jobs, fragments, actors, batch tasks, etc.) need to be replaced with LabelGuardedMetricVec.

@github-actions github-actions bot added this to the release-1.7 milestone Jan 29, 2024
@fuyufjh fuyufjh changed the title refactor: metrics with LabelGuarded tracking: refactor metrics with LabelGuarded Jan 29, 2024
@BugenZhao
Member

Could be related: #13086

@fuyufjh
Member Author

fuyufjh commented Mar 6, 2024

related #14821

@xxchan
Member

xxchan commented Mar 7, 2024

So currently, when a streaming job is dropped, its metrics will be leaked (i.e., Prometheus keeps collecting some useless data, which is always zero-valued), right?

@fuyufjh
Member Author

fuyufjh commented Mar 7, 2024

So currently, when a streaming job is dropped, its metrics will be leaked (i.e., Prometheus keeps collecting some useless data, which is always zero-valued), right?

True. Some of them have been fixed (for example, see StreamingMetrics). Anyone taking this issue, please help check whether the remaining usages are correct.

@lmatz
Contributor

lmatz commented Mar 7, 2024

which is always zero valued

I have observed some non-zero constant values in Grafana, although I am not sure whether they have the same root cause.

@xxchan
Member

xxchan commented Mar 8, 2024

Yes, the value should be constant, but not necessarily zero.

You can verify this by checking localhost:1222, e.g., look for stream_mview_input_row_count for a dropped actor.
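To make that check less manual, here is a small sketch that filters Prometheus text-format output for the series of one metric carrying a given label value. It assumes the endpoint's output has already been saved into a string. The helper name `matching_series` and the `actor_id` / `table_id` label names in the usage note are assumptions for illustration, not confirmed label names.

```rust
/// Return the sample lines of `metric` whose label set contains `label="value"`.
/// Hypothetical helper: it does plain string matching on the text exposition
/// format and does not handle escaped quotes inside label values.
fn matching_series<'a>(
    exposition: &'a str,
    metric: &str,
    label: &str,
    value: &str,
) -> Vec<&'a str> {
    // The label may be the first one (after '{') or a later one (after ',').
    let first = format!("{{{label}=\"{value}\"");
    let later = format!(",{label}=\"{value}\"");
    exposition
        .lines()
        // Skip `# HELP` and `# TYPE` comment lines.
        .filter(|line| !line.starts_with('#'))
        // Keep only samples of exactly this metric that carry labels.
        .filter(|line| {
            line.strip_prefix(metric)
                .map_or(false, |rest| rest.starts_with('{'))
        })
        .filter(|line| line.contains(&first) || line.contains(&later))
        .collect()
}
```

For example, `matching_series(text, "stream_mview_input_row_count", "actor_id", "1")` would return any remaining series for actor 1. If a dropped actor still shows up with the same value across several scrapes, its labels were likely never removed.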


5 participants