Proposal: Add type and unit metadata labels #39

dashpole · 2024-10-24T15:20:13Z

Some screenshots from the Prometheus UI after this PoC: prometheus/prometheus#15683

The PoC updates all PromQL tests to demonstrate that adding type and unit labels doesn't break existing queries.

Querying for `__unit__`

Signed-off-by: David Ashpole <[email protected]>

ArthurSens

Great start! I think we need to discuss a bit more whether to handle __* labels in the PromQL engine or the UI. Maybe there's a chance we're doing a breaking change...?

proposals/2024-09-25_metadata-labels.md

Signed-off-by: David Ashpole <[email protected]>

beorn7

Thank you very much for proposing this. I love it in general, and as said elsewhere, I had a very similar idea a while ago for completely different reasons (mostly to avoid mixed series with native histograms and float-based sample types). Finding a similar solution for unrelated problems seems like a signal that this could be actually useful. However, I have also run into some conundrums, which was the reason why I put the whole idea on the backburner. I'll try to sketch out my train of thought here:

The __name__ label is the precedent here for "special labels", and as such it gives us a hint of what the issues might be. Its special power is not just about how to display it. It is treated specially for label matching and aggregation, and it is removed by most operations, following the argument that a rate of process_cpu_seconds_total is not process_cpu_seconds_total anymore. We have to ask ourselves the same questions for __unit__ and __type__.

In the brave new world, process_cpu_seconds_total would probably look like process_cpu{__type__="counter", __unit__="seconds"}. If we rate that, the outcome is not a counter anymore (but a gauge), and it is actually unit-less (seconds per second, i.e. it is the CPU usage ratio). In wishful thinking, the outcome should be process_cpu_usage{__type__="gauge", __unit__="ratio"}. Of course, we cannot easily come up with a general procedure to make up new names (which is the reason why current PromQL simply removes the name), and even changing the unit is tough. Changing the type might actually be feasible (we explored that a bit with native histograms, which "know" if they are a counter histogram or a gauge histogram). So maybe we can come up with "type translation rules" for all PromQL operations, and then maybe drop the unit when in doubt (when calculating a rate or multiplying) and keep it when it makes sense (a sum aggregation or adding). But you will notice how deep we are in the woods here already.
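To make the "type translation rules" idea concrete, here is a minimal sketch. It is purely illustrative: the rule table, the function name `translate_metadata`, and the set of unit-preserving operations are assumptions made up for this example, and nothing like it exists in PromQL today. It only encodes the two cases mentioned above (a rate of a counter becomes a unit-less gauge, a sum keeps the unit).

```python
# Hypothetical rule table: (operation, input type) -> result type.
# Only the cases discussed in this thread are listed; everything else is unknown.
RESULT_TYPE = {
    ("rate", "counter"): "gauge",
    ("sum", "gauge"): "gauge",
}

# Hypothetical set of operations for which keeping the unit "makes sense".
UNIT_PRESERVING_OPS = {"sum", "+"}


def translate_metadata(op, metric_type, unit):
    """Return (type, unit) of the result; None means "unknown, drop the label"."""
    new_type = RESULT_TYPE.get((op, metric_type))
    # When in doubt (rate, multiplication, ...), the unit is dropped.
    new_unit = unit if op in UNIT_PRESERVING_OPS else None
    return new_type, new_unit


rate_result = translate_metadata("rate", "counter", "seconds")
sum_result = translate_metadata("sum", "gauge", "bytes")
mul_result = translate_metadata("*", "gauge", "bytes")
```

Even this toy table shows the problem: every operation and every type combination needs an explicit decision, and the unit "ratio" from the wishful-thinking example cannot be derived by such a lookup at all.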

The next thing is aggregation and label matching. As said, __name__ is also special here, but we probably need to be special in different ways for __unit__ and __type__. For example if we simply do a + b, we probably want to include unit and type in the label match (only add gauges to gauges, only add seconds to seconds etc.). However, a * b is already very different. b might be a scaling factor, i.e. a unit-less gauge. a could be counters or histograms. So in this case, we would like to exclude the unit and type from the label match.
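The difference between a + b and a * b can be sketched as two different match keys. This is a simplified model under stated assumptions: series are plain dicts, matching is one-to-one on all labels, and the helper names `match_key` and `binary_match` are hypothetical. It is not how the PromQL engine is implemented.

```python
METADATA_LABELS = ("__type__", "__unit__")

# Hypothetical policy: only additive operators include type and unit in the match.
METADATA_MATCHING_OPS = {"+", "-"}


def match_key(labels, op):
    """Build the label set used to pair series on both sides of a binary operator."""
    ignored = {"__name__"}  # the name is never part of the match
    if op not in METADATA_MATCHING_OPS:
        ignored.update(METADATA_LABELS)
    return frozenset((k, v) for k, v in labels.items() if k not in ignored)


def binary_match(lhs, rhs, op):
    """Pair each left-hand series with the right-hand series sharing its match key."""
    index = {match_key(labels, op): labels for labels in rhs}
    return [
        (labels, index[match_key(labels, op)])
        for labels in lhs
        if match_key(labels, op) in index
    ]


a = {"__name__": "a", "__type__": "counter", "__unit__": "seconds", "job": "api"}
b = {"__name__": "b", "__type__": "gauge", "job": "api"}

added = binary_match([a], [b], "+")
scaled = binary_match([a], [b], "*")
```

With these inputs, the addition finds no partner because type and unit differ, while the multiplication pairs the counter with the unit-less gauge acting as a scaling factor.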

It gets very confusing quickly. I wish we could just add the __unit__ and __type__ as an experiment without further special treatment and see how people cope with it. But right now my worry is it will hit too many road bumps…

proposals/2024-09-25_metadata-labels.md

beorn7 · 2024-11-12T10:37:40Z

It gets very confusing quickly. I wish we could just add the __unit__ and __type__ as an experiment without further special treatment and see how people cope with it. But right now my worry is it will hit too many road bumps…

Still in brainstorming mode, here is an idea of what a step zero could look like: Add the __unit__ and __type__ labels upon ingestion, but mostly ignore them in PromQL operations, maybe with the following exceptions:

__unit__ and __type__ can be used in label selectors. This allows the filtering (listed as a goal in this doc) and could also be used to increase readability of a PromQL expression.
__unit__ and __type__ can still be added via label_replace, e.g. to improve display of a result or to give the result of recording rules the unit and type information.

However, aggregations and label matches would ignore __unit__ and __type__ and would therefore work as before, and any operation would remove the __unit__ and __type__ label (with the exception of label_replace), both meant to circumvent any of the issues laid out above.
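The step-zero behavior described above can be modeled in a few lines. This is a sketch under assumptions, not Prometheus code: series are plain dicts, `select` only supports equality matchers, `apply_operation` stands in for any operation that drops the name, `label_replace` is a simplified version without capture-group expansion, and the metric `process_memory` is made up for the example.

```python
import re

METADATA_LABELS = ("__type__", "__unit__")


def select(series, matchers):
    """Equality-only selector: keep series whose labels match every matcher."""
    return [s for s in series if all(s.get(k) == v for k, v in matchers.items())]


def apply_operation(labels):
    """Stand-in for any operation under step zero: drop name, type, and unit."""
    dropped = ("__name__",) + METADATA_LABELS
    return {k: v for k, v in labels.items() if k not in dropped}


def label_replace(labels, dst, replacement, src, regex):
    """Simplified label_replace: set dst to a literal if regex fully matches src."""
    if re.fullmatch(regex, labels.get(src, "")) is None:
        return dict(labels)
    return {**labels, dst: replacement}


series = [
    {"__name__": "process_cpu", "__type__": "counter", "__unit__": "seconds", "job": "api"},
    {"__name__": "process_memory", "__type__": "gauge", "__unit__": "bytes", "job": "api"},
]

selected = select(series, {"__unit__": "seconds"})     # filtering works
result = apply_operation(selected[0])                  # metadata is gone afterwards
relabeled = label_replace(result, "__unit__", "ratio", "job", ".*")  # explicit override
```

The point of the sketch is that nothing in the operation path has to understand type or unit; only the selector and the explicit relabeling touch them.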

From there, we could cautiously move forward with more ideas. A step one might then be to handle __type__ in a more "native" fashion, i.e. attaching the "correct" __type__ label depending on the operation, still allowing the user to override via label_replace (like hard-casting a type in C), but maybe enforce a histogram type on native histograms (and prevent such a type on floats), so that we would avoid mixed series. Also, taking __type__ into account during aggregations might be possible.

Signed-off-by: David Ashpole <[email protected]>

bwplotka · 2024-11-12T21:09:49Z

__unit__ and __type__ can still be added via label_replace, e.g. to improve display of a result or to give the result of recording rules the unit and type information.

Hm, I think it would be also nice to display them when collision actually occurs 🤔

Signed-off-by: David Ashpole <[email protected]>

bwplotka

Nice work! Looking good, some comments.

proposals/2024-09-25_metadata-labels.md

Signed-off-by: David Ashpole <[email protected]>

dashpole · 2024-11-26T03:47:47Z

@beorn7 I've added "Aggregations and label matches ignore __unit__ and __type__ and any operation would remove the __unit__ and __type__ label (with the exception of label_replace)." based on your suggestion above. I've also added "Handle type and unit in PromQL operations" as a potential future extension.

beorn7 · 2024-11-26T15:46:05Z

Quick note about federation: To my knowledge, federation still completely ignores metadata, so the exposition format it generates has all metric types as "untyped" and no units and help strings anyway, simply because Prometheus doesn't know about metadata post-ingestion. So being able to funnel unit and type through the Prometheus TSDB via labels would be a direct improvement for federation.

ArthurSens · 2024-11-26T22:54:11Z

Quick note about federation: To my knowledge, federation still completely ignores metadata, so the exposition format it generates has all metric types as "untyped" and no units and help strings anyway, simply because Prometheus doesn't know about metadata post-ingestion. So being able to funnel unit and type through the Prometheus TSDB via labels would be a direct improvement for federation.

I think it would be better to enable metadata post-scrape/federation than adding type and unit as labels in the exposition format. See #39 (comment)

beorn7 · 2024-11-27T11:58:02Z

Quick note about federation: To my knowledge, federation still completely ignores metadata, so the exposition format it generates has all metric types as "untyped" and no units and help strings anyway, simply because Prometheus doesn't know about metadata post-ingestion. So being able to funnel unit and type through the Prometheus TSDB via labels would be a direct improvement for federation.

I think it would be better to enable metadata post-scrape/federation than adding type and unit as labels in the exposition format. See #39 (comment)

The problem is that federation breaks the assumption that all metrics of the same name (one metric family) all have the same metadata.

bwplotka · 2024-11-27T12:57:13Z

The problem is that federation breaks the assumption that all metrics of the same name (one metric family) all have the same metadata.

Yea, I wonder if we could lift this with OM 2.0. WDYT @beorn7 ?

Related discussion: #39 (comment)

beorn7 · 2024-11-27T13:28:05Z

The problem is that federation breaks the assumption that all metrics of the same name (one metric family) all have the same metadata.

Yea, I wonder if we could lift this with OM 2.0. WDYT @beorn7 ?

The current structure of both protobuf and text versions of OM doesn't really lend itself to multiple different type, unit, or help entries for metrics with the same name. So we would need a new structural way of specifying unit and type (let's ignore help for now) per metric (rather than per metric family). Ironically, we already have a place for things that are per metric. We call it labels. So let's put type and unit into labels? Wait… that's what this proposal actually proposes. 😁
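A small sketch of why per-family metadata breaks down under federation. It assumes series are plain dicts and uses a made-up metric name `requests`; the helper `family_metadata` is hypothetical. It tries to store one type per metric name and reports the names for which that is not possible.

```python
def family_metadata(series):
    """Collect one type per metric name and report names with conflicting types."""
    type_by_name = {}
    conflicts = set()
    for labels in series:
        name = labels["__name__"]
        metric_type = labels.get("__type__")
        # setdefault keeps the first type seen for this name.
        if type_by_name.setdefault(name, metric_type) != metric_type:
            conflicts.add(name)
    return type_by_name, conflicts


# Two federated sources exposing the same name with different types.
federated = [
    {"__name__": "requests", "__type__": "counter", "cluster": "a"},
    {"__name__": "requests", "__type__": "gauge", "cluster": "b"},
]

type_by_name, conflicts = family_metadata(federated)
per_series_types = [labels["__type__"] for labels in federated]
```

The per-family table can only keep one of the two types, whereas the labels on each series carry both without any extra structure.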

bwplotka · 2024-11-27T15:06:45Z

Well yes, but we just discussed that type and unit per label will not be easily parsable in OM/text, so putting it somewhere else actually helps. There might be some big questions here:

Is the Metric Family concept actually useful these days? Should OM 2.0 simply redefine it and flatten that out? Personally, I find it only confusing; we don't use these semantics in storage or PromQL (AFAIK). 💣
What's wrong with keeping metric families but making them unique per type and unit (not only name)?

beorn7 · 2024-11-27T16:23:39Z

Well yes, but we just discussed that type and unit per label will not be easily parsable in OM/text, so putting it somewhere else actually helps.

I wouldn't say that this is actually the case. The way we parse the labels doesn't really create a lot of parsing cost if there are more labels. It does create more payload, but not so much in relative terms. The OM text format is designed to have a lot of repetitive information anyway. If we wanted to avoid repetitive information, we would need a fundamentally different structure.

Is the Metric Family concept actually useful these days? Should OM 2.0 simply redefine it and flatten that out? Personally, I find it only confusing; we don't use these semantics in storage or PromQL (AFAIK). 💣

It correctly models the per-scrape data model. It avoids (a part of the) repetition. Of course you can flatten that out, but then you get the repetition that was marked above as "not easily parsable". My claim is that "flattening that out" is pretty much equivalent to "adding that as labels".

What's wrong with keeping metric families but making them unique per type and unit (not only name)?

Well, the text format originally was designed to not be ordered in any way. So every line was keyed by the metric family name, i.e. a metric line starts with the metric family name, and all the metadata lines contain the metric family name as the 2nd token after the hash. OM became somewhat dependent on order, but not consistently so (you could avoid a lot of repetition if you did that), which is another of the design problems I see with OM. With the original idea of the text format structure, you would need to repeat not just the metric family name on each line, but the combination of metric family name, type, and unit (which, you guessed it, is equivalent to putting type and unit into a label). Alternatively, we could redesign the text format fundamentally and make it really depend on the order everywhere, but then we should do it thoroughly and avoid other repetitions, too (at the very least the metric family name).

Protobuf is a bit different as it is structured, and all the family members are a repeated Metric message inside a MetricFamily message. So keying on more than just the name is relatively easy to accomplish. This is another reason why I think we should "simply" solve the protobuf parsing performance issue and then go back to "if you want efficiency, use protobuf – if you want simplicity, use text" (simplicity as in "easy to create in a valid form", which means kicking out many of the requirements OM text introduced on top of the original text format, like order, whitespace, …).
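Keying a family on more than the name can be sketched as a grouping step. This is only an illustration with plain dicts, not the actual protobuf messages: the helper `group_families`, the defaults for a missing type or unit, and the metric name `requests` are assumptions for the example.

```python
from collections import defaultdict


def group_families(series):
    """Group series into families keyed by (name, type, unit) instead of name only."""
    families = defaultdict(list)
    for labels in series:
        key = (
            labels["__name__"],
            labels.get("__type__", "unknown"),  # assumed default when no type is set
            labels.get("__unit__", ""),         # assumed default when no unit is set
        )
        families[key].append(labels)
    return dict(families)


series = [
    {"__name__": "requests", "__type__": "counter", "cluster": "a"},
    {"__name__": "requests", "__type__": "counter", "cluster": "b"},
    {"__name__": "requests", "__type__": "gauge", "cluster": "c"},
]

families = group_families(series)
```

Here the same name yields two families, one per type, which is the structured equivalent of carrying type and unit as labels on every series.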

bwplotka · 2024-11-28T13:10:15Z

The OM text format is designed to have a lot of repetitive information anyway. If we wanted to avoid repetitive information, we would need a fundamentally different structure.

I thought it was more the fact that we usually need to know the metric type (and metric family name) ahead of time (e.g. histograms have a different flow). But you are right, perhaps we could make a label-based approach work fine. Then it's only a question of human readability and of what's defined in the SDK versus what's queried.

It correctly models the per-scrape data model.

I think I got it now. AFAIK this only makes a difference for unstructured types like classic histograms and summaries, where the metric name you define in SDKs (aka the metric family name) is != the resulting metric name. In a world with native histograms, and perhaps native summaries one day, it would make no difference, right?

My claim is that "flattening that out" is pretty much equivalent to "adding that as labels".

Agree.

Well, the text format originally was designed to not be ordered in any way. So every line was keyed by the metric family name, i.e. a metric line starts with the metric family name, and all the metadata lines contain the metric family name as the 2nd token after the hash. OM became somewhat dependent on order, but not consistently so (you could avoid a lot of repetition if you did that), which is another of the design problems I see with OM. With the original idea of the text format structure, you would need to repeat not just the metric family name on each line, but the combination of metric family name, type, and unit (which, you guessed it, is equivalent to putting type and unit into a label). Alternatively, we could redesign the text format fundamentally and make it really depend on the order everywhere, but then we should do it thoroughly and avoid other repetitions, too (at the very least the metric family name).

Got it, thank you for the detailed explanation! I think we should discuss those options in a clear proposal and make a decision on this in OM 2.0. I will add an issue for this.

Protobuf is a bit different as it is structured, and all the family members are a repeated Metric message inside a MetricFamily message. So keying on more than just the name is relatively easy to accomplish. This is another reason why I think we should "simply" solve the prometheus/prometheus#14668 and then go back to "if you want efficiency, use protobuf – if you want simplicity, use text" (simplicity as in "easy to create in a valid form", which means kicking out many of the requirements OM text introduced on top of the original text format, like order, whitespace, …).

Nice! Proposing as an intention in our OM 2.0 doc

ArthurSens

Some very small comments and questions. Overall LGTM

Thanks Bartek, David and everyone else who worked on this ❤️

proposals/2024-09-25_metadata-labels.md

krajorama

LGTM with one comment on clarifying what remote write client does (or point me to where it's already defined :) )

proposals/2024-09-25_metadata-labels.md

Co-authored-by: Carrie Edwards <[email protected]> Signed-off-by: Arthur Silva Sens <[email protected]>

fionaliao

Overall LGTM, but I had the same question as Arthur around whether user-provided type and unit labels are always removed like in the case of RW 1.0 (#39 (comment)).

beorn7

I'm generally fine with the content. My comments are just about improving wording or fixing some technicalities that do not change the core message of this document.

proposals/2024-09-25_metadata-labels.md

Co-authored-by: Björn Rabenstein <[email protected]> Signed-off-by: Bartlomiej Plotka <[email protected]>

Signed-off-by: bwplotka <[email protected]>

bwplotka · 2025-04-23T10:08:39Z

Addressed all, thanks for an amazing review! @beorn7 @ArthurSens @carrieedwards @fionaliao @krajorama

@fionaliao Overall LGTM, but I had the same question as Arthur around whether user-provided type and unit labels are always removed like in the case of RW 1.0

Yes, I updated one section to make it clear and clarified things in #39 (comment)

ArthurSens

LGTM

proposals/2024-09-25_metadata-labels.md

Co-authored-by: Arthur Silva Sens <[email protected]> Signed-off-by: Bartlomiej Plotka <[email protected]>

proposals/2024-09-25_metadata-labels.md

Signed-off-by: Bartlomiej Plotka <[email protected]>

Mentioned in DMs it's a LGTM once things were addressed, and they are now.

bwplotka · 2025-04-24T17:20:39Z

I think we should be good to go here! 👍🏽

Bartek Plotka 16:06
Just to schedule on your todo list (again): Are we good to merge #39 (review)? (type and unit proposal)

Björn Rabenstein 17:29 @beorn7
I'm not sure when I get to it again. But all my comments were not changing anything in the core, they were just about some peripheral understanding, so if you feel you have addressed my comments, you don't have to wait for me to verify.

Experimental implementation of prometheus/proposals#39 Previous (unmerged) experiments: * main...dashpole:prometheus:type_and_unit_labels * #16025 Signed-off-by: bwplotka <[email protected]>

* feature: type-and-unit-labels (extended MetricIdentity)
  Experimental implementation of prometheus/proposals#39
  Previous (unmerged) experiments:
  * main...dashpole:prometheus:type_and_unit_labels
  * #16025
  Signed-off-by: bwplotka <[email protected]>
* Fix compilation errors
  Signed-off-by: Arthur Silva Sens <[email protected]>
  Lint
  Signed-off-by: Arthur Silva Sens <[email protected]>
  Revert change made to protobuf 'Accept' header
  Signed-off-by: Arthur Silva Sens <[email protected]>
  Fix compilation errors for 'dedupelabels' tag
  Signed-off-by: Arthur Silva Sens <[email protected]>
* Refactored into schema.Metadata
  Signed-off-by: bwplotka <[email protected]>
* textparse: Added tests for PromParse
  Signed-off-by: bwplotka <[email protected]>
* add OM tests.
  Signed-off-by: bwplotka <[email protected]>
* add proto tests
  Signed-off-by: bwplotka <[email protected]>
* Addressed comments.
  Signed-off-by: bwplotka <[email protected]>
* add schema label tests.
  Signed-off-by: bwplotka <[email protected]>
* addressed comments.
  Signed-off-by: bwplotka <[email protected]>
* fix tests.
  Signed-off-by: bwplotka <[email protected]>
* add promql tests.
  Signed-off-by: bwplotka <[email protected]>
* lint
  Signed-off-by: bwplotka <[email protected]>
* Addressed comments.
  Signed-off-by: bwplotka <[email protected]>
---------
Signed-off-by: bwplotka <[email protected]>
Signed-off-by: Arthur Silva Sens <[email protected]>
Co-authored-by: Arthur Silva Sens <[email protected]>

Proposal for type and unit metadata labels

25305b0

Signed-off-by: David Ashpole <[email protected]>

dashpole force-pushed the type_and_unit_labels branch from 8aab401 to 25305b0 on October 24, 2024 15:21

ArthurSens reviewed Oct 28, 2024

address feedback

e56b2be

Signed-off-by: David Ashpole <[email protected]>

dashpole force-pushed the type_and_unit_labels branch from 98a86bd to e56b2be on November 1, 2024 15:35

beorn7 reviewed Nov 7, 2024

dashpole added 2 commits November 12, 2024 20:49

add goal of warning on inappropriate operations

6cfdbd4

Signed-off-by: David Ashpole <[email protected]>

info annotation, rather than warning

de68600

Signed-off-by: David Ashpole <[email protected]>

remove mention of unit in text format

e0d62d3

Signed-off-by: David Ashpole <[email protected]>

alexgreenbank mentioned this pull request Nov 19, 2024

Return UNIT information alongside queries prometheus/prometheus#12294

Open

bwplotka reviewed Nov 25, 2024

dashpole added 2 commits November 25, 2024 21:43

add native histogram motivation, and omit unit alternative

619537d

Signed-off-by: David Ashpole <[email protected]>

rename feature flag to identifying-type-and-unit

a2e5ed9

Signed-off-by: David Ashpole <[email protected]>

dashpole force-pushed the type_and_unit_labels branch from 15b0c44 to a2e5ed9 on November 26, 2024 02:52

dashpole added 4 commits November 26, 2024 03:12

add complex value alternative

75ccddd

Signed-off-by: David Ashpole <[email protected]>

add ability to relabel type and unit

d9e0ad3

Signed-off-by: David Ashpole <[email protected]>

preserve existing behavior for PromQL operations

f1760bd

Signed-off-by: David Ashpole <[email protected]>

add future extension to handle type and unit in promql operations

bce6d9c

Signed-off-by: David Ashpole <[email protected]>

dashpole force-pushed the type_and_unit_labels branch from d4ecde7 to bce6d9c on November 26, 2024 03:46

ArthurSens approved these changes Apr 14, 2025

ArthurSens mentioned this pull request Apr 16, 2025

feat: Support 'NoTranslation' mode in OTLP endpoint prometheus/prometheus#16441

Merged

krajorama approved these changes Apr 17, 2025

Apply suggestions from code review

ddb070a

Co-authored-by: Carrie Edwards <[email protected]> Signed-off-by: Arthur Silva Sens <[email protected]>

fionaliao approved these changes Apr 18, 2025

beorn7 previously requested changes Apr 22, 2025

bwplotka and others added 2 commits April 23, 2025 11:00

Apply suggestions from code review

e418574

Co-authored-by: Björn Rabenstein <[email protected]> Signed-off-by: Bartlomiej Plotka <[email protected]>

Addressing comments.

057dfb8

Signed-off-by: bwplotka <[email protected]>

bwplotka requested review from beorn7, aknuds1, carrieedwards, ArthurSens and fionaliao April 23, 2025 10:08

ArthurSens approved these changes Apr 23, 2025

Update proposals/2024-09-25_metadata-labels.md

8b4b61a

Co-authored-by: Arthur Silva Sens <[email protected]> Signed-off-by: Bartlomiej Plotka <[email protected]>

bwplotka reviewed Apr 24, 2025

Update proposals/2024-09-25_metadata-labels.md

821ce17

Signed-off-by: Bartlomiej Plotka <[email protected]>

bwplotka merged commit 0c31275 into prometheus:main Apr 24, 2025
2 checks passed

ArthurSens mentioned this pull request Apr 28, 2025

Promote prometheusremotewriteexporter to Stable. open-telemetry/opentelemetry-collector-contrib#39706

Open

19 tasks

dashpole deleted the type_and_unit_labels branch May 14, 2025 14:41

ArthurSens mentioned this pull request May 14, 2025

Update Prometheus spec based on new updates in upstream Prometheus open-telemetry/opentelemetry-specification#4494

Open

5 tasks

bwplotka mentioned this pull request May 17, 2025

[meta] PROM-39 type-and-unit-labels stability prometheus/prometheus#16610

Open

6 tasks
