UTF-8: Implement support in Ruby client library #306

Open
4 tasks
Tracked by #13095
ywwg opened this issue Mar 5, 2024 · 10 comments

Comments

@ywwg
Member

ywwg commented Mar 5, 2024

As in prometheus/client_golang#1369 and prometheus/client_java#916, the Ruby client library needs to be updated to support UTF-8.

Tasks:

  • Add flag to enable change in validation logic to check that metric and label names are UTF-8 valid instead of the old letters/numbers/underscores/colons set
  • Update exposition format parsers for the new syntax
  • Update PromQL parsers (if any) for the new syntax
  • Update content negotiation logic

For background and references see prometheus/prometheus#13095
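The first task can be pictured as follows. This is only a minimal sketch: the module name, method name, and the `utf8:` keyword are hypothetical stand-ins for whatever flag the library ends up exposing, not its actual API.

```ruby
# Hypothetical sketch; none of these names come from the client library.
module NameValidation
  # Legacy rule: letters, digits, underscores and colons,
  # and the first character must not be a digit.
  LEGACY_METRIC_NAME = /\A[a-zA-Z_:][a-zA-Z0-9_:]*\z/

  # utf8: stands in for the opt-in flag from the task list.
  def self.valid_metric_name?(name, utf8: false)
    name = name.to_s
    if utf8
      # New rule: any non-empty string whose bytes are valid UTF-8.
      !name.empty? && name.dup.force_encoding(Encoding::UTF_8).valid_encoding?
    else
      LEGACY_METRIC_NAME.match?(name)
    end
  end
end

NameValidation.valid_metric_name?("http.requests.total")             # legacy rule
NameValidation.valid_metric_name?("http.requests.total", utf8: true) # UTF-8 rule
```

With the flag off, a dotted name is rejected as before; with it on, the same name passes because only UTF-8 validity is checked.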

@Sinjo
Member

Sinjo commented Mar 5, 2024

Hey, thanks for the heads up. I've got a rough idea in my head of how those changes will fit into the codebase.

I've given the proposal a read and one thing that stuck out to me is that grouping key labels in the pushgateway client's code aren't mentioned. Are they sticking to the old rules for now?

@beorn7
Member

beorn7 commented Mar 5, 2024

Very good point. We haven't thought about PGW yet.
That needs more code changes, and some more ideas. sigh

@beorn7
Member

beorn7 commented Mar 5, 2024

For now, I would assume the PGW doesn't support the new UTF-8 names yet.

@ywwg
Member Author

ywwg commented Mar 5, 2024

We are making a note to create a design for this situation

@Sinjo
Member

Sinjo commented Mar 5, 2024

> That needs more code changes, and some more ideas. sigh

Happy to be of service 🙈

@ywwg
Member Author

ywwg commented Sep 17, 2024

@Sinjo are you still interested in working on this? We have done work on the pushgateway so that design is resolved.

@Sinjo
Member

Sinjo commented Sep 17, 2024

Yeah, I'd like to make some time for it. It doesn't seem like it'll be too hard to bake in at least some experimental support for it.

The biggest potential snag when it comes to our in-process storage of metrics is going to be the way DirectFileStore serialises the metrics to disk to enable multi-process web servers to have all their processes' metrics scraped. I'll need to make sure that code handles UTF-8 label names.
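To illustrate the kind of round trip that needs checking, here is a small sketch of one UTF-8-safe way to flatten a label set into an ASCII key and read it back. The helper names are hypothetical, and the form-encoded key format is an assumption made for the example; it is not presented as what the store does on disk.

```ruby
require "uri"

# Hypothetical helpers: an illustration of a UTF-8-safe key format,
# not the code the data store actually uses.
def encode_label_set(labels)
  # Sort for a stable key, then percent-encode names and values.
  URI.encode_www_form(labels.map { |name, value| [name.to_s, value.to_s] }.sort)
end

def decode_label_set(key)
  # decode_www_form returns [name, value] pairs decoded as UTF-8 by default.
  URI.decode_www_form(key).to_h
end

labels = { "http.status" => "200", "région" => "île-de-france" }
key = encode_label_set(labels)      # contains only ASCII characters
restored = decode_label_set(key)    # label names and values come back as UTF-8
```

The point of the check is that `restored` equals `labels`, so non-ASCII label names survive being written out and read back.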

One thing I haven't managed to piece together is how the content negotiation works. Right now we exclusively serve text/plain; version=0.0.4 format with no conditional behaviours.

Having had a quick glance at prometheus/common#570, it seems that looking for an extra escaping=allow-utf-8 value in the Accept header is enough to enable the new path with unescaped UTF-8, and that it's also possible to pass underscores, dots, or values to that parameter to use one of those escaping schemes. What isn't clear to me is what to do if the registry contains labels with UTF-8 characters, but no escaping parameter is passed. Is there a default escaping scheme that's been chosen for compatibility with existing Prometheus servers?
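A rough sketch of reading that parameter is below. The method name and the list of accepted values are assumptions based only on the schemes named in this comment; the header parsing is deliberately naive (it does not handle commas inside quoted strings) and is not taken from any existing implementation.

```ruby
# Hypothetical helper: pulls the escaping scheme out of an Accept header.
# The parameter name and the four values come from the discussion above;
# everything else is illustrative.
KNOWN_SCHEMES = %w[allow-utf-8 underscores dots values].freeze

def requested_escaping(accept_header)
  accept_header.to_s.split(",").each do |media_range|
    # Skip the media type itself and look only at its parameters.
    media_range.split(";").drop(1).each do |param|
      key, value = param.strip.split("=", 2)
      next unless key&.strip&.casecmp?("escaping")

      scheme = value.to_s.strip.delete('"')
      return scheme if KNOWN_SCHEMES.include?(scheme)
    end
  end
  nil # no recognised escaping parameter; the caller decides what to do
end

requested_escaping("text/plain; version=0.0.4; escaping=allow-utf-8")
requested_escaping("text/plain; version=0.0.4")
```

Returning `nil` when the parameter is absent keeps the "no escaping parameter" case visible to the caller rather than hiding it behind a silent choice.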

My gut says to stick all of this behind a config flag that makes it clear that all of it is experimental and could change in any minor/patch version. Does that sound right to you?

@ywwg
Member Author

ywwg commented Sep 17, 2024

If no escaping parameter is passed, the metrics producer is expected to apply a default escaping scheme. In Go, that is determined by the NameEscapingScheme setting in common/model. That is Underscores by default but a user could change it with a configuration value or flag. (I think we haven't implemented this in Go, actually, so I'll file an issue for that :)). If the metrics producer doesn't escape names, then the metrics will simply be rejected by the metrics consumer as having invalid metric or label names.
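As a rough picture of what an underscores-style default could look like on the Ruby side, here is a minimal sketch. The method name is hypothetical, and the logic only approximates the idea described above (replace every character that the old rules would reject with an underscore); it is not a port of the Go code.

```ruby
# Hypothetical sketch of an "underscores" fallback; an approximation of the
# idea, not a port of the Go implementation.
def escape_underscores(name)
  name.to_s.each_char.with_index.map do |char, index|
    # Legacy-legal characters: letters, underscore, colon anywhere;
    # digits anywhere except the first position.
    legal = char.match?(/[a-zA-Z_:]/) || (index.positive? && char.match?(/[0-9]/))
    legal ? char : "_"
  end.join
end

escape_underscores("http.requests.total") # => "http_requests_total"
escape_underscores("2xx:rate")            # => "_xx:rate"
```

Names that already satisfy the old rules pass through unchanged, so applying this by default should not alter existing metrics.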

UTF-8 will be default-on in Prometheus 3 (and is on by default in the beta) and we should aim for that in all the client libraries. But yes it's acceptable to have this behind a flag for now.

@Sinjo
Member

Sinjo commented Sep 17, 2024

Cool, that makes sense to me. Thanks!

@ywwg
Member Author

ywwg commented Sep 17, 2024

Thank you for contributing! I know only enough Ruby to be dangerous, so I appreciate the help.
