Proposal: Support exponential histograms in the Prometheus exporter #5940
Comments
Hi @fstab - what you propose would definitely simplify the code. A couple of downsides I see:
Overall, I think I favor this solution, despite its downsides. I like the simplification this brings to the prometheus exporter. It allows maintainers of this project to more easily keep it up to date without requiring as much deep expertise on the different prometheus exposition formats. IMO, we can / should optimize the OTLP exporters, and encourage any users with strict memory allocation requirements or prometheus library version requirements to switch to OTLP instead.
Thanks @jack-berg, I'll create a PR.
Hi @fstab - @jack-berg is helping me with a big project I'm working on: integrating OTel into Apache Pulsar (a low-latency messaging/streaming system). As part of this project, I'm in the implementation phase of the proposal I made to OTel to introduce a new setting for latency-sensitive systems called Memory Mode. It has the existing mode (IMMUTABLE) and a new one called REUSABLE. Its main goal is to eliminate any O(# Your suggestion worries me since it lacks the ability to support the memory mode REUSABLE. If we’re using this method in IMMUTABLE memory mode, we’re basically adding another source of O( The REUSABLE memory mode would need a different way of doing that. Aside from allocating data structures on every collection cycle, another big problem I presume is that you have to convert any The way I see it: So I think this proposal collides with the new memory mode, which is in the middle of implementation.
Thanks @asafm. I'm not sure if I understand your project correctly. Is your goal to expose metrics in Prometheus format for scraping? If that's the case, keep in mind that the Prometheus server will request protobuf format when exponential histograms are enabled, so as soon as we add exponential histogram support (as required by the spec) we must add support for protobuf to the Prometheus exporter.
@asafm is trying to reduce allocations throughout the OpenTelemetry Java metrics SDK pipeline. We've made a lot of progress in the SDK itself, and with some additional work in flight, allocations will be very low (near zero). The next area that needs attention would be the exporters themselves. You can imagine that if you have a large number of series, solving allocations in the SDK but not in the exporter only solves half the problem: a memory-inefficient exporter may still cause many allocations while mapping to an internal representation before serialization. I believe this is Asaf's concern with this proposed change to the prometheus exporter. The prometheus exporter implementation should currently be quite efficient from a memory standpoint, since it directly serializes MetricData to the prometheus representation without any (minimal?) additional internal representation. IIUC, this proposal would convert each MetricData to an equivalent representation from @asafm if prometheus support for native OTLP ingest continues to mature / stabilize, surely that diminishes the importance of an optimized prometheus exporter, no?
Thanks a lot! If caching and reusing I don't find it easy to understand what effect object pools have. Garbage collection algorithms mark live objects, not unreferenced objects. So intuitively, the more objects you keep alive, the more work a GC has to do. But I guess this is not so straightforward in practice. Anyway, I'm looking forward to seeing the results of the experiment, and if it works well we could implement something similar upstream in the Prometheus library. In the meantime, if there are no objections, I'm happy to create a PR.
I presume this effort will be a very long one, especially since the entire ecosystem will need to align on it: this can easily take 2 years including the ecosystem (M3DB, Cortex, VictoriaMetrics). As with anything in life, change is better in small iterations. The same should apply to this proposal. If it goes forward, it's not an incremental step, as it has a severe cost to users of the Prometheus exporter. I suggest implementing this proposal after OTLP support has been added and is GA across Prometheus and the major ecosystem players. Again, please take into account the other disadvantage of adding a dependency to the OTel Java SDK, as I wrote above:
@fstab The JMH benchmark shows object pooling reducing allocation rates by 99.98%. Also, production experience with Pulsar shows that with 10k - 100k Attribute sets, the latency it experiences due to GC pauses is grave. If you read the M3DB code, for example, you'll see that they also use object pooling quite extensively. Latency-sensitive systems, such as databases and streaming systems, are very sensitive to memory allocations.
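For readers unfamiliar with the pattern, here is a minimal sketch of object reuse across collection cycles. It is not the OTel SDK's REUSABLE implementation; `ReusablePointsSketch` and `MutablePoint` are hypothetical names introduced only for illustration. The idea is that point objects are allocated once and overwritten on every cycle, so steady-state collection allocates nothing new.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pool that reuses mutable point objects between collection cycles. */
final class ReusablePointsSketch {

    /** Hypothetical mutable data point; a real SDK type would carry more fields. */
    static final class MutablePoint {
        String series;
        double value;
    }

    private final List<MutablePoint> pool = new ArrayList<>();
    private int used;

    /** Starts a new collection cycle; all previously handed-out points become reusable. */
    void reset() {
        used = 0;
    }

    /** Returns a pooled point, allocating a new one only when the pool is exhausted. */
    MutablePoint acquire(String series, double value) {
        if (used == pool.size()) {
            pool.add(new MutablePoint());
        }
        MutablePoint point = pool.get(used++);
        point.series = series;
        point.value = value;
        return point;
    }

    /** Number of point objects allocated so far, across all cycles. */
    int allocatedCount() {
        return pool.size();
    }
}
```

The trade-off is that callers must not hold on to a point after the next `reset()`, because its contents will be overwritten.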
PR here: #6015
@fstab @jack-berg As mentioned before, this PR will kill the ability to continue with REUSABLE_DATA, which is a feature currently under development. From my perspective, merging it is a big problem.
Hi @asafm, I'm not really sure what you are proposing. Do you have a better way of implementing support for exponential histograms in the Prometheus endpoint, or are you proposing not to implement support for exponential histograms at all?
@fstab Just implement it within the existing code like all the other instrument types. No external library. It's not such a big effort.
I don't think it is possible to implement exponential histograms like all the other instrument types. Exponential histograms are fundamentally different; that's why the Prometheus community decided to introduce a new exposition format for them. I think it will be a big effort, and I believe re-implementing this in the OpenTelemetry SDK rather than using the upstream implementation will introduce a lot of additional maintenance effort but not solve any issue.
I disagree and answered here.
We can continue the thread on the PR
Resolved in #6015. |
Problem
The Prometheus and OpenMetrics Compatibility Spec says:
I understand this is currently not implemented in the Prometheus exporter in the OTel Java SDK: Exponential histograms are dropped.
I would like to add OpenTelemetry Exponential Histogram to Prometheus Native Histogram conversion. As Prometheus Native histograms can only be exposed in Prometheus Protobuf format, I would also add support for Prometheus protobuf as an additional exposition format.
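As a rough illustration of the arithmetic such a conversion relies on, here is a small sketch. It assumes the OpenTelemetry definition of exponential buckets (base = 2^(2^-scale), with bucket `index` covering (base^index, base^(index+1)]) and assumes that Prometheus native histograms accept a schema of at most 8, so higher scales would need downscaling first. The class and method names are hypothetical and are not taken from either SDK.

```java
/** Hypothetical helper showing exponential-bucket arithmetic; not an SDK API. */
final class ExponentialBucketSketch {

    /** Assumed upper limit for the Prometheus native histogram schema. */
    static final int MAX_SCHEMA = 8;

    /** Growth factor between consecutive bucket boundaries: 2^(2^-scale). */
    static double base(int scale) {
        return Math.pow(2.0, Math.pow(2.0, -scale));
    }

    /** Upper bound of the positive bucket at {@code index}: base^(index + 1). */
    static double upperBound(int scale, int index) {
        return Math.pow(2.0, (index + 1) * Math.pow(2.0, -scale));
    }

    /** How many scale steps must be dropped to fit the assumed schema limit. */
    static int scaleReduction(int scale) {
        return Math.max(0, scale - MAX_SCHEMA);
    }

    /**
     * Bucket index after lowering the scale by {@code steps}. Each step merges
     * neighbouring bucket pairs, so the index is floor-divided by 2 per step;
     * the arithmetic shift also floors negative indexes correctly.
     */
    static int downscaleIndex(int index, int steps) {
        return index >> steps;
    }
}
```

For example, at scale 0 the bucket with index 0 covers (1, 2], and a histogram recorded at scale 10 would need two downscaling steps under the assumed limit.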
However, before I start working on it, I'm creating this issue. Please let me know what you think.
Proposed Solution
I'm one of the maintainers of the Prometheus Java client library, and we recently released version 1.0.0. The Prometheus Java client library is owned by the Prometheus team under CNCF governance.
The new architecture of the 1.0.0 release separates the implementation of the Prometheus data model and the Prometheus metrics library, as described on https://prometheus.github.io/client_java/internals/model/:
prometheus-metrics-core is the metrics library. There's no need to add a dependency to this module to the Prometheus SDK.
prometheus-metrics-model are read-only immutable Prometheus metric snapshots produced during scraping.
exposition formats convert the snapshots into different exposition formats, like Prometheus text format, OpenMetrics text format, or Prometheus protobuf format.
My proposal is to refactor OpenTelemetry Java's Prometheus exporter to produce Snapshots as defined in prometheus-metrics-model, and to use Prometheus exposition format modules to convert these snapshots to Prometheus Text format, OpenMetrics format, and Prometheus Protobuf format depending on the Accept header in the scrape request.
I'm happy to maintain the Prometheus exporter long term if you need a maintainer.
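To make the Accept-header-based format selection concrete, here is a minimal sketch. It is deliberately simplified: it only checks which media types appear in the header, prefers protobuf whenever it is offered, and ignores q-values. `ExpositionFormatSketch` and its members are hypothetical names, not the Prometheus client library's API, which would be the thing to use in practice.

```java
import java.util.Locale;

/** Hypothetical, simplified content negotiation for a scrape request. */
final class ExpositionFormatSketch {

    /** Candidate formats in order of preference (an assumption of this sketch). */
    enum Format {
        PROMETHEUS_PROTOBUF("application/vnd.google.protobuf"),
        OPENMETRICS_TEXT("application/openmetrics-text"),
        PROMETHEUS_TEXT("text/plain");

        final String mediaType;

        Format(String mediaType) {
            this.mediaType = mediaType;
        }
    }

    /**
     * Picks the first format whose media type occurs in the Accept header.
     * Falls back to Prometheus text format when the header is missing or
     * names none of the known media types.
     */
    static Format select(String acceptHeader) {
        if (acceptHeader == null) {
            return Format.PROMETHEUS_TEXT;
        }
        String accept = acceptHeader.toLowerCase(Locale.ROOT);
        for (Format format : Format.values()) {
            if (accept.contains(format.mediaType)) {
                return format;
            }
        }
        return Format.PROMETHEUS_TEXT;
    }
}
```

A production implementation would also honor q-values and check the protobuf parameters (such as the message type and encoding) before choosing that format.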
The immediate benefit is support for Native histograms.
However, there's also a long-term benefit. Prometheus exposition formats are currently in flux: there is a decision to add a new OpenMetrics format, but it isn't specified yet. If the OpenTelemetry Java SDK uses the exposition format modules from the Prometheus Java client library, adding support for future Prometheus exposition formats will be a trivial change, because you can just use the upstream dependency from the Prometheus project.
Let me know what you think. I'm happy to create a PR if you think this is a good idea.