feat: implement HistogramFold plan for prometheus histogram type #2626

Merged
merged 5 commits into from
Oct 20, 2023

@waynexia waynexia commented Oct 19, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

This is part one of Prometheus histogram support. The next part will cover the planner and quantile computation.

HistogramFold

HistogramFold folds a conventional (non-native) Prometheus histogram for later
computation. Specifically, it transforms the le and field columns into a complex
type and samples the other tag columns:

  • le becomes a [ListArray] of [f64], with each bucket bound parsed into a number
  • field becomes a [ListArray] of [f64]
  • other columns are sampled once every bucket_num elements, but their types don't change.

Because of the folding and sampling, the number of output rows becomes input_rows / bucket_num.
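The folding step can be sketched with plain vectors. This is only an illustration, not the code in this PR: the real plan works on Arrow arrays, while the `Row`, `FoldedRow`, `parse_le` and `fold` names below are hypothetical. The sketch also assumes that the bucket_num rows that make up one output row are contiguous in the input and that the largest bound is written as `+Inf`.

```rust
/// One input row of a conventional histogram (hypothetical layout):
/// a single tag column, the `le` bucket bound as text, and the field value.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub host: String,
    pub le: String,
    pub val: f64,
}

/// One folded output row: `le` and `val` become lists, the tag is sampled.
#[derive(Debug, Clone, PartialEq)]
pub struct FoldedRow {
    pub host: String,
    pub le: Vec<f64>,
    pub val: Vec<f64>,
}

/// Parse a bucket bound. "+Inf" is handled explicitly (assumed spelling).
pub fn parse_le(le: &str) -> Option<f64> {
    match le.trim() {
        "+Inf" => Some(f64::INFINITY),
        other => other.parse::<f64>().ok(),
    }
}

/// Fold every `bucket_num` consecutive rows into one output row.
/// Returns `None` if the input cannot be split evenly or a bound fails to parse.
pub fn fold(rows: &[Row], bucket_num: usize) -> Option<Vec<FoldedRow>> {
    if bucket_num == 0 || rows.len() % bucket_num != 0 {
        return None;
    }
    let mut out = Vec::with_capacity(rows.len() / bucket_num);
    for chunk in rows.chunks(bucket_num) {
        // `le` and the field column are collected into lists.
        let le = chunk
            .iter()
            .map(|r| parse_le(&r.le))
            .collect::<Option<Vec<f64>>>()?;
        let val = chunk.iter().map(|r| r.val).collect();
        // Other columns are sampled: keep the value from the first row of the group.
        out.push(FoldedRow {
            host: chunk[0].host.clone(),
            le,
            val,
        });
    }
    Some(out)
}
```

Calling `fold(&rows, 5)` on ten input rows would yield two output rows, which matches the input_rows / bucket_num relation described above.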

le should be the same across the entire metric, so it could be stored as a Dict<List<f64>> to reduce size. However, this is not viable at present because I cannot construct such a dict array 🥲

Requirement

  • Input should be sorted on <tag list>, le ASC, ts.
  • The value set of le should be the same, i.e., every series should have the same buckets.
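The second requirement can be checked before folding. The helper below is a hypothetical sketch and is not part of this PR. It assumes the le column has already been parsed into f64 values and that each group of bucket_num consecutive values belongs to one output row.

```rust
/// Returns true when every group of `bucket_num` consecutive bounds is
/// strictly ascending and identical to the first group.
/// (Hypothetical helper, shown only to illustrate the requirement.)
pub fn buckets_are_uniform(le: &[f64], bucket_num: usize) -> bool {
    if bucket_num == 0 || le.len() % bucket_num != 0 {
        return false;
    }
    let mut groups = le.chunks(bucket_num);
    let first = match groups.next() {
        Some(group) => group,
        // An empty input trivially satisfies the requirement.
        None => return true,
    };
    // Bounds inside a group must ascend, matching the `le ASC` ordering.
    let ascending = first.windows(2).all(|w| w[0] < w[1]);
    // Every remaining group must carry exactly the same bounds.
    ascending && groups.all(|group| group == first)
}
```

A check like this rejects input where one series reports a bucket the others lack, which would otherwise shift every later group by one row.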

Result

Its output looks like:

+--------+---------------------------------+--------------------------------+
| host   | le                              | val                            |
+--------+---------------------------------+--------------------------------+
| host_1 | [0.001, 0.1, 10.0, 1000.0, inf] | [0.0, 1.0, 1.0, 5.0, 5.0]      |
| host_1 | [0.001, 0.1, 10.0, 1000.0, inf] | [0.0, 20.0, 60.0, 70.0, 100.0] |
| host_1 | [0.001, 0.1, 10.0, 1000.0, inf] | [1.0, 1.0, 1.0, 1.0, 1.0]      |
| host_2 | [0.001, 0.1, 10.0, 1000.0, inf] | [0.0, 0.0, 0.0, 0.0, 0.0]      |
| host_2 | [0.001, 0.1, 10.0, 1000.0, inf] | [0.0, 1.0, 2.0, 3.0, 4.0]      |
+--------+---------------------------------+--------------------------------+

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

Signed-off-by: Ruihang Xia <[email protected]>
codecov bot commented Oct 19, 2023

Codecov Report

Merging #2626 (7554dc5) into develop (f859932) will decrease coverage by 0.56%.
Report is 21 commits behind head on develop.
The diff coverage is 60.89%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2626      +/-   ##
===========================================
- Coverage    85.34%   84.79%   -0.56%     
===========================================
  Files          737      740       +3     
  Lines       118012   119840    +1828     
===========================================
+ Hits        100722   101617     +895     
- Misses       17290    18223     +933     

Signed-off-by: Ruihang Xia <[email protected]>
@zhongzc zhongzc left a comment

LGTM

@zhongzc zhongzc added this pull request to the merge queue Oct 20, 2023
Merged via the queue into GreptimeTeam:develop with commit 212ea2c Oct 20, 2023
12 checks passed
@waynexia waynexia deleted the histogram branch October 20, 2023 08:01

3 participants