Reduce memory for active series custom trackers #5853

zenador · 2023-08-29T08:09:36Z

What this PR does

Reduce memory used by active series custom trackers by using maps (sparse) instead of slices. Aim is to at least reduce it for native histograms, but experimenting with applying the same change to normal active series as well, benchmark results below.

Which issue(s) this PR fixes or relates to

Fixes #5812

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

pracucci · 2023-08-29T08:19:12Z

pkg/ingester/activeseries/active_series.go

-	activeNativeHistogramBuckets         uint32   // Number of buckets in active native histogram entries in this stripe. Only decreased during purge or clear.
-	activeMatchingNativeHistogramBuckets []uint32 // Number of buckets in active native histogram entries in this stripe matching each matcher of the configured Matchers.
+	active                               uint32            // Number of active entries in this stripe. Only decreased during purge or clear.
+	activeMatching                       map[uint16]uint32 // Number of active entries in this stripe matching each matcher of the configured Matchers.


In the past we didn't use a map because it would be significantly slower to access than slices, if there are few elements. I'm very curious to see the benchmarks for different scenarios (no custom trackers, few custom trackers, many custom trackers).

Actually in the past we did use a map, and then we switched to a slice. The reason behind this is that usually a series only matches between zero and one tracker (the trackers are usually the integration where they come from) with a few more in some use cases, but never many.

Edit: oh, I see we're not talking about specific series here but about the stripes, in which case you're right, an array is much faster to access, while the gain is probably not that big. A customer with 100 custom matchers would use 4 * 100 * 128 = 50Kb for all stripes.

The motivation for this is #5318 (comment) and https://raintank-corp.slack.com/archives/C04AT2ANL90/p1690989928160289

Before @bboreham 's recent change in #5665 it was taking up a lot of memory, and even afterwards we are still tripling the amount needed as we are tracking native histogram series and buckets as well. Discussed with @krajorama and we do need to track those, and it makes sense to use the same set of custom matchers for float series and native histogram series, so we are exploring other ways to minimise the memory usage especially in cases where native histograms are not being used. So we could just implement the change for native histograms and leave the float active series tracking alone, but we thought we might as well explore the possibility of making them consistent.

Benchmark results:
compare_BenchmarkActiveSeries_UpdateSeries$.txt
compare_BenchmarkMatchesSeries.txt

Note in your benchmarks that UpdateSeries is 5% slower in meaningful cases (when there are series, with or without matchers) while there's no memory improvement for the meaningful cases (there's only improvement when there are no series, saving 150Ki, same order of magnitude that I made above, but when there are 100K series the saving is 0.01% (because it's constant, doesn't depend on the series amount). There's also an increase in allocations, which will slow down the process more (as it generates more garbage to collect).

I wouldn't trade a 5% speed (which stretches with number of series) for 150Ki of constant memory.

pracucci · 2023-10-10T14:38:48Z

Closing for now due to inactivity, but feel free to re-open it anytime if you plan to get back to this. Thanks!.

zenador requested a review from a team as a code owner August 29, 2023 08:09

zenador changed the title ~~educe mem native histogram custom trackers~~ Reduce mem native histogram custom trackers Aug 29, 2023

zenador marked this pull request as draft August 29, 2023 08:10

zenador changed the title ~~Reduce mem native histogram custom trackers~~ Reduce memory for native histogram active series/buckets custom trackers Aug 29, 2023

zenador changed the title ~~Reduce memory for native histogram active series/buckets custom trackers~~ Reduce memory for active series custom trackers Aug 29, 2023

pracucci reviewed Aug 29, 2023

View reviewed changes

zenador added 3 commits August 30, 2023 01:43

Improve BenchmarkMatchesSeries

dc01c92

Change native histogram custom trackers into maps

fa984d2

Change float series custom trackers into maps as well

278add9

zenador force-pushed the zenador/reduce-mem-native-histogram-custom-trackers branch from 2a852bb to 278add9 Compare August 29, 2023 17:46

pracucci closed this Oct 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce memory for active series custom trackers #5853

Reduce memory for active series custom trackers #5853

zenador commented Aug 29, 2023 •

edited

Loading

pracucci Aug 29, 2023

colega Aug 29, 2023 •

edited

Loading

zenador Aug 29, 2023

colega Aug 31, 2023 •

edited

Loading

pracucci commented Oct 10, 2023

Reduce memory for active series custom trackers #5853

Reduce memory for active series custom trackers #5853

Conversation

zenador commented Aug 29, 2023 • edited Loading

What this PR does

Which issue(s) this PR fixes or relates to

Checklist

pracucci Aug 29, 2023

Choose a reason for hiding this comment

colega Aug 29, 2023 • edited Loading

Choose a reason for hiding this comment

zenador Aug 29, 2023

Choose a reason for hiding this comment

colega Aug 31, 2023 • edited Loading

Choose a reason for hiding this comment

pracucci commented Oct 10, 2023

zenador commented Aug 29, 2023 •

edited

Loading

colega Aug 29, 2023 •

edited

Loading

colega Aug 31, 2023 •

edited

Loading