
storage: book keep dirty_ratio in disk_log_impl #24649

Open · wants to merge 5 commits into dev from dirty_ratio_compaction
Conversation

@WillemKauf (Contributor) commented Dec 23, 2024

The dirty ratio of a log is defined as the ratio between the number of bytes in "dirty" segments and the total number of bytes in closed segments.

Dirty segments are closed segments which have not yet been cleanly compacted, i.e., duplicates for keys in such a segment could still be found in the prefix of the log up to that segment.

Add book-keeping to disk_log_impl to cache both _dirty_segment_bytes and _closed_segment_bytes, which allows the dirty ratio to be calculated, and add observability for it in storage::probe.

In the future, this could be used in combination with a compaction configuration a la min.cleanable.dirty.ratio to schedule compaction.
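
For reference, the ratio itself is just a quotient of those two cached byte counts; a minimal sketch (the counter names come from this description, the free function itself is illustrative):

```cpp
#include <cstdint>

// Sketch only: the dirty ratio as defined above, i.e. bytes in dirty
// segments over total bytes in closed segments. Names are illustrative.
double dirty_ratio(uint64_t dirty_segment_bytes, uint64_t closed_segment_bytes) {
    if (closed_segment_bytes == 0) {
        return 0.0; // no closed segments yet, so nothing is dirty
    }
    return static_cast<double>(dirty_segment_bytes)
           / static_cast<double>(closed_segment_bytes);
}
```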

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

Improvements

  • Adds the observable metrics dirty_segment_bytes and closed_segment_bytes to the storage layer.

@vbotbuildovich (Collaborator) commented Dec 24, 2024

CI test results

test results on build#60092

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| gtest_raft_rpunit.gtest_raft_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/60092#0193f576-772e-4d29-8e01-3035f1e9661c | FLAKY | 1/2 |

test results on build#60241

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.transactions.tx_atomic_produce_consume_test.TxAtomicProduceConsumeTest.test_basic_tx_consumer_transform_produce.with_failures=True | ducktape | https://buildkite.com/redpanda/redpanda/builds/60241#019429a2-9e8f-413c-87cf-33a7c04d8757 | FAIL | 0/1 |

test results on build#60591

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rm_stm_tests_rpunit.rm_stm_tests_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/60591#01945072-b1aa-48fd-bf2e-574626a0b62c | FLAKY | 1/2 |
| rptest.tests.partition_reassignments_test.PartitionReassignmentsTest.test_reassignments_kafka_cli | ducktape | https://buildkite.com/redpanda/redpanda/builds/60591#019450bf-4a4b-4d89-97f6-141851e1303f | FLAKY | 4/6 |

@vbotbuildovich (Collaborator)

Retry command for Build#60241

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/transactions/tx_atomic_produce_consume_test.py::TxAtomicProduceConsumeTest.test_basic_tx_consumer_transform_produce@{"with_failures":true}

@dotnwat (Member) previously approved these changes Jan 4, 2025 and left a comment

This is awesome.

  1. Do we already have a metric that measures the actual size reduction achieved by compaction for a partition? What I'm thinking about is future scheduling: if the partition with the highest dirty ratio has all unique keys, then compaction won't help at all, and it might be more useful to compact some other partition with lots of duplicates first. Using the compaction ratio achieved in the last round might be a good enough proxy (see the hypothetical sketch after this list).

  2. The other thing I was going to ask: there are many places where compacted bytes and dirty bytes are updated. Is this error-prone compared to running a for-loop that computes the metrics on demand in the probe handler? Is that loop over the segments slow enough that we should avoid it, as this PR does?
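
To make the scheduling idea in point 1 concrete, here is a hypothetical ranking sketch (illustrative only, not part of this PR; all names are made up):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical scheduling sketch, not part of this PR. A partition with a
// high dirty ratio but all-unique keys (last_compaction_ratio near 1.0)
// would rank below one with many duplicate keys.
struct partition_compaction_stats {
    double dirty_ratio;           // dirty bytes / closed bytes
    double last_compaction_ratio; // bytes_after / bytes_before from the last round
};

// Estimated fraction of closed bytes we would expect to reclaim.
double compaction_benefit(const partition_compaction_stats& s) {
    return s.dirty_ratio * (1.0 - s.last_compaction_ratio);
}

// Order partitions so the biggest expected payoff is compacted first.
void sort_by_benefit(std::vector<partition_compaction_stats>& parts) {
    std::sort(parts.begin(), parts.end(), [](const auto& a, const auto& b) {
        return compaction_benefit(a) > compaction_benefit(b);
    });
}
```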

@WillemKauf (Contributor Author)

do we already have metric that measures the actual size reduction achieved by compaction for a partition

We do already bookkeep the compaction ratio in the log. You raise a good point that the dirty ratio on its own may not be the best heuristic for scheduling (given that key cardinality isn't considered).

My question would then be: do we want parity with Kafka here, or do we want to try something new with the compaction ratio as a scheduling heuristic?

the other thing i was going to ask is: there are many places where compacted bytes and dirty bytes are updated. is this error prone compared to running a for-loop to compute the metrics on demand in the probe handler

Is it error-prone? I don't think so; as far as bytes removed/added go, I think I've covered all the code locations where this can happen in disk_log_impl (segment removal, bytes removed by compaction, segment rolls). I do share your concern that if a byte-affecting location were missed, these metrics would begin to drift from their true values.

So I would agree with you that an on-demand for-loop over the segments would be the most verifiably accurate and up-to-date way of reporting these metrics.
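
For comparison, the on-demand approach would look roughly like the sketch below; `segment` here is a simplified stand-in for the real storage types, so this shows the shape of the loop rather than the PR's code:

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for storage segments; the real types are richer.
struct segment {
    uint64_t size_bytes;
    bool closed;            // no longer the active (appending) segment
    bool cleanly_compacted; // already de-duplicated against the log prefix
};

struct dirty_ratio_snapshot {
    uint64_t dirty_bytes = 0;
    uint64_t closed_bytes = 0;
};

// Recompute both counters on demand (e.g. from the probe handler) instead of
// maintaining them incrementally at every byte-affecting call site.
dirty_ratio_snapshot compute_snapshot(const std::vector<segment>& segments) {
    dirty_ratio_snapshot snap;
    for (const auto& s : segments) {
        if (!s.closed) {
            continue;
        }
        snap.closed_bytes += s.size_bytes;
        if (!s.cleanly_compacted) {
            snap.dirty_bytes += s.size_bytes;
        }
    }
    return snap;
}
```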

@dotnwat (Member) commented Jan 5, 2025

My question would then be: do we want parity with Kafka here, or do we want to try something new with the compaction ratio as a scheduling heuristic?

I think the dirty ratio still makes sense to track, and like you mention, we can figure out scheduling later. My instinct is that the dirty ratio is probably good enough, but if the scheduling goal is simply to process the biggest-bang-for-the-buck partition first, then taking the estimated actual compaction ratio into account (in combination with the dirty ratio) would seem to make sense. Maybe it doesn't make a big difference on real-world data.

I do share your concern that if a byte-affecting location was missed that these metrics would begin to drift from their true values.

Yeah. It's probably not something we need to be too concerned about, since there really aren't many cases where we need to update the probe. We should be able to catch new cases in review.

Comment on lines 3892 to 3942
template<typename tag>
void disk_log_impl::update_dirty_segment_bytes(uint64_t bytes) {
apply_tagged_operation<tag>(_dirty_segment_bytes, bytes);
_probe->set_dirty_segment_bytes(_dirty_segment_bytes);
}

template<typename tag>
void disk_log_impl::update_closed_segment_bytes(uint64_t bytes) {
apply_tagged_operation<tag>(_closed_segment_bytes, bytes);
_probe->set_closed_segment_bytes(_closed_segment_bytes);
}
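
(`apply_tagged_operation` and its tags are not part of the quoted lines; a plausible reconstruction, inferred from the `add_tag{}`/`subtract_tag{}` names mentioned later in this thread, is sketched below. It is illustrative, not the PR's actual code.)

```cpp
#include <algorithm>
#include <cstdint>
#include <type_traits>

// Illustrative reconstruction of the tag-dispatch helper; clamping on
// subtraction is a guess, mirroring the boilerplate suggested below.
struct add_tag {};
struct subtract_tag {};

template<typename tag>
void apply_tagged_operation(uint64_t& counter, uint64_t bytes) {
    if constexpr (std::is_same_v<tag, add_tag>) {
        counter += bytes;
    } else {
        static_assert(std::is_same_v<tag, subtract_tag>);
        counter -= std::min(bytes, counter);
    }
}
```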
Contributor

This seems overly complicated for something that could be:

void disk_log_impl::add_dirty_segment_bytes(uint64_t bytes) {
    _dirty_segment_bytes += bytes;
    _probe->set_dirty_segment_bytes(_dirty_segment_bytes);
}
void disk_log_impl::add_closed_segment_bytes(uint64_t bytes) {
    _closed_segment_bytes += bytes;
    _probe->set_closed_segment_bytes(_closed_segment_bytes);
}
void disk_log_impl::subtract_dirty_segment_bytes(uint64_t bytes) {
    _dirty_segment_bytes -= std::min(bytes, _dirty_segment_bytes);
    _probe->set_dirty_segment_bytes(_dirty_segment_bytes);
}
void disk_log_impl::subtract_closed_segment_bytes(uint64_t bytes) {
    _closed_segment_bytes -= std::min(bytes, _closed_segment_bytes);
    _probe->set_closed_segment_bytes(_closed_segment_bytes);
}

Contributor Author

Yeah, it could be split into these 4 functions, but this is the very boilerplate I was seeking to avoid in the current solution.

I could see an argument for the simple boilerplate over the complexity of the template solution, though I'm of the opinion it is not that much of a complexity add.

If you much prefer the boilerplate, I'd be happy to change it.

Contributor

My preference is boilerplate over templates when it's really simple to do so. Maybe I'd feel differently if there were more than two operations (add/subtract).

Contributor Author

Removed the template solution in favor of the boilerplate functions.

src/v/storage/disk_log_impl.cc (outdated review thread, resolved)
tests/rptest/tests/log_compaction_test.py (outdated review thread, resolved)
@WillemKauf (Contributor Author)

Force push to:

  • Fix accidentally moved @skip_debug_mode decorator in log_compaction_test.py

Adds the book-keeping variables `_dirty/closed_segment_bytes` to
`disk_log_impl`, as well as some getter/setter functions.

These functions will be used throughout `disk_log_impl` where required
(segment rolling, compaction, segment eviction) to track the bytes
contained in dirty and closed segments.

Uses the added functions `update_dirty/closed_segment_bytes()`
in the required locations within `disk_log_impl` in order
to bookkeep the dirty ratio.

Bytes can be either removed or added by rolling new segments,
compaction, and retention enforcement.
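
As a rough illustration of the update sites those commit messages describe (hypothetical and condensed; not the PR's actual code):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical placement sketch: a stripped-down bookkeeper showing at which
// events the dirty/closed byte counters are touched.
struct dirty_ratio_bookkeeper {
    uint64_t dirty_segment_bytes = 0;
    uint64_t closed_segment_bytes = 0;

    // Segment roll: the previously active segment becomes a closed, dirty one.
    void on_segment_rolled(uint64_t segment_bytes) {
        closed_segment_bytes += segment_bytes;
        dirty_segment_bytes += segment_bytes;
    }

    // Compaction: closed bytes shrink, and the cleaned range is no longer dirty.
    void on_compaction(uint64_t bytes_removed, uint64_t bytes_cleaned) {
        closed_segment_bytes -= std::min(bytes_removed, closed_segment_bytes);
        dirty_segment_bytes -= std::min(bytes_cleaned, dirty_segment_bytes);
    }

    // Retention/eviction: evicted segments leave both totals.
    void on_segments_evicted(uint64_t closed_bytes, uint64_t dirty_bytes) {
        closed_segment_bytes -= std::min(closed_bytes, closed_segment_bytes);
        dirty_segment_bytes -= std::min(dirty_bytes, dirty_segment_bytes);
    }
};
```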
@WillemKauf WillemKauf force-pushed the dirty_ratio_compaction branch from 4cdfca5 to e62dee6 Compare January 10, 2025 17:41
@WillemKauf (Contributor Author)

Force push to:

  • Remove template solution for subtract_tag{}, add_tag{} and add boilerplate functions add/subtract_dirty/closed_segment_bytes()

@WillemKauf WillemKauf requested a review from andrwng January 10, 2025 19:48