storage: bookkeep dirty_ratio in disk_log_impl #24649 (base: dev)
Conversation
this is awesome.

- do we already have a metric that measures the actual size reduction achieved by compaction for a partition? what i'm thinking about is future scheduling: if the partition with the highest dirty ratio has all unique keys, then compaction won't help at all, and it might be better to compact some other partition with lots of duplicates first. using the compaction ratio achieved in the last round might be a good enough proxy.
- the other thing i was going to ask: there are many places where compacted bytes and dirty bytes are updated. is this error-prone compared to running a for-loop that computes the metrics on demand in the probe handler? is that loop over the segments slow enough that we should avoid it, as this PR does?
We do bookkeep the compaction ratio already in the log. You raise a good point that the dirty ratio on its own may not be the best heuristic for scheduling, given that key cardinality isn't considered. My question would then be: do we want parity with Kafka here, or do we want to try something new with the compaction ratio as a scheduling heuristic?

Is it error-prone? I don't think so; as far as bytes removed/added go, I think I've covered all the code locations in disk_log_impl where this can happen (segment removal, compaction removed bytes, segment rolls). I do share your concern that if a byte-affecting location were missed, these metrics would begin to drift from their true values. So I agree with you that an on-demand for-loop over the segments would be the most verifiably accurate and up-to-date way to report these metrics.
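The on-demand alternative discussed above can be sketched roughly as follows. This is illustrative only: `segment`, its fields, and `compute_dirty_stats` are hypothetical stand-ins, not the actual Redpanda `storage::segment` API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a segment; field names are illustrative.
struct segment {
    uint64_t size_bytes;
    bool is_closed;           // no longer the active (head) segment
    bool is_cleanly_compacted;
};

struct dirty_stats {
    uint64_t dirty_bytes = 0;
    uint64_t closed_bytes = 0;
};

// Recompute the metrics by walking the segment set each time the probe
// is sampled, instead of bookkeeping them at every byte-affecting call
// site. Trades a linear scan per sample for guaranteed consistency.
dirty_stats compute_dirty_stats(const std::vector<segment>& segs) {
    dirty_stats s;
    for (const auto& seg : segs) {
        if (!seg.is_closed) {
            continue; // the open head segment is excluded from the ratio
        }
        s.closed_bytes += seg.size_bytes;
        if (!seg.is_cleanly_compacted) {
            s.dirty_bytes += seg.size_bytes;
        }
    }
    return s;
}
```

The trade-off is exactly the one raised in the thread: the loop is trivially correct but costs O(segments) per probe sample, while the bookkeeping approach is O(1) per sample but relies on every byte-affecting call site being covered.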
I think the dirty ratio still makes sense to track, and like you mention, we can figure out scheduling later. my instinct is that the dirty ratio is probably good enough, but if the scheduling goal is simply to process the biggest-bang-for-the-buck partition first, then taking into account the estimated actual compaction ratio (in combination with the dirty ratio) would seem to make sense. maybe it doesn't make a big difference on real-world data.

yeh. it's probably not something we need to be too concerned with, since there really aren't many cases where we need to update the probe. we should be able to catch new cases in review.
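For concreteness, the combined heuristic floated in this thread could look something like the sketch below. Everything here is hypothetical: the function name, the score formula, and the `last_compaction_ratio` input are assumptions drawn from the discussion, not anything in this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scheduling score: weight the dirty ratio by the size
// reduction achieved in the partition's last compaction round. A
// partition whose keys are all unique compacts at ratio ~1.0 (no
// reduction), so it scores near zero even with a high dirty ratio.
double compaction_score(
  uint64_t dirty_bytes,
  uint64_t closed_bytes,
  double last_compaction_ratio /* bytes_after / bytes_before, in (0, 1] */) {
    if (closed_bytes == 0) {
        return 0.0; // nothing closed yet, nothing to compact
    }
    double dirty_ratio = double(dirty_bytes) / double(closed_bytes);
    // estimated reclaimable fraction of the dirty bytes
    return dirty_ratio * (1.0 - last_compaction_ratio);
}
```

Under this scoring, a half-dirty partition that previously compacted down to 50% of its size would outrank a fully dirty partition of all-unique keys, which matches the "biggest bang for the buck" intuition above.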
src/v/storage/disk_log_impl.cc
template<typename tag>
void disk_log_impl::update_dirty_segment_bytes(uint64_t bytes) {
    apply_tagged_operation<tag>(_dirty_segment_bytes, bytes);
    _probe->set_dirty_segment_bytes(_dirty_segment_bytes);
}

template<typename tag>
void disk_log_impl::update_closed_segment_bytes(uint64_t bytes) {
    apply_tagged_operation<tag>(_closed_segment_bytes, bytes);
    _probe->set_closed_segment_bytes(_closed_segment_bytes);
}
This seems overly complicated for something that could be
void disk_log_impl::add_dirty_segment_bytes(uint64_t bytes) {
    _dirty_segment_bytes += bytes;
    _probe->set_dirty_segment_bytes(_dirty_segment_bytes);
}

void disk_log_impl::add_closed_segment_bytes(uint64_t bytes) {
    _closed_segment_bytes += bytes;
    _probe->set_closed_segment_bytes(_closed_segment_bytes);
}

void disk_log_impl::subtract_dirty_segment_bytes(uint64_t bytes) {
    _dirty_segment_bytes -= std::min(bytes, _dirty_segment_bytes);
    _probe->set_dirty_segment_bytes(_dirty_segment_bytes);
}

void disk_log_impl::subtract_closed_segment_bytes(uint64_t bytes) {
    _closed_segment_bytes -= std::min(bytes, _closed_segment_bytes);
    _probe->set_closed_segment_bytes(_closed_segment_bytes);
}
Yeah, it could be split into these 4 functions, but that is the very boilerplate I was seeking to avoid with the current solution.
I could see an argument for the simple boilerplate over the complexity of the template solution, though in my opinion it isn't that much of a complexity add.
If you much prefer the boilerplate, I'd be happy to change it.
My preference is boilerplate over templates when it's this simple to write. Maybe I'd feel differently if there were more than two operations (add/subtract).
Removed the template solution in favor of the boilerplate.
Adds the book-keeping variables `_dirty/closed_segment_bytes` to `disk_log_impl`, as well as some getter/setter functions. These functions will be used throughout `disk_log_impl` where required (segment rolling, compaction, segment eviction) to track the bytes contained in dirty and closed segments.
Uses the added functions `update_dirty/closed_segment_bytes()` in the required locations within `disk_log_impl` in order to bookkeep the dirty ratio. Bytes can be either removed or added by rolling new segments, compaction, and retention enforcement.
The dirty ratio of a log is defined as the ratio between the number of bytes in "dirty" segments and the total number of bytes in closed segments.
Dirty segments are closed segments which have not yet been cleanly compacted; i.e., duplicates for keys in such a segment could be found in the prefix of the log up to and including that segment.
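The definition above reduces to a single division with one edge case worth calling out. A minimal sketch (the free function and its parameter names are illustrative, not the PR's API, which keeps these as cached members of `disk_log_impl`):

```cpp
#include <cassert>
#include <cstdint>

// dirty_ratio = dirty_segment_bytes / closed_segment_bytes.
// A log with no closed segments yet has nothing eligible for
// compaction, so we report its dirty ratio as 0 rather than
// dividing by zero.
double dirty_ratio(
  uint64_t dirty_segment_bytes, uint64_t closed_segment_bytes) {
    if (closed_segment_bytes == 0) {
        return 0.0;
    }
    return double(dirty_segment_bytes) / double(closed_segment_bytes);
}
```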
Add book-keeping to `disk_log_impl` in order to cache both `_dirty_segment_bytes` and `_closed_segment_bytes`, which allows us to calculate the dirty ratio and add observability for it in `storage::probe`.
In the future, this could be used in combination with a compaction configuration a la min.cleanable.dirty.ratio to schedule compaction.
Backports Required
Release Notes
Improvements
- Add `dirty_segment_bytes` and `closed_segment_bytes` metrics to the storage layer.