Replication Snapshot and Compact #368
Conversation
Codecov Report
Attention: Patch coverage is …

@@            Coverage Diff             @@
##           master     #368      +/-   ##
==========================================
+ Coverage   57.76%   57.84%   +0.07%
==========================================
  Files         108      107       -1
  Lines        9519     9563      +44
  Branches     1233     1231       -2
==========================================
+ Hits         5499     5532      +33
- Misses       3481     3493      +12
+ Partials      539      538       -1
@@ -106,7 +106,7 @@ void LogStoreService::start(bool format) {
 }

 void LogStoreService::stop() {
-    device_truncate(nullptr, true, false);
+    // device_truncate(nullptr, true, false);
So device_truncate() is called only from the resource manager. Should we call it even when there is no resource crunch?
This device_truncate calls into log_dev to do the actual truncate. We should not do that without first setting the compact_lsn in the logstore (through nuraft::compact), because only nuraft knows the correct compact_lsn. We already persist the compact_lsn in the superblock, so after a reboot it catches up on the next timer schedule or when the next critical alert is hit. I don't quite get what the 'crunch issue' in your comment refers to.
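A minimal sketch of the flow described here; the identifiers (set_compact_lsn, the signature of compact) are illustrative assumptions, not the actual HomeStore API:

    // Sketch only: nuraft decides the compact point and hands it down;
    // the actual log_dev truncate happens later, batched by the resource manager.
    void HomeRaftReplLog::compact(ulong compact_lsn) {
        // remember (and persist in the superblock) the LSN up to which truncate is safe
        m_log_store->set_compact_lsn(compact_lsn);
        // no immediate log_dev truncate here; the resource-manager timer drives it
    }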
res_mgr_timer_ms * 1000 * 1000, true /* recurring */, nullptr /* cookie */, iomgr::reactor_regex::all_worker,
[this](void*) {
    // all resource timely audit routine should arrive here;
    this->trigger_truncate();
The periodic execution is affecting the UTs too. Should we add a condition so that the truncate is triggered only when the vdev size exceeds some threshold?
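Something like the guard being suggested here, as an illustrative sketch (journal_truncate_threshold_pct is an assumed knob, not existing code):

    // fire the batch truncate only when journal usage crosses a configured threshold
    if (m_vdev.used_size() * 100 >= journal_truncate_threshold_pct * m_vdev.size()) {
        this->trigger_truncate();
    }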
Can you specify how they are affected? Sorry, I probably didn't catch the question. Right now the only threshold we have is the critical one, and this timer performs a batch truncate for all the log stores in HomeStore, system wide.
There are some UTs, mostly in test_journal_vdev.cpp, that carefully verify the journal offset/size etc. in a controlled manner (append some logs, check, append some logs, check, truncate, check).
I think the concern here is that once time-based truncation is introduced, UTs following this pattern are no longer possible.
This is the reason we didn't do the reserve at log_dev and instead let nuraft do it: we don't want to make changes in the log_dev layer, because that would break a lot of assumptions the current UTs make in test_journal_vdev and test_log_dev.
I spent more than a week on that journey and finally realized that to do the reserve in logdev (crying out loud), we would basically need to redesign all the UTs in those layers...
In summary, the change in this PR won't break any assumption in the layers below log store. Multiple truncations across multiple timer events won't cause any side effect (even with no snapshot/compact in between), because they are protected by the safe_truncation_boundary, which is also governed by the compact_lsn (whose only source is the nuraft::compact call).
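As a sketch of that protection, using the min() formula quoted in the next comment (function and parameter names are illustrative, not the exact HomeStore code):

    #include <algorithm>
    #include <cstdint>

    // Illustrative helper: the truncation point is capped by the persisted
    // compact_lsn (set only via nuraft::compact), so repeated timer-driven
    // truncations cannot pass what nuraft has compacted.
    int64_t safe_truncation_lsn(int64_t last_lsn, int64_t reserved, int64_t compact_lsn) {
        return std::min(last_lsn - reserved, compact_lsn);
    }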
The concern is min(last - reserved, compact_lsn): if nuraft hasn't called compact yet, compact_lsn is 0 and the periodic timer will truncate all entries.
#endif

// high watermark check for the entire journal vdev;
if (resource_mgr().check_journal_vdev_size(m_vdev.used_size(), m_vdev.size())) {
As m_event_cb is nullptr, how is this expected to link to truncation?
It is via m_journal_vdev_exceed_cb, registered here: https://github.com/yamingk/HomeStore/blob/yk_repl_new/src/lib/device/journal_vdev.cpp#L58
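For reference, the registration being pointed at looks roughly like this sketch (the registration function and trigger names are assumptions; the exact code is at the link above):

    // Sketch of the hookup in journal_vdev's ctor: the resource manager's
    // detection path invokes this callback, rather than m_event_cb.
    resource_mgr().register_journal_vdev_exceed_cb([this]() {
        // reached when check_journal_vdev_size() sees usage above the threshold
        trigger_truncate();   // hypothetical trigger; actual call chain may differ
    });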
Oh ok, it is a bit weird, as the code path here reads like:

    if (DETECTION) {
        update_metrics
        do callback to trigger action
    }

however m_event_cb is nullptr and the actual callback is done inside the detection part (resource_mgr().check_journal_vdev_size).
Can I read this as your TODO to move to m_event_cb?
Yes. The goal is to move all the layer-specific registered callbacks in the resource manager to the general event_cb. Here in journal vdev, I think log store is the only consumer, and we can see whether there is still any use case where this event_cb is needed, since most of the work is done via the resource manager.
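That direction might look something like this sketch (the event type and registration API are assumptions, not existing HomeStore code):

    // Sketch: one generic event callback in resource mgr, replacing per-layer
    // registered callbacks such as m_journal_vdev_exceed_cb.
    enum class rm_event_t { journal_vdev_high_watermark /* , ... */ };

    resource_mgr().register_event_cb([](rm_event_t evt) {
        if (evt == rm_event_t::journal_vdev_high_watermark) {
            // trigger system-wide logstore truncation here
        }
    });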
lgtm
Reviewers, please focus on resource mgr, home raft repl log, raft repl dev and JournalVDev.
There is not much logic change in the other layers/files.
Changes:
Note: this is not a generic snapshot that we will implement for other use cases (e.g. HomeBlks). This snapshot creation is the bare minimum for the nuraft snapshot and the HomeObject use case.
In HomeStore, the compact_lsn is remembered, and later the resource manager triggers truncation of all logstores in the system.
HomeStore doesn't reserve any extra log items. Reservation is done through dynamic config, which is used to initialize the raft server on startup.
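The reservation knob maps onto NuRaft's raft_params at startup; a sketch of the wiring (the dynamic-config keys shown are assumptions):

    nuraft::raft_params params;
    // keep this many log entries below the last snapshot (instead of reserving in log_dev)
    params.reserved_log_items_ = HS_DYNAMIC_CONFIG(consensus.num_raft_logs_resv);  // assumed config key
    // create a snapshot every N appended log entries
    params.snapshot_distance_ = HS_DYNAMIC_CONFIG(consensus.snapshot_distance);    // assumed config key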
Testing:
Snapshot creation and truncation can be verified from the logs with the command line below.
./bin/test_raft_repl_dev --gtest_filter=Snapshot_and_Compact --log_mods replication:debug --num_io=999999 --snapshot_distance=200 --num_raft_logs_resv=20000 --res_mgr_audit_timer_ms=120000
It also needs to be added to the replication long-duration test.