-
Notifications
You must be signed in to change notification settings - Fork 225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Snapshot pool integration, first version #673
Snapshot pool integration, first version #673
Conversation
a1ab273
to
fd908d3
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #673 +/- ##
==========================================
+ Coverage 77.84% 77.87% +0.02%
==========================================
Files 197 197
Lines 27761 27953 +192
Branches 5493 5529 +36
==========================================
+ Hits 21611 21767 +156
- Misses 4333 4346 +13
- Partials 1817 1840 +23 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first series of comments to test_snapshot.c. review is in progress.
src/raft/recv_install_snapshot.c
Outdated
if (sm_state(&leader->rpc.sm) != RPC_FILLED) { | ||
return false; | ||
} | ||
return state == LS_PAGE_READ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it's worth defining a macro/function IN()
with the following semantics: IN(state, (LS_PAGE_READ, ..., LS_CHECK_F_HAS_SIGS))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the implementation locally to use the IN
macro and the only problem is that the macro might be difficult to read and debug, thoughts?
#define IN_1(E, X) E == X
#define IN_2(E, X, ...) E == X || IN_1(E,__VA_ARGS__)
#define IN_3(E, X, ...) E == X || IN_2(E,__VA_ARGS__)
#define IN_4(E, X, ...) E == X || IN_3(E,__VA_ARGS__)
#define IN_5(E, X, ...) E == X || IN_4(E,__VA_ARGS__)
#define IN_6(E, X, ...) E == X || IN_5(E,__VA_ARGS__)
#define IN_7(E, X, ...) E == X || IN_6(E,__VA_ARGS__)
#define IN_8(E, X, ...) E == X || IN_7(E,__VA_ARGS__)
#define IN_9(E, X, ...) E == X || IN_8(E,__VA_ARGS__)
#define GET_IN_MACRO(_1,_2,_3,_4,_5,_6,_7,_8,_9,NAME,...) NAME
#define IN(E, ...) \
GET_IN_MACRO(__VA_ARGS__,IN_9,IN_8,IN_7,IN_6,IN_5,IN_4,IN_3,IN_2,IN_1)(E,__VA_ARGS__)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss this with @cole-miller
/* Decorates the callback used when the pool work is done to set the test | ||
* fixture flag to true, then calls the original callback.*/ | ||
static void test_fixture_work_cb(pool_work_t *w) { | ||
global_fixture.work_done = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a good time to invent an implementation of is_main_thread()
macro or a function is much closer than we thought... When I looked an wait_work()
it wasn't obvious that the spin-lock runs in the same thread with test_fixture_work_cb()
.
Please put a // PRE(is_main_thread())
to both functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will also add some comments to the global_fixture
and these functions.
test/raft/unit/test_snapshot.c
Outdated
data->cb(data->s, 0); | ||
} | ||
|
||
int uv_sender_send_op(struct sender *s, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm not mistaken, uv_sender_*()
functions can be static
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always forget static
and const
, always.
test/raft/unit/test_snapshot.c
Outdated
.ht_create = pool_ht_create_op, | ||
.work_queue = pool_work_queue_op, | ||
.sender_send = uv_sender_send_op, | ||
.is_main_thread = pool_is_main_thread_op, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better not to override is_main_thread()
.
You have an implementation of pool_is_main_thread_op()
Maybe, it's worth thinking about an implementation designed around uv_key_create()
. We need to check if it's a good idea to mix __tls
and uv_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is __tls
? EDIT: probably "thread-local-storage".
MUNIT_UNUSED void *user_data) | ||
{ | ||
/* Prevent hangs. */ | ||
alarm(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so if you do some changes and the tests breaks in a way that makes wait_work
or wait_msg_sent
stall, then the execution is terminated after 2s and the test fails but you can still see the results of the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this patch.
src/raft/recv_install_snapshot.c
Outdated
if (sm_state(&leader->rpc.sm) != RPC_FILLED) { | ||
return false; | ||
} | ||
return state == LS_PAGE_READ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss this with @cole-miller
PRE(incoming != M_MSG_SENT && incoming != M_TIMEOUT && | ||
incoming != M_WORK_DONE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need it here?
It's quite clear from the lines above that this assumption is obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason is that from this line on this is a valid pointer. If you were to do msg->type
on M_MSG_SENT
, or any of the others, the program would crash.
if (UNLIKELY(rc != 0)) { | ||
sm_move(&rpc->sm, RPC_ERROR); | ||
return; | ||
} | ||
|
||
follower_tick(follower, M_MSG_SENT); | ||
} | ||
|
||
static bool is_a_trigger_leader(const struct leader *leader, const struct raft_message *incoming) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it seem less noisy? What do you think about such a style?
static bool is_a_trigger_leader(const struct leader *leader, const struct raft_message *incoming)
{
int state = sm_state(&leader->sm);
return
ergo(incoming == M_WORK_DONE,
IN(state, (LS_HT_WAIT, LS_PAGE_SENT,
LS_PERSISTED_SIG_PART, LS_PAGE_READ))) &&
ergo(incoming == M_MSG_SENT,
sm_state(&leader->rpc.sm) != RPC_FILLED &&
IN(state, (LS_PAGE_READ, LS_SNAP_DONE, LS_F_NEEDS_SNAP,
LS_REQ_SIG_LOOP, LS_CHECK_F_HAS_SIGS))) &&
ergo(incoming == M_TIMEOUT,
IN(state, (LS_PAGE_READ, LS_SNAP_DONE, LS_F_NEEDS_SNAP,
LS_REQ_SIG_LOOP, LS_CHECK_F_HAS_SIGS,
LS_WAIT_SIGS))) &&
ergo(state == LS_F_ONLINE,
incoming->type == RAFT_IO_APPEND_ENTRIES_RESULT) &&
ergo(sm_state(&leader->rpc.sm) != RPC_SENT, false) &&
ergo(state == LS_PAGE_READ,
IN(incoming->type, (RAFT_IO_INSTALL_SNAPSHOT_CP_RESULT
RAFT_IO_INSTALL_SNAPSHOT_MV_RESULT))) &&
ergo(IN(state, (LS_SNAP_DONE, LS_F_NEEDS_SNAP)),
incoming->type == RAFT_IO_INSTALL_SNAPSHOT_RESULT) &&
ergo(IN(state, (LS_REQ_SIG_LOOP, LS_CHECK_F_HAS_SIGS)),
incoming->type == RAFT_IO_SIGNATURE_RESULT);
}
cc @cole-miller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are going for this style I would prefer to have two distinct blocks because in the first three ergos
the pointer is not valid M_WORK_DONE
, M_MSG_SENT
and M_TIMEOUT
are equal to 1
, 2
and 3
, which means that doing incoming->type
will crash the program.
My fear is that when we come back and do changes in the future we miss that bit about the ordering because it is not explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments after reading through. This PR is really well-organized, nice work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is coming along nicely!
1fc2144
to
2ce00ea
Compare
2ce00ea
to
defc3f3
Compare
This PR integrates the pool with the snapshot installation. This is a first version without the real implementation for the callbacks and I/O. It solves a few issues found along the way, namely it includes:
See the commit list for more information as the functionality is split into self-contained units.