-
Notifications
You must be signed in to change notification settings - Fork 225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Two-node vfs2 unit test #674
Conversation
Signed-off-by: Cole Miller <[email protected]>
"Apply" is the verb we use for what happens when a raft log entry is committed, so avoid it for these functions that are associated with appending and truncating entries. Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
This happens whenever we write part of a transaction to the WAL after the first frame header. Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
We don't move to ACTIVE or HIDDEN if no frames were written, so we should stay in BASE here as well. Signed-off-by: Cole Miller <[email protected]>
vfs2_wal_frame carries the commit marker explicitly, so it doesn't match COMMAND_FRAMES, which only carries the page numbers and page data. dqlite_vfs_frame is defined with this in mind. To get the commit markers on the follower side, we start with the size of the main file in pages and take the maximum with each page number up to the commit frame in the WAL. This matches what the in-memory VFS does. Signed-off-by: Cole Miller <[email protected]>
A follower that first touches a database when it receives a COMMAND_FRAMES entry needs to initialize the page size at that point (absent a guarantee that we run `PRAGMA page_size` immediately after sqlite3_open). Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
The length of the polled transaction is returned in the WAL slice. Signed-off-by: Cole Miller <[email protected]>
By abbreviating two common operations. Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
This ties the destruction of the shared memory to that of the owning entry, because they are created together. If shm finalization is tied to a call to xShmUnmap, then it can't happen unless SQLite called xShmMap in the first place, and this is not guaranteed to happen. For example, if we start and stop a follower node without ever running a checkpoint or transaction, SQLite will never get the change to call xShmMap. Signed-off-by: Cole Miller <[email protected]>
0b5f007
to
afa657e
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #674 +/- ##
==========================================
+ Coverage 77.56% 78.15% +0.59%
==========================================
Files 197 197
Lines 27638 27962 +324
Branches 5486 5532 +46
==========================================
+ Hits 21438 21855 +417
+ Misses 4326 4262 -64
+ Partials 1874 1845 -29 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Cole this is looking very good! The new comments and the commit descriptions are super helpful and make the code much easier to follow (though you extensive explanations might have also helped :P). I left a bunch of comments but there are a lot of nitpicks and questions.
The only part that I am not sure I understand is the removal of the FLUSH state, and I couldn't find the reasoning behind that in the commit or in the code. It will probably make sense once we discuss it offline and/or once the new diagram is in place.
src/vfs2.c
Outdated
@@ -1727,6 +1727,7 @@ int vfs2_add_uncommitted(sqlite3_file *file, | |||
* (wal_cursor). */ | |||
e->shm_locks[WAL_WRITE_LOCK] = VFS2_EXCLUSIVE; | |||
|
|||
uint32_t start = e->wal_cursor; | |||
struct wal_index_full_hdr *ihdr = get_full_hdr(e); | |||
uint32_t mx = ihdr->basic[0].mxFrame; | |||
if (mx > 0 && ihdr->nBackfill == mx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Nit]: Should we add start != 0 && ...
for clarity with the condition below? That way we don't have to remember that mx < wal_cursor && mx > 0 => wal_cursor > 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks, this could use refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic has been substantially re-organized in a follow-up commit (43ec47c), hopefully it's more clear now!
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
As a supplement to the observability improvements in this PR, here's a chronoscope.yaml: .p1: &tick_parser_positions
id: 10
pid: 8
type: 6
time: 1
event: 11
.p2: &rel_parser_positions
type: 6
orig_pid: 8
dest_pid: 10
orig_id: 12
dest_id: 14
.p3: &attr_parser_positions
id: 7
pid: 5
type: 3
name: 8
value: 1
tick:
- type: wtx
pos: *tick_parser_positions
- type: wal
pos: *tick_parser_positions
- type: wlk
pos: *tick_parser_positions
- type: shm
pos: *tick_parser_positions
- type: ckpt
pos: *tick_parser_positions
relation:
- type: wtx-to-wtx
pos: *rel_parser_positions
- type: wtx-to-wal
pos: *rel_parser_positions
- type: wtx-to-wlk
pos: *rel_parser_positions
- type: wtx-to-shm
pos: *rel_parser_positions
- type: wtx-to-ckpt
pos: *rel_parser_positions
attr: |
This should make the logic and resource management a bit clearer. Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Pass them a node and a database name to cut down on the amount of snprintf in the tests themselves. Signed-off-by: Cole Miller <[email protected]>
This matches the setup we use in dqlite proper with the old vfs (see leader.c and db.c). It also just makes good sense if we're trying to exercise control over when checkpoints happen. Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Using a single allocation makes the code simpler, but is unsound: SQLite holds on to the pointers obtained from xShmMap and expects them to remain valid as more regions are mapped. This partially reverts commit 7f20503. Signed-off-by: Cole Miller <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First part of the review, I am not sure if GH is able to keep up when switching these many commits, so I am posting it in two parts.
* Intended for use by unit tests. The arguments are main file handles | ||
* (SQLITE_FCNTL_FILE_POINTER). | ||
*/ | ||
void vfs2_ut_sm_relate(sqlite3_file *orig, sqlite3_file *targ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are adding code for the tests here because of visibility I wonder if there is a better solution like making the inner fields visible for the tests, an attribute that makes this function uncallable from main code, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reluctant to move our struct file
or struct entry
into vfs2.h---it seems good to maintain the conciseness of that file and keep all the implementation details of vfs2 in one place. As for making the function uncallable outside tests, I agree it would be nice to be able to ifdef out bits of code so that they only exist if dqlite is being built for testing (and this is a good candidate). We'd need to tinker with the build system to implement this, because right now we don't (always) compile source files separately for the installable artifacts and the tests.
I think a good local improvement for this function specifically would be to just expose an accessor for the wtx sm, which will be useful also outside the unit tests (e.g. we'll want to sm_relate it to other state machines related to write transactions). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished the review. I think the code is looking very good Cole and kudos for all the comments! Even though it has a lot of moving pieces that you need to keep track of I found the code very easy to follow.
No major comments, mostly questions and suggestions.
*/ | ||
static struct shm_region *shm_grow(struct shm *shm) | ||
{ | ||
int index = shm->num_regions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you can move this closer to usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this---it's used by shm_init which is just two functions down (and the function in the middle, write_basic_hdr_cksums, is also used by shm_init). (I do think the overall order of functions in the file needs tidying up, just not sure how to achieve a quick improvement in this case.)
src/vfs2.c
Outdated
@@ -420,15 +452,10 @@ static void write_basic_hdr_cksums(struct wal_index_basic_hdr *bhdr) | |||
*/ | |||
static int shm_init(struct shm *shm, struct wal_hdr whdr) | |||
{ | |||
shm->regions = sqlite3_malloc64(VFS2_WAL_INDEX_REGION_SIZE); | |||
if (shm->regions == NULL) { | |||
struct shm_region *r0 = shm_grow(shm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should initialize shm here to zero just in case the caller is using uninitialized memory and that leads to bugs, I cannot think of any use case that requires setting something by hand.
BTW: another custom vector type we just created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should initialize shm here to zero just in case the caller is using uninitialized memory and that leads to bugs, I cannot think of any use case that requires setting something by hand.
I don't think this is true of any of the callsites right now, but better safe than sorry. I'll add the defensive initialization.
BTW: another custom vector type we just created.
I disclaim responsibility for this one since it's the same implementation from the old VFS :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this again, I think maybe it would make more sense to have an invariant for the shm state machine that asserts that the other fields are zeroed when the state is SHM_EMPTY. Defensively zeroing them here would suggest that we expect the struct to be filled with garbage, but that's not the case, we expect it to be "initialized as empty" already. (Maybe shm_init
is not the best name for that reason.)
Signed-off-by: Cole Miller <[email protected]>
This matches the name that SQLite uses for this magic number. Signed-off-by: Cole Miller <[email protected]>
It's not compatible with how we determine WAL-cur at startup, and isn't load-bearing, so let's ditch it. Also tighten up try_read_wal_hdr so that weird WALs with 0 < n < WAL_HDR_SIZE bytes are treated as corrupt. Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Also check that there is an outstanding map before decrementing the refcount. Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
This PR contains a new unit test for vfs2 that simulates replicating a transaction between a leader and a follower, plus supporting fixes:
vfs2_wal_frame
withdqlite_vfs_frame
.xRandomness
from theunix
VFS to randomize the WAL header salts.mxFrame
increases on the follower.Signed-off-by: Cole Miller [email protected]