Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two-node vfs2 unit test #674

Closed
wants to merge 73 commits into from

Conversation

cole-miller
Copy link
Contributor

@cole-miller cole-miller commented Jul 18, 2024

This PR contains a new unit test for vfs2 that simulates replicating a transaction between a leader and a follower, plus supporting fixes:

  • Remove FLUSH from the state machine and vfs2_commit_barrier from the public API.
  • Several fixes picked from the previous integration PR (see individual commits for more details):
    • Replace vfs2_wal_frame with dqlite_vfs_frame.
    • Calculate and set the checksum fields in the WAL header.
    • Initialize the page size for a database, if not already set, when calling vfs2_add_uncommitted.
    • Call sqlite3_randomness instead of xRandomness from the unix VFS to randomize the WAL header salts.
  • Updating the hash table in the first shm region of the WAL-index whenever mxFrame increases on the follower.
  • Properly writing the WAL header when the WAL is written for the first time.
  • Finalizing the memory allocated for the WAL-index when an entry is closed, rather than when xShmUnmap is last called.

Signed-off-by: Cole Miller [email protected]

"Apply" is the verb we use for what happens when a raft log entry is
committed, so avoid it for these functions that are associated with
appending and truncating entries.

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
This happens whenever we write part of a transaction to the WAL after
the first frame header.

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
We don't move to ACTIVE or HIDDEN if no frames were written, so we
should stay in BASE here as well.

Signed-off-by: Cole Miller <[email protected]>
vfs2_wal_frame carries the commit marker explicitly, so it doesn't match
COMMAND_FRAMES, which only carries the page numbers and page data.
dqlite_vfs_frame is defined with this in mind. To get the commit markers
on the follower side, we start with the size of the main file in pages
and take the maximum with each page number up to the commit frame in the
WAL. This matches what the in-memory VFS does.

Signed-off-by: Cole Miller <[email protected]>
A follower that first touches a database when it receives a
COMMAND_FRAMES entry needs to initialize the page size at that point
(absent a guarantee that we run `PRAGMA page_size` immediately after
sqlite3_open).

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
The length of the polled transaction is returned in the WAL slice.

Signed-off-by: Cole Miller <[email protected]>
By abbreviating two common operations.

Signed-off-by: Cole Miller <[email protected]>
This ties the destruction of the shared memory to that of the owning
entry, because they are created together. If shm finalization is tied to
a call to xShmUnmap, then it can't happen unless SQLite called xShmMap
in the first place, and this is not guaranteed to happen. For example,
if we start and stop a follower node without ever running a checkpoint
or transaction, SQLite will never get the change to call xShmMap.

Signed-off-by: Cole Miller <[email protected]>
@cole-miller cole-miller marked this pull request as ready for review July 19, 2024 01:52
@cole-miller cole-miller requested a review from just-now July 19, 2024 01:53
Copy link

codecov bot commented Jul 19, 2024

Codecov Report

Attention: Patch coverage is 71.76471% with 144 lines in your changes missing coverage. Please review.

Project coverage is 78.15%. Comparing base (ee07596) to head (3951d1e).
Report is 100 commits behind head on master.

Files with missing lines Patch % Lines
src/vfs2.c 65.63% 80 Missing and 64 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #674      +/-   ##
==========================================
+ Coverage   77.56%   78.15%   +0.59%     
==========================================
  Files         197      197              
  Lines       27638    27962     +324     
  Branches     5486     5532      +46     
==========================================
+ Hits        21438    21855     +417     
+ Misses       4326     4262      -64     
+ Partials     1874     1845      -29     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Cole this is looking very good! The new comments and the commit descriptions are super helpful and make the code much easier to follow (though you extensive explanations might have also helped :P). I left a bunch of comments but there are a lot of nitpicks and questions.

The only part that I am not sure I understand is the removal of the FLUSH state, and I couldn't find the reasoning behind that in the commit or in the code. It will probably make sense once we discuss it offline and/or once the new diagram is in place.

src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
test/unit/test_vfs2.c Outdated Show resolved Hide resolved
test/unit/test_vfs2.c Show resolved Hide resolved
src/vfs2.c Outdated
@@ -1727,6 +1727,7 @@ int vfs2_add_uncommitted(sqlite3_file *file,
* (wal_cursor). */
e->shm_locks[WAL_WRITE_LOCK] = VFS2_EXCLUSIVE;

uint32_t start = e->wal_cursor;
struct wal_index_full_hdr *ihdr = get_full_hdr(e);
uint32_t mx = ihdr->basic[0].mxFrame;
if (mx > 0 && ihdr->nBackfill == mx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nit]: Should we add start != 0 && ... for clarity with the condition below? That way we don't have to remember that mx < wal_cursor && mx > 0 => wal_cursor > 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks, this could use refactoring.

Copy link
Contributor Author

@cole-miller cole-miller Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic has been substantially re-organized in a follow-up commit (43ec47c), hopefully it's more clear now!

@cole-miller
Copy link
Contributor Author

As a supplement to the observability improvements in this PR, here's a chronoscope.yaml:

.p1: &tick_parser_positions
  id: 10
  pid: 8
  type: 6
  time: 1
  event: 11

.p2: &rel_parser_positions
  type: 6
  orig_pid: 8
  dest_pid: 10
  orig_id: 12
  dest_id: 14

.p3: &attr_parser_positions
  id: 7
  pid: 5
  type: 3
  name: 8
  value: 1

tick:
  - type: wtx
    pos: *tick_parser_positions
  - type: wal
    pos: *tick_parser_positions
  - type: wlk
    pos: *tick_parser_positions
  - type: shm
    pos: *tick_parser_positions
  - type: ckpt
    pos: *tick_parser_positions

relation:
  - type: wtx-to-wtx
    pos: *rel_parser_positions
  - type: wtx-to-wal
    pos: *rel_parser_positions
  - type: wtx-to-wlk
    pos: *rel_parser_positions
  - type: wtx-to-shm
    pos: *rel_parser_positions
  - type: wtx-to-ckpt
    pos: *rel_parser_positions

attr:

This should make the logic and resource management a bit clearer.

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Pass them a node and a database name to cut down on the amount of
snprintf in the tests themselves.

Signed-off-by: Cole Miller <[email protected]>
This matches the setup we use in dqlite proper with the old vfs (see
leader.c and db.c). It also just makes good sense if we're trying to
exercise control over when checkpoints happen.

Signed-off-by: Cole Miller <[email protected]>
Using a single allocation makes the code simpler, but is unsound: SQLite
holds on to the pointers obtained from xShmMap and expects them to
remain valid as more regions are mapped.

This partially reverts commit 7f20503.

Signed-off-by: Cole Miller <[email protected]>
Copy link
Contributor

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First part of the review, I am not sure if GH is able to keep up when switching these many commits, so I am posting it in two parts.

src/vfs2.c Show resolved Hide resolved
src/vfs2.c Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
* Intended for use by unit tests. The arguments are main file handles
* (SQLITE_FCNTL_FILE_POINTER).
*/
void vfs2_ut_sm_relate(sqlite3_file *orig, sqlite3_file *targ);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are adding code for the tests here because of visibility I wonder if there is a better solution like making the inner fields visible for the tests, an attribute that makes this function uncallable from main code, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reluctant to move our struct file or struct entry into vfs2.h---it seems good to maintain the conciseness of that file and keep all the implementation details of vfs2 in one place. As for making the function uncallable outside tests, I agree it would be nice to be able to ifdef out bits of code so that they only exist if dqlite is being built for testing (and this is a good candidate). We'd need to tinker with the build system to implement this, because right now we don't (always) compile source files separately for the installable artifacts and the tests.

I think a good local improvement for this function specifically would be to just expose an accessor for the wtx sm, which will be useful also outside the unit tests (e.g. we'll want to sm_relate it to other state machines related to write transactions). What do you think?

src/vfs2.c Show resolved Hide resolved
src/vfs2.c Show resolved Hide resolved
src/vfs2.c Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
Copy link
Contributor

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished the review. I think the code is looking very good Cole and kudos for all the comments! Even though it has a lot of moving pieces that you need to keep track of I found the code very easy to follow.

No major comments, mostly questions and suggestions.

src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
src/vfs2.c Outdated Show resolved Hide resolved
*/
static struct shm_region *shm_grow(struct shm *shm)
{
int index = shm->num_regions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you can move this closer to usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this---it's used by shm_init which is just two functions down (and the function in the middle, write_basic_hdr_cksums, is also used by shm_init). (I do think the overall order of functions in the file needs tidying up, just not sure how to achieve a quick improvement in this case.)

src/vfs2.c Show resolved Hide resolved
src/vfs2.c Outdated
@@ -420,15 +452,10 @@ static void write_basic_hdr_cksums(struct wal_index_basic_hdr *bhdr)
*/
static int shm_init(struct shm *shm, struct wal_hdr whdr)
{
shm->regions = sqlite3_malloc64(VFS2_WAL_INDEX_REGION_SIZE);
if (shm->regions == NULL) {
struct shm_region *r0 = shm_grow(shm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should initialize shm here to zero just in case the caller is using uninitialized memory and that leads to bugs, I cannot think of any use case that requires setting something by hand.

BTW: another custom vector type we just created.

Copy link
Contributor Author

@cole-miller cole-miller Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should initialize shm here to zero just in case the caller is using uninitialized memory and that leads to bugs, I cannot think of any use case that requires setting something by hand.

I don't think this is true of any of the callsites right now, but better safe than sorry. I'll add the defensive initialization.

BTW: another custom vector type we just created.

I disclaim responsibility for this one since it's the same implementation from the old VFS :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again, I think maybe it would make more sense to have an invariant for the shm state machine that asserts that the other fields are zeroed when the state is SHM_EMPTY. Defensively zeroing them here would suggest that we expect the struct to be filled with garbage, but that's not the case, we expect it to be "initialized as empty" already. (Maybe shm_init is not the best name for that reason.)

This matches the name that SQLite uses for this magic number.

Signed-off-by: Cole Miller <[email protected]>
It's not compatible with how we determine WAL-cur at startup, and isn't
load-bearing, so let's ditch it. Also tighten up try_read_wal_hdr so
that weird WALs with 0 < n < WAL_HDR_SIZE bytes are treated as corrupt.

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Also check that there is an outstanding map before decrementing the
refcount.

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants