Skip to content
This repository has been archived by the owner on Oct 18, 2023. It is now read-only.

Commit

Permalink
bottomless: checkpoint before initializing bottomless (#726)
Browse files Browse the repository at this point in the history
* bottomless: checkpoint before initializing bottomless

Due to a bug in wallog recovery, we need to checkpoint
the database *strictly before* we initialize bottomless.
A proper fix should be to use our virtual WAL methods
for checkpointing, but there's an initialization cycle
and resolving it will be a larger patch - a connection
with WAL methods wants us to already have the replication
logger created, and replication logger wants to perform
a checkpoint on creation.
As a mid-term solution, we just perform the forbidden
regular checkpoint before bottomless is ever launched.
Combined with the fact that bottomless treats existing
databases as the source of truth, it just creates
a new backup generation and continues working properly.

The following scenario was buggy before:
 1. We leave the db in the state where some WAL frames
   still exist in data-wal file
 2. We restart sqld
 3. bottomless is initialized, it reuses the existing db
    and WAL frames and uploads them to S3, to avoid
    creating a potentially costly snapshot
 4. ReplicationLogger::new() incorrectly calls
    sqlite3_wal_checkpoint which swipes data from under
    bottomless.
 5. Bottomless thinks it hasn't checkpointed and continues
    to write WAL frames. As a result, it writes garbage
    to S3, because the db was checkpointed outside
    of bottomless control

* fmt fix
  • Loading branch information
psarna committed Oct 3, 2023
1 parent 99ff0c9 commit 40cb9be
Show file tree
Hide file tree
Showing 3 changed files with 21 additions and 3 deletions.
7 changes: 5 additions & 2 deletions bottomless/src/replicator.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1302,8 +1302,11 @@ impl Replicator {
},
};

tracing::info!("Restoring from generation {}", generation);
self.restore_from(generation, timestamp).await
let (action, recovered) = self.restore_from(generation, timestamp).await?;
tracing::info!(
"Restoring from generation {generation}: action={action:?}, recovered={recovered}"
);
Ok((action, recovered))
}

pub async fn get_last_consistent_frame(&self, generation: &Uuid) -> Result<u32> {
Expand Down
12 changes: 12 additions & 0 deletions sqld/src/namespace/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -657,6 +657,18 @@ impl Namespace<PrimaryDatabase> {

tokio::fs::create_dir_all(&db_path).await?;

// FIXME: due to a bug in logger::checkpoint_db we call regular checkpointing code
// instead of our virtual WAL one. It's a bit tangled to fix right now, because
// we need WAL context for checkpointing, and WAL context needs the ReplicationLogger...
// So instead we checkpoint early, *before* bottomless gets initialized. That way
// we're sure bottomless won't try to back up any existing WAL frames and will instead
// treat the existing db file as the source of truth.
if config.bottomless_replication.is_some() {
tracing::debug!("Checkpointing before initializing bottomless");
crate::replication::primary::logger::checkpoint_db(&db_path.join("data"))?;
tracing::debug!("Checkpointed before initializing bottomless");
}

let bottomless_replicator = if let Some(options) = &config.bottomless_replication {
let options = make_bottomless_options(options, name.clone());
let (replicator, did_recover) =
Expand Down
5 changes: 4 additions & 1 deletion sqld/src/replication/primary/logger.rs
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@ unsafe impl WalHook for ReplicationLoggerHook {
assert_eq!(page_size, 4096);
let wal_ptr = wal as *mut _;
let last_valid_frame = wal.hdr.mxFrame;
tracing::trace!("Last valid frame before applying: {last_valid_frame}");
let ctx = Self::wal_extract_ctx(wal);

let mut frame_count = 0;
Expand Down Expand Up @@ -948,7 +949,9 @@ impl ReplicationLogger {
}
}

fn checkpoint_db(data_path: &Path) -> anyhow::Result<()> {
// FIXME: calling rusqlite::Connection's checkpoint here is a bug,
// we need to always call our virtual WAL methods.
pub fn checkpoint_db(data_path: &Path) -> anyhow::Result<()> {
let wal_path = match data_path.parent() {
Some(path) => path.join("data-wal"),
None => return Ok(()),
Expand Down

0 comments on commit 40cb9be

Please sign in to comment.