Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleanup(state): Run possible minor database format version upgrades before deleting the disk version #8761

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 56 additions & 25 deletions zebra-state/src/service/finalized_state/zebra_db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -107,14 +107,6 @@ impl ZebraDb {
)
.expect("unable to read database format version file");

DiskDb::try_reusing_previous_db_after_major_upgrade(
&RESTORABLE_DB_VERSIONS,
format_version_in_code,
config,
&db_kind,
network,
);

// Log any format changes before opening the database, in case opening fails.
let format_change = DbFormatChange::open_database(format_version_in_code, disk_version);

Expand All @@ -128,25 +120,64 @@ impl ZebraDb {
let debug_skip_format_upgrades = read_only
|| ((cfg!(test) || cfg!(feature = "shielded-scan")) && debug_skip_format_upgrades);

// Open the database and do initial checks.
let mut db = ZebraDb {
config: Arc::new(config.clone()),
debug_skip_format_upgrades,
format_change_handle: None,
// After the database directory is created, a newly created database temporarily
// changes to the default database version. Then we set the correct version in the
// upgrade thread. We need to do the version change in this order, because the version
// file can only be changed while we hold the RocksDB database lock.
db: DiskDb::new(
config,
db_kind,
format_version_in_code,
network,
column_families_in_code,
read_only,
),
let db_kind = db_kind.as_ref();
let column_families_in_code: Vec<_> = column_families_in_code.into_iter().collect();
let new_db = |format_version| {
ZebraDb {
config: Arc::new(config.clone()),
debug_skip_format_upgrades,
format_change_handle: None,
// After the database directory is created, a newly created database temporarily
// changes to the default database version. Then we set the correct version in the
// upgrade thread. We need to do the version change in this order, because the version
// file can only be changed while we hold the RocksDB database lock.
db: DiskDb::new(
config,
db_kind,
&format_version,
network,
column_families_in_code.clone(),
read_only,
),
}
};

match format_change.clone() {
DbFormatChange::Upgrade {
older_disk_version,
newer_running_version,
} if !debug_skip_format_upgrades
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a correctness note about not running if read_only so there's no corruption if multiple Zebra instances try to read and write the same files.

&& newer_running_version.major == (older_disk_version.major + 1)
&& RESTORABLE_DB_VERSIONS.contains(&newer_running_version.major) =>
{
info!("upgrading database format to the next major version");
// use the latest possible minor version to trigger all remaining format upgrades for the older major version
let db = new_db(Version::new(older_disk_version.major, u64::MAX, u64::MAX));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's document why we're using u64::MAX.

arya2 marked this conversation as resolved.
Show resolved Hide resolved
let finalized_tip = db.finalized_tip_height();

// It's fine to mark the database format as upgraded later if the database is empty
if finalized_tip.is_some() {
warn!("applying minor database format upgrades ahead of major format version bump, this may take a few minutes");
let (_cancel_tx, cancel_rx) = std::sync::mpsc::channel();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not using the sender will make Zebra irresponsive to shutdowns when the upgrade is running.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

std::thread::spawn(move || {
    while !is_shutting_down() {
        if cancel_tx.is_closed() {
            return;
        }

        std::thread::sleep(Duration::from_millis(1));
    }
    cancel_tx.send(CancelFormatChange);
});

might work here, but there's no is_closed() method on this channel, so we'd need to replace it with a watch channel first.

format_change
.run_format_change_or_check(&db, finalized_tip, &cancel_rx)
.expect("should not return an error without a cancellation message");
}

DiskDb::try_reusing_previous_db_after_major_upgrade(
&RESTORABLE_DB_VERSIONS,
format_version_in_code,
config,
db_kind,
network,
);
}
_ => {}
}

// Open the database and do initial checks.
let mut db = new_db(format_version_in_code.clone());

db.spawn_format_change(format_change);

db
Expand Down
Loading