Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create mappings from a specified start index #3194

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

teor2345
Copy link
Contributor

Partial mapping re-creation feature

This PR allows the user to specify a piece index to start creating object mappings from. If no piece index is supplied, mappings are created from genesis. (The previously documented behaviour.)

This allows clients to restart a node from their last known piece index after a downtime. Since the mappings between genesis and that piece index don't need to be re-created, this increases recovery speed on long chains.

Internally, the piece index is converted to its segment index, and mappings are created starting at that segment. Re-archiving the whole segment can cause duplicate mappings. But if the client didn't receive and persist all the mappings that were queued by the archiver, those duplicates are needed to fill in the gaps.

Mapping re-creation bug fix

This PR also fixes a bug where mappings were skipped up until the last archived segment. --create-object-mappings was supposed to re-create mappings from genesis, but instead it was creating them from the first segment that the node didn't have available locally.

Code contributor checklist:

@teor2345 teor2345 added bug Something isn't working enhancement New feature or request node Node (service library/node app) labels Oct 30, 2024
@teor2345 teor2345 self-assigned this Oct 30, 2024
Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea makes perfect sense, but I'm not a big fan of the implementation. Left some questions and suggestions. I think the biggest thing will be changing target from segment to block number, which will simplify it for everyone.

@@ -381,6 +381,13 @@ where
.rev()
.filter_map(|segment_index| segment_headers_store.get_segment_header(segment_index))
{
// If we're re-creating mappings for existing segments, ignore those segments. This
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does achieve the goal, but what I think would be a better fix is to not call this function at all and remove create_object_mappings from its arguments (if this is a long-term desired behavior.

Updated code in the later commit is also questionable, why writing explicit segment_index >= target_segment_index is the same exact check is done by create_object_mappings.for_segment(segment_index) { below?

#[derive(Copy, Clone, Debug, Default, Eq, PartialEq)]
pub enum CreateObjectMappings {
/// Start creating object mappings from this segment index.
Segment(SegmentIndex),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is helpful, but this would be a better name for the variant:

Suggested change
Segment(SegmentIndex),
StartSegment(SegmentIndex),

At the same time I'm wondering why wouldn't it be a better API to specify a block number rather than a segment here from end user point of view? I'd expect them to track blocks instead of segments (especially since we emit mappings after every block now).

It will also likely reduce invasive changes in this file since there will be no need to track segments anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that’s an interesting question. If you just have a list of mappings, how will you know which blocks they’re from?

I was expecting the client/user to take the last piece index from their mappings, and supply it on the command line. I could change this type to PieceIndex to make that clearer?
(And then do the checks against the segment index for that piece, because that’s what we have in the code.)

I’ll ask Carlos what he’s expecting to do for the implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had an idea about how to do this:

We add the block number to each subscription response, at the top level. So each response would look like:

  • block number
  • list of mappings

The client could then track the latest block number it is up to, but wouldn’t need any earlier block numbers. (Because block numbers are not required to fetch objects, just for recovery after downtime.)

This wouldn’t add much to the size of the response, because there’s only one block number for each block with mappings.

We’d just have to repeat the block number if we split a block’s mappings between batches, but that seems ok.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

Comment on lines +376 to +392
impl FromStr for CreateObjectMappings {
type Err = String;

fn from_str(input: &str) -> Result<Self, Self::Err> {
match input {
"disabled" => Ok(Self::Disabled),
piece_index => piece_index
.parse::<u64>()
.map(|index| Self::from_piece_index(index.into()))
.map_err(|_| {
"Unsupported create object mappings setting: \
use a piece index number, or 'disabled'"
.to_string()
}),
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not have it here. It only really makes sense for CLI purposes (which is why it is added), but doesn't really belong here. Consider having a mirror data structure in CLI and converting it into this one there.

Comment on lines +404 to +418
/// Create object mappings from the start of the chain.
pub fn from_genesis() -> Self {
Self::from_piece_index(PieceIndex::ZERO)
}

/// Create object mappings from the segment containing the given piece index.
/// All mappings in the segment will be created, including those before this piece.
pub fn from_piece_index(piece_index: PieceIndex) -> Self {
Self::Segment(piece_index.segment_index())
}

/// Don't create object mappings.
pub fn disabled() -> Self {
Self::Disabled
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: Usefulness of these constructors is questionable, I'd probably remove them and let user construct the variant they want directly.

Comment on lines 396 to 398
/// --dev mode enables mappings from genesis, unless a piece index has already been specified.
#[arg(long, default_value_t, default_missing_value("0"), num_args(0..=1))]
create_object_mappings: CreateObjectMappings,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you want is to simply make it optional, then you can check for None and replace it with placeholder value when --dev is used or return an error to the user otherwise. See how it is done for --chain and --reward-address in the farmer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request node Node (service library/node app)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants