Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fast-sync. #2678

Closed
wants to merge 5 commits into from
Closed

Add fast-sync. #2678

wants to merge 5 commits into from

Conversation

shamil-gadelshin
Copy link
Member

This PR introduces fast-sync to subspace-node complementing the existing sync mechanisms: DSN-sync and Substrate sync.

Fast-sync algorithm aims to improve the speed of the synchronization to the latest block by skipping the majority of the blocks and downloading only the minimum of required blocks and state.

Algorithm highlight:

  • download the two last segments from DSN
  • import the last block from the second last segment without execution bypassing regular checks and initialize the node's archiver
  • download state for the first block of the last segment
  • import and execute all the remaining blocks from the last segment
  • pass the control to the next sync algorithm (DSN-sync or Substrate sync)
  • update Substrate sync parameters (remove block gap and update common blocks for connected peers) - this forces Substrate sync to get only remaining blocks.

Here is the PR for polkadot-sdk: autonomys/polkadot-sdk#15

Code contributor checklist:

@nazar-pc
Copy link
Member

I see you created a new PR with everything in it again. Are you planning to convert this into a series of logically structured PRs extracted out of this?

@shamil-gadelshin
Copy link
Member Author

Modify CLI configuration to support fast-sync.

I'm not sure what you mean by that - maybe only one out of five commits from this PR make sense to move out but even that one (archiver modification) can be understood better in this context, and the other logical part is separated completely and targets a different repository. However, I try to follow your PR and provide descriptions for each commit:

  • "dependency update" - modifies commit hashes for frontier and polkadot-sdk and will be updated multiple times before the final merge,
  • "add fast-sync algorithm" - adds a new file with the "fast-sync" algorithm,
  • "integrate fast-sync algorithm into dsn-sync" - provides dependencies for the new algorithm and adds it within DSN-sync code,
  • "modify CLI configuration to support fast-sync" - adds a new flag to enable fast-sync algorithm and modifies existing flag state-pruning and sync_mode config option to support the new algorithm,
  • "modify archiver" - changes the archiver to support fast-sync: adds an option to reinitialize it when necessary rather than only on start.

@shamil-gadelshin shamil-gadelshin mentioned this pull request Apr 15, 2024
1 task
@dariolina
Copy link
Member

@shamil-gadelshin please make a PR to add a spec for fast sync in this section https://github.com/subspace/protocol-specs/blob/main/docs/consensus/consensus_chain.md#synchronization

@nazar-pc
Copy link
Member

The most important thing, implementation, is present in the PR, which is good. But description could be improved a lot.

Description only contains a very short description of what is done, while I can already figure that out from either the thing you're trying to implement (not linked) or the code itself. What is almost completely lacking is "why?". Last comment helps, but it is still doesn't explain the challenges involved, potential solutions considered, why you chose the implementation path you chose, etc. Basically things that would help reviewers and auditors understand why you implemented it this way. Surely this wasn't the only way.

For example look at the description Ning provided in #2617 for a much smaller change set.

Similarly if you discovered 5 weeks ago that archiver is not supporting new mode of operation, I'd expect a PR back then that addresses this issue (since it will not break anything) with explanation of the issue, potential ways to solve it and why you selected this particular solution. Also it would ideally have refactoring and feature work separated (even though it is trivial) and explanation why you made a bunch of things public that were supposed to remain private (and are not used in this PR anyway).

Also I don't see any upstream PR or issue in polkadot-sdk exposing missing Substrate APIs even though you probably new we'll need them for multiple weeks now. autonomys/polkadot-sdk#15 is similarly scarse on details, it also only mentiones what is done, bt not why or what were the alternatives.

If we landed those simpler PRs and upstream changes first, it would have been possible to make a much smaller and focused PR implementing just the sync itself and not everything at once.

Remember when I was working on PoT. There was a long series of PRs each implementing a logical component with expalanation like #1918 and some upstream Substrate PRs addressing things I faced along the way (see link to polkadot-sdk PR in there).

@shamil-gadelshin
Copy link
Member Author

The main reason why this PR is monolith is because it was unknown what components (changes) were required until the final stage. Even the archiver was changed recently again to restore "inherent data check". Now it can be refactored, changed, and potentially split into several PRs.

Let's start with the archiver PR and see how it goes.

@nazar-pc nazar-pc deleted the fast-sync-v3 branch September 26, 2024 05:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants