fix: block sync hangs on initial node start #1314

Merged
merged 20 commits into from
Oct 7, 2024

Conversation

@rodrigo-o (Collaborator) commented on Sep 30, 2024

Motivation

This PR tries to fix the hang that often happens on Sepolia, especially during the first sync.

Description

The issue was related to peer scoring/penalizing. Until now we had a naive implementation where a penalization immediately removed the peer. I made some small changes to improve this, especially for Sepolia, where the issue happens the most. Ultimately, what made the node hang was running out of peers and ending up in a sleep cycle that never released LibP2P to handle new incoming peers. The points tackled are:

  • Wait a couple more seconds before the sync starts
  • Implement a really simple penalizing algorithm that just decreases the score of a peer every time it's penalized
  • Implement a naive scoring algorithm that selects peers from a small pool of well-scored peers
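The last two points can be sketched roughly as follows. This is only an illustration, not the code merged in this PR: the module name, the constants (`@initial_score`, `@penalty`, `@pool_size`), the plain-map peerbook, and the rule that a peer is dropped once its score runs out are all assumptions.

```elixir
defmodule PeerScoreSketch do
  # All names and constants here are hypothetical, chosen for illustration.
  @initial_score 100
  @penalty 20
  @pool_size 4

  # The peerbook is modelled as a plain map of peer_id => score.
  def add_peer(peerbook, peer_id), do: Map.put_new(peerbook, peer_id, @initial_score)

  # Penalizing lowers the score instead of removing the peer right away.
  # Assumption: the peer is dropped only when its score would reach zero.
  def penalize(peerbook, peer_id) do
    case Map.fetch(peerbook, peer_id) do
      {:ok, score} when score > @penalty -> Map.put(peerbook, peer_id, score - @penalty)
      {:ok, _score} -> Map.delete(peerbook, peer_id)
      :error -> peerbook
    end
  end

  # The pool is the few best-scored peers, highest score first.
  def best_pool(peerbook) do
    peerbook
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
    |> Enum.take(@pool_size)
    |> Enum.map(fn {id, _score} -> id end)
  end

  # Pick a random peer from the pool; an empty peerbook is reported explicitly.
  def pick_peer(peerbook) do
    case best_pool(peerbook) do
      [] -> {:error, :no_peers}
      pool -> {:ok, Enum.random(pool)}
    end
  end
end
```

Picking at random from a small pool, rather than always taking the single best peer, spreads requests over several good peers while still avoiding the ones that were penalized.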

This PR is enough for most situations: it will be rare to end up with an empty peerbook, and if it happens we'll see an error quickly, failing instead of leaving us waiting. There are some follow-ups, but they are lower in priority than, for example, #1309.

Resolves #1308
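The "fail quickly instead of waiting" behaviour for an empty peerbook could look something like the sketch below. It is a hypothetical illustration, not the merged code: the module name, the retry budget, the delay, and the `fetch_peer` callback are all assumptions.

```elixir
defmodule SyncRetrySketch do
  # Hypothetical retry budget; the real values are not stated in this PR.
  @max_attempts 3
  @retry_delay_ms 10

  # `fetch_peer` is any zero-arity function returning {:ok, peer} or {:error, reason}.
  def with_peer(fetch_peer, attempts \\ @max_attempts)

  # Budget exhausted: surface an error instead of sleeping forever.
  def with_peer(_fetch_peer, 0), do: {:error, :no_peers_available}

  def with_peer(fetch_peer, attempts) do
    case fetch_peer.() do
      {:ok, peer} ->
        {:ok, peer}

      {:error, _reason} ->
        # A short, bounded sleep before trying again with one attempt fewer.
        Process.sleep(@retry_delay_ms)
        with_peer(fetch_peer, attempts - 1)
    end
  end
end
```

The point of the bounded attempt count is that the caller always gets control back, so the process is never stuck in a sleep loop while new peers are trying to connect.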

@rodrigo-o changed the title from "feat: pruning and sync issues on initial node start" to "feat: block sync hangs on initial node start" on Oct 2, 2024
@@ -81,14 +111,10 @@ defmodule LambdaEthereumConsensus.P2P.Peerbook do
defp prune() do
Collaborator Author

We could do something similar to the peer selection when pruning and just prune the lowest-scored peers, but that is probably something to tackle in a PR that focuses on the peer-scoring algorithm.
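As a rough idea of what score-based pruning might look like (a hypothetical helper, assuming the peerbook is a map of peer id to score; this is not what `prune/0` does in this PR):

```elixir
defmodule PruneSketch do
  # Drops the `count` lowest-scored peers from a peer_id => score map.
  # Hypothetical helper for illustration only.
  def prune_lowest(peerbook, count) do
    to_drop =
      peerbook
      |> Enum.sort_by(fn {_id, score} -> score end)
      |> Enum.take(count)
      |> Enum.map(fn {id, _score} -> id end)

    Map.drop(peerbook, to_drop)
  end
end
```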

@rodrigo-o marked this pull request as ready for review on October 2, 2024 23:34
@rodrigo-o requested a review from a team as a code owner on October 2, 2024 23:34
@rodrigo-o changed the title from "feat: block sync hangs on initial node start" to "fix: block sync hangs on initial node start" on Oct 3, 2024
@Arkenan (Collaborator) left a comment

LGTM.

@rodrigo-o merged commit 378b2fd into main on Oct 7, 2024
13 checks passed
@rodrigo-o deleted the sync-issues-on-initial-node-run branch on October 7, 2024 20:19
Development

Successfully merging this pull request may close these issues.

The block sync sometimes hangs and need to restart the node (Sepolia long run)
2 participants