[Epoch Sync] Initial implementation for Epoch Sync V4. #11934
Conversation
Codecov Report

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master   #11934      +/-   ##
==========================================
+ Coverage   71.49%   71.52%   +0.03%
==========================================
  Files         810      812       +2
  Lines      164037   164452     +415
  Branches   164037   164452     +415
==========================================
+ Hits       117283   117631     +348
+ Misses      41671    41670       -1
- Partials     5083     5151      +68
```

Flags with carried forward coverage won't be shown.
Nice!
More tests could be added as a follow up task.
```rust
}

let peer = highest_height_peers
    .choose(&mut rand::thread_rng())
```
Interesting, it seems we randomly choose a peer. Might be worth a comment here, in case we have smarter ways of selecting peers in the future?
Yep.
```rust
    .choose(&mut rand::thread_rng())
    .ok_or_else(|| Error::Other("No peers to request epoch sync from".to_string()))?;

*status = SyncStatus::EpochSync(EpochSyncStatus {
```
Is it possible that the peer we are requesting epoch sync from is itself too far behind the current head? Or could a node advertise a false or very far highest_block_height and thus put additional pressure on block sync on the current node?
During sync, when booting a node, do we have an approximate idea of what the highest/latest height is?
We cannot know for sure what the real highest height is (as we would need the epoch sync proof in order to be sure... it's a circular dependency issue). The only thing we can do is try a peer and if it didn't give us a valid proof, ban it. That logic doesn't exist yet; we'll need to implement it but I'm not sure how. Let me add it to the issue list.
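The try-and-ban logic described above does not exist yet, but it could look roughly like the following sketch. Everything here is a toy stand-in: `PeerId`, `Proof`, `request_proof`, and `sync_with_retries` are hypothetical names, not nearcore APIs.

```rust
use std::collections::HashSet;

// Toy stand-ins; none of these types exist in nearcore.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct PeerId(u32);

struct Proof {
    valid: bool,
}

fn request_proof(peer: PeerId) -> Proof {
    // Pretend only peer 2 serves a valid epoch sync proof.
    Proof { valid: peer.0 == 2 }
}

/// Try peers one by one; ban any peer whose proof fails validation,
/// since we cannot know the real highest height up front.
fn sync_with_retries(peers: &[PeerId], banned: &mut HashSet<PeerId>) -> Option<PeerId> {
    for &peer in peers {
        if banned.contains(&peer) {
            continue;
        }
        let proof = request_proof(peer);
        if proof.valid {
            return Some(peer); // proof accepted, sync proceeds
        }
        banned.insert(peer); // invalid proof: ban and try the next peer
    }
    None
}

fn main() {
    let peers = [PeerId(0), PeerId(1), PeerId(2)];
    let mut banned = HashSet::new();
    let winner = sync_with_retries(&peers, &mut banned);
    println!("synced from {:?}, banned {} peers", winner, banned.len());
}
```

The point of the sketch is only the control flow: a bad proof costs the node one round trip and permanently excludes that peer, so an attacker advertising a fake height can delay sync but not block it while honest peers remain.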
One possible idea is to select a random peer from the top 50% of peers by highest_block_height, or something similar, to weed out outdated peers. Out of scope for this PR though.
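That top-50% idea can be sketched in isolation. This is a hedged illustration with a toy `PeerInfo` struct (the real nearcore peer type differs), and it substitutes a crude time-based index for `rand::thread_rng()` just to stay dependency-free:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Hypothetical peer record; the real nearcore type is richer.
#[derive(Debug)]
struct PeerInfo {
    highest_block_height: u64,
}

/// Pick a random peer from the top half by advertised height,
/// to weed out peers that are themselves far behind.
fn choose_from_top_half(peers: &[PeerInfo]) -> Option<&PeerInfo> {
    if peers.is_empty() {
        return None;
    }
    let mut by_height: Vec<&PeerInfo> = peers.iter().collect();
    by_height.sort_by(|a, b| b.highest_block_height.cmp(&a.highest_block_height));
    let top_len = (by_height.len() + 1) / 2;
    // Crude entropy source standing in for rand::thread_rng().
    let nanos = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().subsec_nanos();
    Some(by_height[nanos as usize % top_len])
}

fn main() {
    let peers = vec![
        PeerInfo { highest_block_height: 10 },
        PeerInfo { highest_block_height: 100 },
        PeerInfo { highest_block_height: 90 },
        PeerInfo { highest_block_height: 50 },
    ];
    // With 4 peers, only heights 100 and 90 are eligible.
    let chosen = choose_from_top_half(&peers).unwrap();
    println!("chosen height: {}", chosen.highest_block_height);
}
```

Note this still trusts the advertised heights; it only reduces exposure to stale peers, not to peers lying about their height.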
```rust
// Check that the new node will reach a high height as well.
let new_node = node_datas.last().unwrap().client_sender.actor_handle();
test_loop.set_every_event_callback(move |test_loop_data| {
```
If we are already checking that the new node catches up to the old nodes in the run_until below, why do we need this block?
It's for debugging. When things don't work in a huge test, it's very useful to know in the visualizer where the node is, when looking at a specific event.
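The mechanics of such a debugging hook can be modeled in isolation. A minimal self-contained sketch (toy `TestLoopData` and `TestLoop`, not the real nearcore types) of how an every-event callback works:

```rust
// Toy stand-in for the data the real test loop would expose.
struct TestLoopData {
    events_handled: usize,
}

struct TestLoop {
    data: TestLoopData,
    // If present, called every time an event is handled. Debugging only.
    every_event_callback: Option<Box<dyn FnMut(&TestLoopData)>>,
}

impl TestLoop {
    fn new() -> Self {
        TestLoop { data: TestLoopData { events_handled: 0 }, every_event_callback: None }
    }

    fn set_every_event_callback(&mut self, cb: impl FnMut(&TestLoopData) + 'static) {
        self.every_event_callback = Some(Box::new(cb));
    }

    fn handle_event(&mut self) {
        self.data.events_handled += 1;
        // Fire the debug hook, if one was installed.
        if let Some(cb) = &mut self.every_event_callback {
            cb(&self.data);
        }
    }
}

fn main() {
    let mut test_loop = TestLoop::new();
    test_loop.set_every_event_callback(|data| {
        println!("handled {} events so far", data.events_handled);
    });
    for _ in 0..3 {
        test_loop.handle_event();
    }
}
```

The `Option<Box<dyn FnMut(...)>>` shape matches the field added in the diff below: zero cost when unset, and the closure can capture whatever per-node state the test wants to print.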
```rust
});
let new_node = node_datas.last().unwrap().client_sender.actor_handle();
let node0 = node_datas[0].client_sender.actor_handle();
test_loop.run_until(
```
Are we checking that the sync happened through epoch_sync and not something like block sync? Should we be monitoring the sync status somewhere?
Ah yeah... we should do that somehow... How would that work hmm...
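One possible shape for that check, sketched with a toy status enum (the real nearcore `SyncStatus` is richer): have the test record every sync status it observes, e.g. from the every-event hook, and assert afterwards that the epoch sync variant actually appeared.

```rust
// Toy status enum; a stand-in for nearcore's real SyncStatus.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SyncStatus {
    NoSync,
    EpochSync,
    HeaderSync,
    BlockSync,
}

/// True if the recorded trace ever passed through epoch sync.
fn saw_epoch_sync(observed: &[SyncStatus]) -> bool {
    observed.iter().any(|s| *s == SyncStatus::EpochSync)
}

fn main() {
    // Statuses a hypothetical test might collect while the new node syncs.
    let observed = [
        SyncStatus::NoSync,
        SyncStatus::EpochSync,
        SyncStatus::HeaderSync,
        SyncStatus::BlockSync,
    ];
    assert!(saw_epoch_sync(&observed));
    println!("node went through epoch sync, not just block sync");
}
```

This distinguishes "the node caught up" from "the node caught up via epoch sync": a trace containing only `BlockSync` would fail the assertion even if the final height matched.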
```diff
@@ -104,6 +104,9 @@ pub struct TestLoopV2 {
     /// Shutdown flag. When this flag is true, delayed action runners will no
     /// longer post any new events to the event loop.
     shutting_down: Arc<AtomicBool>,
+    /// If present, a function to call to print something every time an event is
+    /// handled. Intended only for debugging.
+    every_event_callback: Option<Box<dyn FnMut(&TestLoopData)>>,
```
Honestly a bit wary about adding something like this... 😅 but if it's required, we can keep it.
Remaining issues:
near/near-one-project-tracking#73
I still need to go through this one more time, but reviews can be done in the meantime.