Bela Verification Failed - account hash missing storage trie node #7620

Closed
siladu opened this issue Sep 16, 2024 · 6 comments
Labels: bug (Something isn't working), P3 Medium (ex: JSON-RPC request not working with a specific client library due to loose spec assumption)

Comments

siladu commented Sep 16, 2024

Nodes that synced successfully and otherwise appear bug-free can occasionally still be missing trie node data, as discovered by running the Bela BonsaiTreeVerifier.

For example:

account hash 0x615cd098e18cd36c5a62016fb4e096c02cf71532abcb04f2e7a5661972a86932 missing storage trie node for hash 0x9987f9dc77f7516f7710cbf0514f6941d3905bc9c4e14c864586fea3f42856a6 and location 0x

Frequency: three occurrences in recent burn-ins.

Each time, the missing node has been the storage trie root (location 0x). Some child node data is present for this storage trie.

More details: https://github.com/Consensys/protocol-misc/issues/972#issuecomment-2339351262
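
To tally reports like the one above across burn-in logs, a small parser can help. This is a minimal sketch, assuming the verifier message always has the shape of the single example in this issue; `MissingNode` and `find_missing_nodes` are hypothetical names, not part of Bela or Besu.

```python
import re
from typing import List, NamedTuple

# Pattern modelled on the one example message in this issue (an assumption
# about the general format). \s+ tolerates spaces or line breaks between parts.
MISSING_NODE = re.compile(
    r"account hash (0x[0-9a-fA-F]+)\s+"
    r"missing storage trie node for hash\s+(0x[0-9a-fA-F]+)\s+"
    r"and location (0x[0-9a-fA-F]*)"
)


class MissingNode(NamedTuple):
    account_hash: str
    node_hash: str
    location: str

    @property
    def is_storage_root(self) -> bool:
        # In the reports here, the root of the storage trie has location "0x".
        return self.location == "0x"


def find_missing_nodes(log_text: str) -> List[MissingNode]:
    """Return every missing-storage-node report found in the log text."""
    return [MissingNode(*m.groups()) for m in MISSING_NODE.finditer(log_text)]
```

Calling `find_missing_nodes` on a verifier log and filtering on `is_storage_root` would show whether new occurrences still affect only the storage root.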

siladu self-assigned this Sep 16, 2024
siladu added the bug and P3 Medium labels Sep 16, 2024

siladu commented Sep 16, 2024

@matkt How confident are you that auto-heal would fix this if the trie node were to be accessed?

Is there a regression test for the auto-heal generally?

Do you think this should be higher priority than P3?

matkt commented Sep 17, 2024

I have a fix we can try: #7624
I think I found the root cause.

Regarding auto-heal, a simple test is to use Bela to remove the root node of a heavily used contract and wait for a transaction that touches it. We don't have a good test for this part. It should be removed in the future because it switches the flat DB from FULL to PARTIAL.
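
The manual test described above can be pictured with a toy model. This is only a sketch under stated assumptions, not Besu's implementation: `ToyNode`, `FlatDbMode`, and the idea that healing refetches the node from peers are all hypothetical stand-ins, and the only behaviour taken from this thread is that auto-heal switches the flat DB from FULL to PARTIAL.

```python
from enum import Enum
from typing import Dict, Tuple

NodeKey = Tuple[str, str]  # (account hash, trie location), both hex strings


class FlatDbMode(Enum):
    FULL = "FULL"
    PARTIAL = "PARTIAL"


class ToyNode:
    """Toy stand-in for a node's trie storage; not Besu's actual classes."""

    def __init__(self, trie_nodes: Dict[NodeKey, bytes],
                 peer_nodes: Dict[NodeKey, bytes]) -> None:
        self.trie_nodes = dict(trie_nodes)  # local copy of the trie node store
        self.peer_nodes = peer_nodes        # what peers could serve during a heal
        self.flat_db_mode = FlatDbMode.FULL
        self.heals = 0

    def read_trie_node(self, key: NodeKey) -> bytes:
        # A read of a missing node is what triggers the heal in this model.
        if key not in self.trie_nodes:
            self._heal(key)
        return self.trie_nodes[key]

    def _heal(self, key: NodeKey) -> None:
        # Assumption: healing restores the node from peers. Per the comment
        # above, it also leaves the flat DB in PARTIAL mode.
        self.trie_nodes[key] = self.peer_nodes[key]
        self.flat_db_mode = FlatDbMode.PARTIAL
        self.heals += 1
```

In this model, the test is to delete the `(account_hash, "0x")` entry from `trie_nodes`, call `read_trie_node` for that key to play the role of the transaction touching the contract, and then check that the node is back and that `flat_db_mode` is now `PARTIAL`.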

siladu commented Sep 19, 2024

Ran fix #7624 on 20 nodes and found no errors in either the Besu or the Bonsai tree verifier logs. Good news, but I think we need more testing to be fully confident.

There was one RocksDB busy error during trie heal, which recovered.

siladu commented Sep 22, 2024

Using https://github.com/hyperledger/besu/compare/main...matkt:feature/fix-healing-busy-issue?expand=1, synced another 20 nodes with no bugs ✅ x20

There were two more recoverable RocksDB warnings during trie heal, though.

siladu commented Sep 26, 2024

I have tested fix #7624 on a total of 80 nodes (using mainnet checkpoint sync) and have not found the issue 🎉

2/80 did suffer from #7619 but it's unrelated to this fix.

siladu commented Oct 2, 2024

Closed by #7624

siladu closed this as completed Oct 2, 2024