block history when restarting process #22
Comments
@nionis By default, Blockstream retains 100 blocks of history. If you stop reconciling for fewer blocks than that and then give it the latest head block, it will fetch parents until it can re-link with the in-memory chain it has, then announce each of the blocks from oldest to newest to any listeners. If you pause for longer than the retention limit, then it will walk backwards fetching parents until it has fetched I don't think I fully understand the problem you have/are trying to solve, but in case it helps, what Augur does is sync manually (using much more efficient bulk syncing mechanisms) up to about block
Sounds like Augur's solution is what we're looking for. The main scenario we're worried about is making sure that blockstream's internal state stays consistent through restarts, so that it can pick up exactly where it left off even if it was in the middle of a fork (because gnarly still needs to know about the fork in order to revert any changes it has made).
Due to other issues with Geth/Parity, Augur also only persists data that is synced via the initial bulk sync. Once EIP-234 is implemented in both Geth and Parity they can go back to persisting data from blockstream. This is a ham-fisted solution, but it makes things slightly better.
just to make sure I understand correctly:
I have been experimenting here
Correct. When a new head block is received, blockstream checks whether the current head it has matches the parent of the new block. If it does, then we have a new head and are done. If it doesn't, then it looks at that block's parent to see if it is the parent of the new block; if not, it walks back again and repeats. It does this until either it finds a parent in its history or it walks off the end of its internal history (100 blocks by default). If it finds a parent, it fires removal notifications for the logs/blocks it has on top of the parent it found, and once it has rolled back far enough it attaches the new head. If it walks off the end of its history, it fetches the new block's parent and repeats the above process. It will do this until either it finds a way to link the two chains, or it has an entire new chain that is I believe that last scenario is not ideal, because it actually failed to reconcile the chains. I just filed an issue to make it throw an exception in that last scenario, #24, rather than claiming to have reconciled. I'm curious about your usage scenario. How are you detecting an "invalid block" before blockstream does? Also, it feels like you would be better off not pausing and just letting blockstream do its job and deal with all of the "problems".
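To make the walk-back described above concrete, here is a minimal sketch of that kind of reconciliation. It is not blockstream's actual code: `ToyReconciler`, `ChainEvent`, the two-field `Block` shape, and the synchronous `fetchBlock` callback are all hypothetical names invented for illustration, and throwing when the chains cannot be linked follows the behaviour proposed in #24 rather than anything confirmed here.

```typescript
// Hypothetical minimal block shape; real Ethereum blocks carry many more fields.
interface Block {
  hash: string;
  parentHash: string;
}

// Hypothetical notification record, standing in for listener callbacks.
type ChainEvent = { kind: "added" | "removed"; hash: string };

class ToyReconciler {
  private history: Block[] = []; // oldest first, newest (head) last
  private readonly fetchBlock: (hash: string) => Block | undefined;
  private readonly retention: number;
  readonly events: ChainEvent[] = [];

  // fetchBlock is synchronous here only to keep the sketch short.
  constructor(fetchBlock: (hash: string) => Block | undefined, retention: number = 100) {
    this.fetchBlock = fetchBlock;
    this.retention = retention;
  }

  reconcile(newHead: Block): void {
    const head = this.history[this.history.length - 1];
    if (head !== undefined && head.hash === newHead.hash) return; // already the head

    // New branch, newest first; grows backwards by fetching parents.
    const pending: Block[] = [newHead];

    // With an empty history there is nothing to link against, so the block is trusted.
    while (this.history.length > 0) {
      const oldest = pending[pending.length - 1]!;
      const linkIndex = this.history.findIndex((b) => b.hash === oldest.parentHash);
      if (linkIndex !== -1) {
        // Found the common ancestor: roll back everything on top of it.
        while (this.history.length - 1 > linkIndex) {
          const removed = this.history.pop()!;
          this.events.push({ kind: "removed", hash: removed.hash });
        }
        break;
      }
      // Walked off the known history: fetch one more parent and try again.
      if (pending.length >= this.retention) {
        throw new Error("could not link the new chain to known history");
      }
      const parent = this.fetchBlock(oldest.parentHash);
      if (parent === undefined) {
        throw new Error("parent block not available: " + oldest.parentHash);
      }
      pending.push(parent);
    }

    // Announce the new branch from oldest to newest.
    for (const block of pending.reverse()) {
      this.history.push(block);
      this.events.push({ kind: "added", hash: block.hash });
    }
    // Keep only the most recent `retention` blocks.
    while (this.history.length > this.retention) this.history.shift();
  }
}
```

Note the empty-history branch: a freshly started reconciler accepts whatever block it sees first, which is exactly the restart problem raised in this issue.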
@MicahZoltu Actually we won't be pausing, sorry for the misunderstanding. Gnarly is an Ethereum indexer, and it uses ethereumjs-blockstream to make sure it keeps track of the correct blocks. Those blocks are taken in by a reducer and we create a state from them. The "pausing" is really the case where Gnarly crashes or restarts at a time when the chain is invalid. What we are thinking of doing is saving the last 10 blocks we receive to a DB. After a crash or restart, we can provide those blocks to ethereumjs-blockstream and then provide it with the latest block. If this is done and the "gap" is below the retention limit, it should all be okay. So I think the solution for our use case is found.
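A rough sketch of that persist-and-replay idea, under stated assumptions: `BlockStorage`, `InMemoryStorage`, `RecentBlockStore`, and `resume` are hypothetical names, the in-memory storage stands in for a real database, and `reconcile` is whatever function feeds a block to the block stream. None of this is part of ethereumjs-blockstream itself.

```typescript
// Hypothetical minimal block shape; real Ethereum blocks carry many more fields.
interface Block {
  hash: string;
  parentHash: string;
}

// Hypothetical persistence interface; a real setup would back this with a DB.
interface BlockStorage {
  read(): string | undefined;
  write(value: string): void;
}

// In-memory stand-in for the database, used only for illustration.
class InMemoryStorage implements BlockStorage {
  private value: string | undefined = undefined;
  read(): string | undefined {
    return this.value;
  }
  write(value: string): void {
    this.value = value;
  }
}

// Keeps the most recent `limit` blocks, oldest first.
class RecentBlockStore {
  private readonly storage: BlockStorage;
  private readonly limit: number;

  constructor(storage: BlockStorage, limit: number = 10) {
    this.storage = storage;
    this.limit = limit;
  }

  load(): Block[] {
    const raw = this.storage.read();
    return raw === undefined ? [] : (JSON.parse(raw) as Block[]);
  }

  // Call from the "block added" listener.
  onBlockAdded(block: Block): void {
    const blocks = this.load();
    blocks.push(block);
    this.storage.write(JSON.stringify(blocks.slice(-this.limit)));
  }

  // Call from the "block removed" listener so forked-out blocks are not replayed.
  onBlockRemoved(block: Block): void {
    const blocks = this.load().filter((b) => b.hash !== block.hash);
    this.storage.write(JSON.stringify(blocks));
  }
}

// On restart: replay the saved blocks oldest to newest, then the current head.
function resume(
  store: RecentBlockStore,
  reconcile: (block: Block) => void,
  latestHead: Block
): void {
  for (const block of store.load()) reconcile(block);
  reconcile(latestHead);
}
```

Replaying the saved blocks first gives the stream a history to link against, so a fork that happened while the process was down can still produce removal notifications, provided the gap stays below the retention limit.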
original issue at gnarly
If we are following the chain but then stop processing during a short-lived fork, the blockstream queue will be empty once we restart the process. It will then blindly trust that the next block it sees is on the main chain, because it can't do any resolution: it doesn't know the preceding blocks it would need to calculate the longest chain.