Improve fetch_historic_logs_stream logic #108

Open
SozinM opened this issue Sep 19, 2024 · 6 comments


SozinM commented Sep 19, 2024

I found a small problem in the from/to intervals used by fetch_historic_logs_stream.
To determine the next block when a range returned any logs, we take the block number of the last log. This causes the following problem: we ignore the fact that we have already processed the blocks at the end of the range (they were simply empty).
Example:
Case 1
Range 0-100 returned only one log with block number 50
The current implementation will query the next batch with block range 50-...
Case 2
Range 0-100 returned no logs
The current implementation will query the next batch with block range 100-...
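The gap between the two cases can be sketched with a tiny helper. This is a hypothetical illustration, not the actual rindexer code; per the code pasted later in the thread, the current resume point is the last log's block plus one, so the wasted span in Case 1 is blocks 51-100 rather than 50-100:

```rust
/// Current behaviour (sketch): resume just past the block of the last
/// returned log, which re-scans the empty tail of the range.
fn next_from_current(last_log_block: Option<u64>, to_block: u64) -> u64 {
    match last_log_block {
        Some(block) => block + 1, // empty blocks after `block` get re-queried
        None => to_block + 1,     // empty range: skip past it entirely
    }
}

/// Proposed behaviour (sketch): always resume just past the range we
/// already scanned, since the query covered every block in it.
fn next_from_proposed(_last_log_block: Option<u64>, to_block: u64) -> u64 {
    to_block + 1
}

fn main() {
    // Case 1: range 0-100 returned one log in block 50.
    assert_eq!(next_from_current(Some(50), 100), 51); // 51-100 re-queried
    assert_eq!(next_from_proposed(Some(50), 100), 101); // nothing re-queried

    // Case 2: range 0-100 returned no logs; both agree.
    assert_eq!(next_from_current(None, 100), 101);
    assert_eq!(next_from_proposed(None, 100), 101);
}
```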

@SozinM SozinM changed the title Improve fetch_historic_logs_stream to_block presition Improve fetch_historic_logs_stream logic Sep 19, 2024
@joshstevens19 (Owner)

So I thought all RPCs never return a cross-block response, i.e. if block 7 has 10 logs they all come back in the same request. Are you saying that's not the case for hypersync?

Just to be clear:

If I fetch 1000 event logs and the last log is at block number 10, is there ever a chance that logs 1001-1500 contain more events for block number 10?

SozinM commented Sep 19, 2024

No, it's a slightly different case.
We could have blocks without relevant logs for the smart contract. In that case, such blocks at the end of the range (given the range returned at least some logs) would be queried again in the next query.
This is true for the RPC implementation too.
The reason, as I said, is that when we have logs in the block range we set next_from_block to the block_number of the last log, instead of to_block + 1 of the executed query.

@joshstevens19 (Owner)

OK, but in this case:

Case 1
Range 0-100 returned only one log with block number 50
The current implementation will query the next batch with block range 50-...
Case 2
Range 0-100 returned no logs
The current implementation will query the next batch with block range 100-...

are you expecting it to be:

Case 1
Range 0-100 returned only one log with block number 50
The implementation should query the next batch with block range 100-...

So you're saying we're doing wasted calls in this example, as we know nothing else is in blocks 0-100 since we just did the request.

joshstevens19 commented Sep 19, 2024

```rust
if logs_empty {
    info!(
        "{} - No events found between blocks {} - {}",
        info_log_name, from_block, to_block
    );
    let next_from_block = to_block + 1;
    return if next_from_block > snapshot_to_block {
        None
    } else {
        let new_to_block = calculate_process_historic_log_to_block(
            &next_from_block,
            &snapshot_to_block,
            &max_block_range_limitation,
        );

        debug!(
            "{} - {} - new_from_block {:?} new_to_block {:?}",
            info_log_name,
            IndexingEventProgressStatus::Syncing.log(),
            next_from_block,
            new_to_block
        );

        Some(ProcessHistoricLogsStreamResult {
            next: current_filter
                .set_from_block(next_from_block)
                .set_to_block(new_to_block),
            max_block_range_limitation,
        })
    };
}

if let Some(last_log) = last_log {
    let next_from_block = last_log
        .block_number
        .expect("block number should always be present in a log") +
        U64::from(1);
    debug!(
        "{} - {} - next_block {:?}",
        info_log_name,
        IndexingEventProgressStatus::Syncing.log(),
        next_from_block
    );
    return if next_from_block > snapshot_to_block {
        None
    } else {
        let new_to_block = calculate_process_historic_log_to_block(
            &next_from_block,
            &snapshot_to_block,
            &max_block_range_limitation,
        );

        debug!(
            "{} - {} - new_from_block {:?} new_to_block {:?}",
            info_log_name,
            IndexingEventProgressStatus::Syncing.log(),
            next_from_block,
            new_to_block
        );

        Some(ProcessHistoricLogsStreamResult {
            next: current_filter
                .set_from_block(next_from_block)
                .set_to_block(new_to_block),
            max_block_range_limitation,
        })
    };
}
```

I wonder why I did this 🤔. In turn this would mean we could remove the if logs_empty { guard and then the whole if let Some(last_log) = last_log { ... } block downwards.
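The simplification being suggested could look roughly like this (a sketch under assumptions: names like next_window and the plain u64 types are hypothetical, and the real function also carries logging, the filter, and the range-limitation state):

```rust
/// Sketch of a unified pagination step: since a query for
/// from_block..=to_block covers every block in the range, the next window
/// can always start at to_block + 1, whether or not any logs came back.
fn next_window(
    to_block: u64,
    snapshot_to_block: u64,
    max_range: u64,
) -> Option<(u64, u64)> {
    let next_from_block = to_block + 1;
    if next_from_block > snapshot_to_block {
        return None; // historic sync finished
    }
    // Clamp the window to both the snapshot tip and the provider's
    // block-range limit (stand-in for calculate_process_historic_log_to_block).
    let new_to_block = snapshot_to_block.min(next_from_block + max_range - 1);
    Some((next_from_block, new_to_block))
}

fn main() {
    // Range 0-100 scanned, snapshot at block 1000, provider limit 100 blocks:
    assert_eq!(next_window(100, 1000, 100), Some((101, 200)));
    // Already at the snapshot tip: nothing left to scan.
    assert_eq!(next_window(1000, 1000, 100), None);
}
```

With this shape, the logs_empty and last_log branches collapse into one path, which is exactly why the duplication above looks removable.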

SozinM commented Sep 19, 2024

> i wonder why i did this 🤔 in turn this would mean we could remove if logs_empty { and then the whole if let Some(last_log) = last_log {... downwards

I had the same idea, but it would require some thorough testing.

@joshstevens19 (Owner)

I'll have a think about why I did this and see if there is any reason we can't do the above. Thanks for raising it.
