[Feature] Supporting historical sync source alternative to RPC #888

Closed
moose-code opened this issue May 16, 2024 · 1 comment

moose-code commented May 16, 2024

Hi there!

I've been experimenting with an alternative data source (as opposed to the RPC) for the default friend-tech ponder indexer created in the pnpm create ponder flow.

It's an interesting case, as there are more than 6 million events to be indexed and, from what I can tell, it will take more than 15 hours to sync using a paid RPC service. My goal was to populate the ponder_sync.db >100x faster than with an RPC, and then start the ponder service and let it index from the inserted data.

So far I've found some pretty interesting results:
1 - I am able to fetch all the required log, block and transaction data for the ponder_sync.db in this case (6m+ events, blocks and txs) in around 12 minutes from scratch.
2 - Writing to the sqlite db is currently the bottleneck, and takes on the order of 24 minutes for the fetched data to be written and persisted to the db.
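
As a rough sanity check on these numbers, the speedup relative to the RPC sync can be estimated as below. This is only a back-of-the-envelope sketch: it assumes the fetch and write phases run one after the other (they may overlap in practice), and since the RPC sync is described as taking more than 15 hours, both figures are lower bounds.

```python
# Timings quoted in this issue, in minutes.
RPC_SYNC_MIN = 15 * 60   # "more than 15 hours", so this is a lower bound
FETCH_MIN = 12           # fetching logs, blocks and transactions
WRITE_MIN = 24           # writing and persisting to sqlite


def speedup(baseline_min: float, new_min: float) -> float:
    """Return how many times faster new_min is than baseline_min."""
    return baseline_min / new_min


# Assumption: fetch and write are sequential, so their times add up.
fetch_only = speedup(RPC_SYNC_MIN, FETCH_MIN)
end_to_end = speedup(RPC_SYNC_MIN, FETCH_MIN + WRITE_MIN)

print(f"fetch only:    at least {fetch_only:.0f}x")
print(f"fetch + write: at least {end_to_end:.0f}x")
```

Under these assumptions, the write phase accounts for about two thirds of the total time, which is why it is the part worth optimising first.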

The experimental approach essentially opens a connection to the ponder_sync.db and continually batch inserts required data into the blocks, logs, logFilterIntervals and transactions tables. It runs completely independently of the ponder core code and no ponder core code is modified. Following this alternative historical sync, running the main ponder process successfully starts indexing from this data.
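
To make the batch-insert idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the code from the example repo, and the table schema is a simplified, hypothetical stand-in (the real ponder_sync.db tables have their own columns); `batched`, `insert_logs` and `fake_logs` are illustrative names. It uses an in-memory database so it can be run without touching a real sync db.

```python
import sqlite3
from itertools import islice

# Hypothetical, simplified schema: NOT the real ponder_sync.db layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS logs (
    id TEXT PRIMARY KEY,
    block_number INTEGER NOT NULL,
    log_index INTEGER NOT NULL,
    address TEXT NOT NULL,
    data TEXT NOT NULL
)
"""


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_logs(conn, rows, batch_size=10_000):
    """Insert rows in batches, one transaction per batch. Returns rows processed."""
    processed = 0
    for chunk in batched(rows, batch_size):
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT OR IGNORE INTO logs VALUES (?, ?, ?, ?, ?)", chunk
            )
        processed += len(chunk)
    return processed


def fake_logs(n):
    """Stand-in for fetched log data (made-up values, for illustration only)."""
    for i in range(n):
        yield (f"log-{i}", i // 10, i % 10, "0xabc", "0x")


conn = sqlite3.connect(":memory:")  # a real run would open the sync db file
conn.execute(SCHEMA)
inserted = insert_logs(conn, fake_logs(25_000))
count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(inserted, count)
```

Grouping many rows into one transaction avoids paying the commit cost per row, which is typically the main lever for sqlite write throughput.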

The alternative to the RPC enabling this speed-up is hypersync (disclaimer: I am part of the team). It's a fast, flexible alternative to the RPC, catered for data-heavy use cases such as indexing.

A deeper integration of hypersync as an additional alternative to RPC in the historical sync service in ponder core might allow Ponder users to achieve much quicker (>100x) historical sync times. This opt-in alternative could be useful for Ponder users who value or require this performance for their specific use case. I completely understand that this increases code complexity and maintenance. I'm wondering whether it might be worth it given the advantages it unlocks, and I'm curious to hear your thoughts and considerations.

Here is an example repo for you to try it out too: https://github.com/enviodev/friendtech-ponder-hypersync/tree/main
(keep in mind it's currently bottlenecked by being a single-threaded process that is frequently blocked by sqlite inserts - this would change)
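
One possible way to stop inserts from blocking the fetcher is to hand batches to a dedicated writer through a bounded queue. The sketch below illustrates that pattern only; it is an assumption about how this could be restructured, not the design of the example repo. `fetch_batches` is a hypothetical stand-in that generates made-up rows instead of calling hypersync, and the `blocks` table is a simplified, hypothetical schema.

```python
import queue
import sqlite3
import threading

SENTINEL = None  # signals that no more batches are coming


def fetch_batches(out_q, n_batches, batch_size):
    """Stand-in for the network fetcher (hypothetical; makes no real requests)."""
    for b in range(n_batches):
        start = b * batch_size
        rows = [(start + i, f"0x{start + i:x}") for i in range(batch_size)]
        out_q.put(rows)  # blocks when the queue is full (backpressure)
    out_q.put(SENTINEL)


def write_batches(in_q, conn):
    """Drain the queue, writing each batch in its own transaction."""
    written = 0
    while True:
        rows = in_q.get()
        if rows is SENTINEL:
            break
        with conn:
            conn.executemany("INSERT OR REPLACE INTO blocks VALUES (?, ?)", rows)
        written += len(rows)
    return written


# Bounded queue so fetching cannot run arbitrarily far ahead of writing.
q = queue.Queue(maxsize=4)

conn = sqlite3.connect(":memory:")  # a real run would open the sync db file
# Hypothetical, simplified table: NOT the real ponder_sync.db schema.
conn.execute("CREATE TABLE blocks (number INTEGER PRIMARY KEY, hash TEXT)")

fetcher = threading.Thread(target=fetch_batches, args=(q, 5, 1000))
fetcher.start()
written = write_batches(q, conn)  # writer runs on the main thread
fetcher.join()

count = conn.execute("SELECT COUNT(*) FROM blocks").fetchone()[0]
print(written, count)
```

With this layout the fetcher keeps working while a batch is being written, so the total time tends towards the slower of the two phases rather than their sum.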

@moose-code moose-code changed the title [Idea] Supporting historical sync source alternative to RPC [Feature] Supporting historical sync source alternative to RPC May 16, 2024
@derekbar90

This is super interesting!

@github-project-automation github-project-automation bot moved this from Todo to Done in Ponder Roadmap Oct 25, 2024

3 participants