[Feature] Supporting historical sync source alternative to RPC #888

moose-code · 2024-05-16T12:15:00Z

Hi there!

I've been doing some experimenting with using an alternative data source (as opposed to the RPC) for the default friend-tech ponder indexer created in the pnpm create ponder flow.

It's an interesting case, as there are more than 6 million events to be indexed and, from what I can tell, it will take more than 15 hours to sync using a paid RPC service. My goal was to try to populate the ponder_sync.db >100x faster than with an RPC, then start the ponder service and let it index from this inserted data.

So far I've found some pretty interesting results:
1 - I am able to fetch all the required log, block and transaction data for the ponder_sync.db in this case (6m+ events, blocks and txs) in around 12 minutes from scratch.
2 - Writing to the sqlite db is currently the bottleneck, and seems to take on the order of 24 minutes for the fetched data to be written and persisted to the db.
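
As a rough sanity check on these figures, the sketch below works out the implied speedups. It takes 15 hours as a lower bound for the RPC sync, and it assumes the fetch and write phases run one after the other, which may not match how the experiment actually overlaps them.

```python
# Figures quoted in this post; the RPC time is "more than 15 hours", so a lower bound.
rpc_minutes = 15 * 60
fetch_minutes = 12   # fetching logs, blocks and transactions
write_minutes = 24   # writing and persisting to sqlite

# Speedup of the fetch phase alone versus the RPC sync.
fetch_speedup = rpc_minutes / fetch_minutes

# Speedup end to end, assuming fetch and write are sequential (an assumption).
end_to_end_speedup = rpc_minutes / (fetch_minutes + write_minutes)

print(f"fetch only: {fetch_speedup:.0f}x, fetch + write: {end_to_end_speedup:.0f}x")
# prints "fetch only: 75x, fetch + write: 25x"
```

Under these assumptions the sqlite write phase accounts for two thirds of the total time, so it is the part that needs to shrink to approach the >100x goal.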

The experimental approach essentially opens a connection to the ponder_sync.db and continually batch inserts required data into the blocks, logs, logFilterIntervals and transactions tables. It runs completely independently of the ponder core code and no ponder core code is modified. Following this alternative historical sync, running the main ponder process successfully starts indexing from this data.
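
To illustrate the batch-insert idea, here is a minimal sketch using Python's built-in sqlite3 module. The `logs` table, its columns, the batch size and the helper names are all hypothetical and heavily simplified; they are not Ponder's actual sync schema, and the example repo may do this differently.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def insert_logs(conn, rows, batch_size=10_000):
    """Insert rows in batches, one transaction per batch.

    Returns the number of rows submitted (duplicates are silently ignored).
    """
    submitted = 0
    for chunk in batched(rows, batch_size):
        with conn:  # commits the batch on success, rolls back on error
            conn.executemany(
                "INSERT OR IGNORE INTO logs (id, block_number, data) VALUES (?, ?, ?)",
                chunk,
            )
        submitted += len(chunk)
    return submitted


# Hypothetical, simplified table standing in for the real sync tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id TEXT PRIMARY KEY, block_number INTEGER, data TEXT)")

# Stand-in for rows streamed from the alternative data source.
rows = ((f"log-{i}", i // 3, "0x") for i in range(25_000))
total = insert_logs(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
```

Grouping many rows into one transaction avoids paying a commit per row, which is usually the dominant cost of bulk loading into sqlite. `INSERT OR IGNORE` is used here so that re-running the load over an overlapping range does not fail on duplicate ids.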

The alternative to the RPC enabling this speedup is hypersync (disclaimer: I am part of the team). It's a fast, flexible alternative to the RPC, catered to data-heavy use cases such as indexing.

A deeper integration of hypersync as an additional alternative to RPC in the historical sync service in ponder core might allow Ponder users to achieve much quicker (>100x) historical sync times. This opt-in alternative could be useful for Ponder users who value or require this performance for their specific use case. I completely understand this increases code complexity and maintenance. I was wondering if it might be worth it given the advantages it unlocks, and I'm curious to hear your thoughts and considerations.

Here is an example repo for you to try it out too: https://github.com/enviodev/friendtech-ponder-hypersync/tree/main
(keep in mind it's currently bottlenecked by being a single-threaded process frequently blocked by sqlite inserts - this would change)


derekbar90 · 2024-06-12T14:52:06Z

This is super interesting!

github-project-automation bot added this to Ponder Roadmap May 16, 2024

github-project-automation bot moved this to Todo in Ponder Roadmap May 16, 2024

moose-code changed the title ~~[Idea] Supporting historical sync source alternative to RPC~~ [Feature] Supporting historical sync source alternative to RPC May 16, 2024

kyscott18 closed this as completed Oct 25, 2024

github-project-automation bot moved this from Todo to Done in Ponder Roadmap Oct 25, 2024
