[DataAvailability] Fork Aware Execution Data processing #6900

peterargue · 2025-01-15T23:50:52Z

Problem Description

Execution Data is synced and indexed by edge nodes to support serving the Access API from local data. Currently, syncing starts when the block is sealed, which allows us to ignore execution forks in the implementation, since sealed state never changes by definition.

To reduce the latency from when the block is executed to when data is available via the API, we need to sync the data soon after the block is executed. Since the data has not been sealed at this point, it's possible for there to be execution forks (when Execution Nodes disagree on the result of executing a block). This means that the syncing and indexing logic will need to be made fork aware.

Potential Solution

To do this, we'll need to address the following:

Execution Sync
Currently, syncing waits for the block to be sealed. This could be updated to wait for N execution receipts with a matching ExecutionResult. The challenge will be to ensure that orphaned execution data blobs are removed from the database in a timely manner. Ideally, we would avoid writing to the db until after the block is sealed, but this may be challenging since writes are handled by the external BitService module. It may be possible to accomplish this using a wrapped version of the db.
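
The "wait for N execution receipts with a matching ExecutionResult" rule could be sketched roughly as below. This is only an illustration: `Receipt`, `ReceiptTracker`, and their fields are hypothetical simplified types, not the actual flow-go types, and the sketch assumes each EN counts once per result.

```go
package main

import "fmt"

// Receipt is a hypothetical, simplified stand-in for an execution receipt.
type Receipt struct {
	ExecutorID string // EN that produced the receipt
	BlockID    string // block that was executed
	ResultID   string // ID of the ExecutionResult the receipt commits to
}

// ReceiptTracker groups receipts by block and result so that competing
// results (execution forks) for the same block are counted separately.
type ReceiptTracker struct {
	required int
	// blockID -> resultID -> set of executor IDs
	seen map[string]map[string]map[string]struct{}
}

func NewReceiptTracker(required int) *ReceiptTracker {
	return &ReceiptTracker{
		required: required,
		seen:     make(map[string]map[string]map[string]struct{}),
	}
}

// Add records a receipt and reports whether its result is now backed by
// at least `required` distinct executors. Duplicate receipts from the
// same executor are counted once.
func (t *ReceiptTracker) Add(r Receipt) bool {
	results, ok := t.seen[r.BlockID]
	if !ok {
		results = make(map[string]map[string]struct{})
		t.seen[r.BlockID] = results
	}
	executors, ok := results[r.ResultID]
	if !ok {
		executors = make(map[string]struct{})
		results[r.ResultID] = executors
	}
	executors[r.ExecutorID] = struct{}{}
	return len(executors) >= t.required
}

// Orphaned returns every result ID seen for the block other than the
// sealed one. These are the candidates for blob cleanup.
func (t *ReceiptTracker) Orphaned(blockID, sealedResultID string) []string {
	var orphaned []string
	for resultID := range t.seen[blockID] {
		if resultID != sealedResultID {
			orphaned = append(orphaned, resultID)
		}
	}
	return orphaned
}

func main() {
	t := NewReceiptTracker(2)
	fmt.Println(t.Add(Receipt{ExecutorID: "en1", BlockID: "b1", ResultID: "rA"})) // false
	fmt.Println(t.Add(Receipt{ExecutorID: "en2", BlockID: "b1", ResultID: "rB"})) // false
	fmt.Println(t.Add(Receipt{ExecutorID: "en3", BlockID: "b1", ResultID: "rA"})) // true
	fmt.Println(t.Orphaned("b1", "rA"))                                           // [rB]
}
```

Keying the counts by result rather than by block means a fork never contributes to the threshold of a competing result, and the same structure gives the list of results to clean up once the seal arrives.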
Indexing
Once the data is available locally, it is indexed. This means we don't need to do much work to index early, but we will need some way to roll back the indexed data if the indexed result is never sealed. Since execution forks are rare, we may want to index all forks, then prune the orphaned fork when sealing happens. This would cause the least amount of disruption.

For indexing, we will most likely want to add a caching layer so that data is only persisted to the db once the block is sealed. This will make recovering from a crash much easier, since only correct data is ever stored.
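
One possible shape for such a caching layer is sketched below. All names (`Store`, `MemStore`, `ForkCache`) are hypothetical and the indexed data is reduced to a string map; the sketch assumes every fork is indexed in memory and only the sealed result is ever written through.

```go
package main

import "fmt"

// Store is a hypothetical persistence interface for sealed index data.
type Store interface {
	Persist(blockID string, data map[string]string) error
}

// MemStore is a toy Store used only to make the sketch runnable.
type MemStore struct {
	Saved map[string]map[string]string
}

func (s *MemStore) Persist(blockID string, data map[string]string) error {
	s.Saved[blockID] = data
	return nil
}

// ForkCache keeps indexed data for unsealed results in memory,
// keyed by block and then by result, so all forks can coexist.
type ForkCache struct {
	store   Store
	pending map[string]map[string]map[string]string // blockID -> resultID -> data
}

func NewForkCache(store Store) *ForkCache {
	return &ForkCache{store: store, pending: make(map[string]map[string]map[string]string)}
}

// Index stores the indexed data for one (block, result) pair in memory only.
func (c *ForkCache) Index(blockID, resultID string, data map[string]string) {
	if c.pending[blockID] == nil {
		c.pending[blockID] = make(map[string]map[string]string)
	}
	c.pending[blockID][resultID] = data
}

// Get reads unsealed indexed data for a specific result.
func (c *ForkCache) Get(blockID, resultID string) (map[string]string, bool) {
	data, ok := c.pending[blockID][resultID]
	return data, ok
}

// Seal persists the sealed result and drops every cached fork for the
// block. It returns the number of orphaned forks that were pruned.
func (c *ForkCache) Seal(blockID, sealedResultID string) (int, error) {
	forks, ok := c.pending[blockID]
	if !ok {
		return 0, fmt.Errorf("no pending results for block %s", blockID)
	}
	data, ok := forks[sealedResultID]
	if !ok {
		return 0, fmt.Errorf("sealed result %s for block %s was not indexed", sealedResultID, blockID)
	}
	if err := c.store.Persist(blockID, data); err != nil {
		return 0, err
	}
	delete(c.pending, blockID)
	return len(forks) - 1, nil
}

func main() {
	store := &MemStore{Saved: make(map[string]map[string]string)}
	cache := NewForkCache(store)
	cache.Index("b1", "rA", map[string]string{"tx1": "ok"})
	cache.Index("b1", "rB", map[string]string{"tx1": "failed"})
	pruned, err := cache.Seal("b1", "rA")
	fmt.Println(pruned, err, store.Saved["b1"]["tx1"])
}
```

Because nothing unsealed reaches the store, a crash simply discards the cache and the node can re-index unsealed results after restart.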
Config and API changes
Since we are now syncing and returning results based on unsealed data, we will need some operator-level config to control how many receipts are required and which ENs to trust. The defaults are likely 2 receipts from any ENs, but operators should get to choose.

Additionally, we should update the metadata returned for some endpoints to include a list of which ENs agreed on the result returned. Eventually, we may also want to allow clients to specify which/how many ENs must agree to return the result.
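
As a rough sketch of what the operator-level config might look like, assuming hypothetical field names and the "2/any" default mentioned above (an empty trusted list is taken to mean "any EN"):

```go
package main

import (
	"errors"
	"fmt"
)

// ResultTrustConfig is a hypothetical operator-level config controlling
// when an unsealed execution result is considered usable.
type ResultTrustConfig struct {
	RequiredReceipts int      // how many agreeing ENs are needed
	TrustedExecutors []string // ENs whose receipts count; empty means any EN
}

// DefaultResultTrustConfig returns the "2/any" default suggested above.
func DefaultResultTrustConfig() ResultTrustConfig {
	return ResultTrustConfig{RequiredReceipts: 2}
}

// Validate rejects configs that could never be satisfied.
func (c ResultTrustConfig) Validate() error {
	if c.RequiredReceipts < 1 {
		return errors.New("RequiredReceipts must be at least 1")
	}
	if len(c.TrustedExecutors) > 0 && c.RequiredReceipts > len(c.TrustedExecutors) {
		return errors.New("RequiredReceipts exceeds the number of trusted executors")
	}
	return nil
}

// Satisfied reports whether the given set of agreeing ENs meets the config.
// Only distinct, trusted executors are counted.
func (c ResultTrustConfig) Satisfied(agreeing []string) bool {
	trusted := make(map[string]struct{}, len(c.TrustedExecutors))
	for _, id := range c.TrustedExecutors {
		trusted[id] = struct{}{}
	}
	seen := make(map[string]struct{}, len(agreeing))
	count := 0
	for _, id := range agreeing {
		if _, dup := seen[id]; dup {
			continue
		}
		seen[id] = struct{}{}
		if _, ok := trusted[id]; ok || len(trusted) == 0 {
			count++
		}
	}
	return count >= c.RequiredReceipts
}

func main() {
	def := DefaultResultTrustConfig()
	fmt.Println(def.Satisfied([]string{"en1", "en2"})) // true

	strict := ResultTrustConfig{RequiredReceipts: 2, TrustedExecutors: []string{"en1", "en3"}}
	fmt.Println(strict.Validate(), strict.Satisfied([]string{"en1", "en2"})) // <nil> false
}
```

The same list of agreeing ENs passed to `Satisfied` is what the API metadata could return to clients.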
Streaming
Since the edge node may be streaming data to clients using unsealed data, we will need to design a mechanism to communicate to clients that there was an execution fork, and how to recover from it (e.g., roll back to height N and restart the stream).
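
To make the recovery idea concrete, here is a minimal client-side sketch. The message kinds, field names, and the interpretation of "rollback to height N" (N is the last height that is still valid, so the client discards everything above N and resumes at N+1) are all assumptions for illustration, not a proposed wire format.

```go
package main

import "fmt"

// MessageKind distinguishes normal data from fork notifications.
type MessageKind int

const (
	KindData         MessageKind = iota // regular block data
	KindForkRollback                    // an execution fork invalidated data above Height
)

// StreamMessage is a hypothetical message sent on the stream.
type StreamMessage struct {
	Kind    MessageKind
	Height  uint64 // block height (data) or last valid height (rollback)
	Payload string
}

// Client is a toy stream consumer that tracks what it has received.
type Client struct {
	Blocks     map[uint64]string
	NextHeight uint64 // height to request when (re)starting the stream
}

func NewClient(start uint64) *Client {
	return &Client{Blocks: make(map[uint64]string), NextHeight: start}
}

// Handle applies one message. On a rollback it discards all data above
// the given height and resets the resume point.
func (c *Client) Handle(m StreamMessage) {
	switch m.Kind {
	case KindData:
		c.Blocks[m.Height] = m.Payload
		c.NextHeight = m.Height + 1
	case KindForkRollback:
		for h := range c.Blocks {
			if h > m.Height {
				delete(c.Blocks, h)
			}
		}
		c.NextHeight = m.Height + 1
	}
}

func main() {
	c := NewClient(10)
	c.Handle(StreamMessage{Kind: KindData, Height: 10, Payload: "a"})
	c.Handle(StreamMessage{Kind: KindData, Height: 11, Payload: "b (orphaned fork)"})
	c.Handle(StreamMessage{Kind: KindData, Height: 12, Payload: "c (orphaned fork)"})
	c.Handle(StreamMessage{Kind: KindForkRollback, Height: 10})
	fmt.Println(len(c.Blocks), c.NextHeight) // 1 11
}
```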


peterargue added Epic Preserve Stale Bot repellent S-Access labels Jan 15, 2025

peterargue mentioned this issue Jan 16, 2025

PoC to allow indexing unsealed finalized execution results onflow/flow-evm-gateway#727

Open
