Add eth_getRequiredBlockState method #455

Open: wants to merge 4 commits into base: main
335 changes: 335 additions & 0 deletions specs/required_block_state.md
@@ -0,0 +1,335 @@
## `RequiredBlockState` specification

Specification of a data format that contains state required to
trace a single Ethereum block.

This is the format of the data returned by the `eth_getRequiredBlockState` JSON-RPC method.

## Table of Contents

- [`RequiredBlockState` specification](#requiredblockstate-specification)
- [Table of Contents](#table-of-contents)
- [Abstract](#abstract)
- [Motivation](#motivation)
- [Overview](#overview)
- [General Structure](#general-structure)
- [Notation](#notation)
- [Endianness](#endianness)
- [Constants](#constants)
- [Variable-size type parameters](#variable-size-type-parameters)
- [Definitions](#definitions)
- [`RequiredBlockState`](#requiredblockstate)
- [`CompactEip1186Proof`](#compacteip1186proof)
- [`Contract`](#contract)
- [`TrieNode`](#trienode)
- [`RecentBlockHash`](#recentblockhash)
- [`CompactStorageProof`](#compactstorageproof)
- [Algorithms](#algorithms)
- [`construct_required_block_state`](#construct_required_block_state)
- [`get_state_accesses`](#get_state_accesses)
- [`get_proofs`](#get_proofs)
- [`get_block_hashes`](#get_block_hashes)
- [`use_required_block_state`](#use_required_block_state)
- [`verify_required_block_state`](#verify_required_block_state)
- [`trace_block_locally`](#trace_block_locally)
- [`compression_procedure`](#compression_procedure)
- [Security](#security)
- [Future protocol changes](#future-protocol-changes)
- [Canonicality](#canonicality)
- [Post-block state root](#post-block-state-root)


## Abstract

An Ethereum block returned by `eth_getBlockByNumber` can be considered a program that executes
a state transition. The input to that program is the state immediately prior to that block.
Only a small part of that state is required to run the program (re-execute the block).
The state values can be accompanied by merkle proofs to prevent tampering.

The specification of that state (values and proofs as `RequiredBlockState`) facilitates
data transfer between two parties. The transfer represents the minimum amount of data
required for the holder of an Ethereum block to re-execute that block.

Re-execution is required for basic accounting (examination of the history of the global
shared ledger). Trustless accounting of single Ethereum blocks allows for lightweight
distributed block exploration.


## Motivation

State is rooted in the block header. A merkle multiproof covering all state required for all
transactions in one block is sufficient to trace that historical block.

In addition to the proof, BLOCKHASH opcode reads are also included.

Together, these allow anyone who can verify that a historical block header is canonical
to trustlessly trace a block without possession of an archive node.

The format of the data is deterministic, so that two peers creating the same
data will produce identical structures.

The primary motivation is that data may be distributed in a peer-to-peer content delivery network.
This would represent the state for a sharded archive node, where users may host subsets of the
data useful to them.

A secondary benefit is that traditional node providers could offer users the ability to
re-execute a block, rather than provide the result of re-execution. Transfer
of `RequiredBlockState` is approximately 167kb/Mgas (~2.5MB per block). Transfer of
a `debug_traceBlock` result is on the order of hundreds of megabytes per block with memory
disabled, and with memory enabled can be tens of gigabytes. Local re-execution with an EVM
implementation of choice can produce the identical re-execution (including memory or custom
tracers), which can be processed and discarded on the fly.

## Overview

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted
as described in RFC 2119 and RFC 8174.

### General Structure

The `RequiredBlockState` consists of account state values as Merkle proofs, contract bytecode
and recent block hashes.

### Notation
Code snippets appearing in `this style` are to be interpreted as Python 3 pseudocode. The
style of the document is intended to be readable by those familiar with the
Ethereum consensus [https://github.com/ethereum/consensus-specs](https://github.com/ethereum/consensus-specs)
and Simple Serialize (SSZ) ([https://github.com/ethereum/consensus-specs/blob/dev/ssz/simple-serialize.md](https://github.com/ethereum/consensus-specs/blob/dev/ssz/simple-serialize.md))
specifications.

Where a list/vector is said to be sorted, it indicates that the elements are ordered
lexicographically when in hexadecimal representation (e.g., `[0x12, 0x3e, 0xe3]`) prior
to conversion to ssz format. For elements that are containers, the ordering is determined by
the first element in the container.
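For illustration only (a sketch, not normative): byte-wise comparison in Python gives the same order as lexicographic comparison of the hexadecimal representations:

```python
# Elements from the example above, deliberately out of order.
elements = [bytes([0x3e]), bytes([0x12]), bytes([0xe3])]

# Python compares bytes objects byte-wise, which matches lexicographic
# ordering of their hexadecimal representations.
assert sorted(elements) == [bytes([0x12]), bytes([0x3e]), bytes([0xe3])]
```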

### Endianness

Big endian form is used as most data relates to the Ethereum execution context.

## Constants

### Variable-size type parameters

Helper values for SSZ operations. SSZ variable-size elements require a maximum length field.

Most values are chosen to be approximately the smallest possible value.

| Name | Value | Description |
| - | - | - |
| MAX_BLOCKHASH_READS_PER_BLOCK | uint16(256) | A BLOCKHASH opcode may read up to 256 recent blocks |
| MAX_BYTES_PER_NODE | uint16(32768) | - |
| MAX_BYTES_PER_CONTRACT | uint16(32768) | - |
| MAX_CONTRACTS_PER_BLOCK | uint16(2048) | - |
| MAX_NODES_PER_BLOCK | uint16(32768) | - |
| MAX_ACCOUNT_PROOFS_PER_BLOCK | uint16(8192) | - |
| MAX_STORAGE_PROOFS_PER_ACCOUNT | uint16(8192) | - |

## Definitions

### `RequiredBlockState`

The entire `RequiredBlockState` data format is represented by the following (SSZ-encoded and
snappy-compressed) container.

All trie nodes (account and storage) are aggregated for deduplication and simplicity.
They are located in the `trie_nodes` members.
A "compact" proof consists only of the root hash for that trie and the information required
for computing the trie path. The trie nodes can then be traversed by locating the first
node using the root hash and starting the traversal.

The proof data represents values in the historical chain immediately prior to the execution of
the block (sometimes referred to as "prestate"). That is, `RequiredBlockState` for block `n`
contains proofs rooted in the state root of block `n - 1`.

```python
class RequiredBlockState(Container):
    # sorted (by address)
    compact_eip1186_proofs: List[CompactEip1186Proof, MAX_ACCOUNT_PROOFS_PER_BLOCK]
    # sorted
    contracts: List[Contract, MAX_CONTRACTS_PER_BLOCK]
    # sorted
    trie_nodes: List[TrieNode, MAX_NODES_PER_BLOCK]
    # sorted (by block number)
    block_hashes: List[RecentBlockHash, MAX_BLOCKHASH_READS_PER_BLOCK]
```
The `RequiredBlockState` is compressed using snappy encoding (see algorithms section). The
`eth_getRequiredBlockState` JSON-RPC method returns the SSZ-encoded container with snappy encoding.

### `CompactEip1186Proof`

Represents the proof data whose root is the state root in the block header of the preceding block.

The account proof is obtained by calculating the hash of the account address and traversing
the nodes in the `RequiredBlockState` container.

```python
class CompactEip1186Proof(Container):
    address: Vector[uint8, 20]
    balance: List[uint8, 32]
    code_hash: Vector[uint8, 32]
    nonce: List[uint8, 8]
    storage_hash: Vector[uint8, 32]
    # sorted
    storage_proofs: List[CompactStorageProof, MAX_STORAGE_PROOFS_PER_ACCOUNT]
```

### `Contract`

An alias for contract bytecode.
```python
Contract = List[uint8, MAX_BYTES_PER_CONTRACT]
```

### `TrieNode`

An alias for a node in a merkle patricia proof.

A Merkle Patricia Trie proof consists of a list of witness nodes, one per trie node on the path. Each node contains different data elements depending on its type (blank, branch, extension, or leaf). When serialized, each witness node is represented as an RLP-encoded list of its component elements.

```python
TrieNode = List[uint8, MAX_BYTES_PER_NODE]
```

### `RecentBlockHash`

A block hash accessed by the "BLOCKHASH" opcode.
```python
class RecentBlockHash(Container):
    block_number: List[uint8, 8]
    block_hash: Vector[uint8, 32]
```

**Contributor:** This can be made verifiable by including the whole header here.

**Contributor Author (@perama-v, Oct 30, 2023):** Could you expand on what you mean here?

The assumption here is that a user has a mechanism to verify the canonicality of blockhashes. More notes here: https://github.com/perama-v/archors#blockhash-opcode

One design goal is to keep `RequiredBlockState` as small as possible. This is because a static collection of all `RequiredBlockState`s would allow for a distributed archive node. The current estimate puts the size at 30TB.

**Contributor:** So my point addresses the following fault case you describe in your notes:

> A block hash is wrong: the portal node can audit all block hashes against its master accumulator prior to tracing.

If we want to solve it at the RPC level: instead of sending only the block number and block hash, send the whole block header. Then the verifier can hash the header (which contains the number) and be sure about the hash.

I understand the size concern. However, if the clients have to audit an external source anyway, it might be worth it. Otherwise, if in most use-cases the client has easy access to the latest 256 blocks, then we can leave this info out of `eth_getRequiredBlockState`.

**Contributor Author:** A "wrong" hash refers to a hash that is not canonical. So, the wrong hash could be a correctly computed hash of a noncanonical block, e.g. a fabricated block or an uncle block. Having the whole block included doesn't get one closer to verifying canonicality.
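For illustration, the `MAX_BLOCKHASH_READS_PER_BLOCK` bound mirrors the EVM's BLOCKHASH window; a sketch of those semantics (a hypothetical helper, not part of this specification):

```python
def blockhash(executing_block: int, queried_block: int, recent_hashes: dict) -> bytes:
    """Mirror BLOCKHASH semantics: only the 256 most recent block hashes
    are readable; anything outside the window yields 32 zero bytes.
    Assumes in-window hashes are present in `recent_hashes`."""
    if executing_block - 256 <= queried_block < executing_block:
        return recent_hashes[queried_block]
    return b"\x00" * 32
```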

### `CompactStorageProof`

The storage proof is obtained by calculating the hash of the storage key and traversing nodes
in the `RequiredBlockState` container.

```python
class CompactStorageProof(Container):
    key: Vector[uint8, 32]
    value: List[uint8, 8]
```

## Algorithms

This section contains descriptions of procedures relevant to `RequiredBlockState`, including their
production (`construct_required_block_state`) and use (`use_required_block_state`).

### `construct_required_block_state`

For a given block, `RequiredBlockState` can be constructed using existing JSON-RPC methods
via the following steps:
1. `get_state_accesses`
2. `get_proofs`
3. `get_block_hashes`
4. Create the `RequiredBlockState` SSZ container
5. Use `compression_procedure` to compress the `RequiredBlockState`
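The steps above can be sketched as pseudocode in the style of this document (helper names mirror the algorithm names below; container construction and sorting details are elided):

```python
def construct_required_block_state(block_number):
    state_accesses = get_state_accesses(block_number)      # step 1
    proofs = get_proofs(state_accesses, block_number - 1)  # step 2 (parent-block state)
    block_hashes = get_block_hashes(block_number)          # step 3
    state = RequiredBlockState(proofs, block_hashes)       # step 4 (sorted lists)
    return compression_procedure(state)                    # step 5 (SSZ + snappy)
```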

### `get_state_accesses`

Call `debug_traceBlock` with the prestate tracer and record key/value pairs where
they are first encountered in the block.

```
curl -X POST -H "Content-Type: application/json" --data '{"jsonrpc": "2.0", "method": "debug_traceBlock", "params": ["finalized", {"tracer": "prestateTracer"}], "id":1}' http://127.0.0.1:8545 | jq
```
This returns state objects, each consisting of a key (account address) and a value (the state,
which may include contract bytecode and storage key/value pairs). Two objects are shown for reference:
```json
{
  "0x58803db3cc22e8b1562c332494da49cacd94c6ab": {
    "balance": "0x13befe42b38a40",
    "nonce": 54
  },
  "0xae7ab96520de3a18e5e111b5eaab095312d7fe84": {
    "balance": "0x4558214a60e751c3a",
    "code": "0x608060/* Snip (entire contract bytecode) */410029",
    "nonce": 1,
    "storage": {
      "0x1b6078aebb015f6e4f96e70b5cfaec7393b4f2cdf5b66fb81b586e48bf1f4a26": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "0x4172f0f7d2289153072b0a6ca36959e0cbe2efc3afe50fc81636caa96338137b": "0x000000000000000000000000b8ffc3cd6e7cf5a098a1c92f48009765b24088dc",
      "0x644132c4ddd5bb6f0655d5fe2870dcec7870e6be4758890f366b83441f9fdece": "0x0000000000000000000000000000000000000000000000000000000000000001",
      "0xd625496217aa6a3453eecb9c3489dc5a53e6c67b444329ea2b2cbc9ff547639b": "0x3ca7c3e38968823ccb4c78ea688df41356f182ae1d159e4ee608d30d68cef320"
    }
  },
  ...
}
```

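A sketch of collecting the accessed keys from such a result (plain Python over the parsed JSON; assumes the field names shown above, with a hypothetical helper name):

```python
def collect_state_accesses(prestate: dict):
    """Extract (address, sorted storage keys) pairs and any contract
    bytecode from a prestateTracer result keyed by account address."""
    accesses = []
    contracts = []
    for address, state in prestate.items():
        accesses.append((address, sorted(state.get("storage", {}))))
        if "code" in state:
            contracts.append(state["code"])
    return sorted(accesses), sorted(contracts)
```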
### `get_proofs`

Call the `eth_getProof` JSON-RPC method for each state key (address) returned by the
`get_state_accesses` algorithm, including
storage keys if appropriate.

The block number used is the block prior to the block of interest (`eth_getProof` returns
post-block state, so the parent block's state is the prestate of the block of interest).

For all account proofs, aggregate and sort the proof nodes and represent each proof as a list of
indices to those nodes. Repeat for all storage proofs.
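The aggregation step can be sketched as follows (a hypothetical helper; assumes each proof is a list of RLP-encoded node byte strings):

```python
def aggregate_nodes(proofs):
    """Deduplicate trie nodes across all proofs, sort them, and rewrite
    each proof as a list of indices into the shared node list."""
    nodes = sorted({node for proof in proofs for node in proof})
    index_of = {node: i for i, node in enumerate(nodes)}
    compact_proofs = [[index_of[node] for node in proof] for proof in proofs]
    return nodes, compact_proofs
```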

### `get_block_hashes`

Call `debug_traceBlock` with the default tracer and record any use of the "BLOCKHASH" opcode.
Record the block number (top of stack in the "BLOCKHASH" step) and the block hash (top
of stack in the subsequent step).
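A sketch of this scan over a default (structLog-style) trace, assuming each step exposes `op` and `stack` with the top of the stack last:

```python
def extract_blockhash_reads(struct_logs):
    """Pair each BLOCKHASH step's input (block number, top of stack)
    with its output (block hash, top of stack in the following step)."""
    reads = {}
    for step, next_step in zip(struct_logs, struct_logs[1:]):
        if step["op"] == "BLOCKHASH":
            block_number = int(step["stack"][-1], 16)
            reads[block_number] = next_step["stack"][-1]
    return sorted(reads.items())  # sorted by block number, as the spec requires
```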

### `use_required_block_state`

1. Obtain `RequiredBlockState`, for example by calling `eth_getRequiredBlockState`
2. Use `compression_procedure` to decompress the `RequiredBlockState`
3. `verify_required_block_state`
4. `trace_block_locally`

### `verify_required_block_state`

Check that the block hashes are canonical, for example against a trusted node or an accumulator
of canonical block hashes. Check the merkle proofs in the required block state.
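A minimal sketch of this check, assuming the accumulator is available as a set of canonical `(number, hash)` pairs and a merkle-proof verifier is supplied by the caller (both are assumptions, not part of this specification):

```python
def verify_required_block_state(block_hashes, account_proofs, canonical, verify_proof):
    """Return True only if every block hash is canonical and every
    account proof verifies against its root."""
    for number, block_hash in block_hashes:
        if (number, block_hash) not in canonical:
            return False
    return all(verify_proof(proof) for proof in account_proofs)
```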

### `trace_block_locally`

Obtain a block (`eth_getBlockByNumber` JSON-RPC method) with transaction bodies. Use an EVM
and load it with the `RequiredBlockState` and the block. Execute
the transactions in the block and observe the trace.

### `compression_procedure`

The `RequiredBlockState` returned by the `eth_getRequiredBlockState` JSON-RPC method is
compressed. Snappy compression is used ([https://github.com/google/snappy](https://github.com/google/snappy)).

The encoding and decoding procedures are the same as that used in the Ethereum consensus specifications
([https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#ssz-snappy-encoding-strategy](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#ssz-snappy-encoding-strategy)).

For encoding (compression), data is first SSZ-encoded and then snappy-encoded.
For decoding (decompression), data is first snappy-decoded and then SSZ-decoded.
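A sketch of the round trip. The `ssz_encode`/`ssz_decode` and snappy calls are placeholders: here `zlib` stands in for snappy and the SSZ step is the identity over bytes, purely to show the order of operations:

```python
import zlib

# Placeholders for real SSZ serialization and snappy compression.
ssz_encode = lambda state: state        # assume `state` is already bytes
ssz_decode = lambda data: data
snappy_compress = zlib.compress         # stand-in for snappy
snappy_decompress = zlib.decompress     # stand-in for snappy

def encode_required_block_state(state: bytes) -> bytes:
    # Encoding: SSZ-encode first, then compress.
    return snappy_compress(ssz_encode(state))

def decode_required_block_state(payload: bytes) -> bytes:
    # Decoding: decompress first, then SSZ-decode.
    return ssz_decode(snappy_decompress(payload))
```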

## Security

### Future protocol changes

Merkle patricia proofs may be replaced by verkle proofs after some hard fork.
This would not invalidate `RequiredBlockState` data prior to that fork.
The new proof format could be added to this specification for data after that fork.

### Canonicality

A recipient of `RequiredBlockState` must check that the blockhashes are part of the real
Ethereum chain history. Failure to verify (`verify_required_block_state`) can result in invalid
re-execution (`trace_block_locally`).

### Post-block state root

A user that has access to canonical block hashes and a sound EVM implementation has strong
guarantees about the integrity of the block re-execution (`trace_block_locally`).

However, there is no guarantee that a new state root can be computed for this post-execution
state, for example with the aim of checking it against the state root in the block header of
that block and thereby auditing the state changes that were applied.

This is because the state changes may involve an arbitrary number of state deletions. State
deletions may change the structure of the merkle trie in a way that requires knowledge of
internal nodes that are not present in the proofs obtained by the `eth_getProof` JSON-RPC method.
Hence, while the complete post-block trie can sometimes be constructed, this is not guaranteed.
Comment on lines +326 to +333:
Yup, I was worried about this.

Being able to consistently verify the post-state hash doesn't really feel optional to me. I can't trust the result of a local block execution that doesn't also prove that it got the same result as the network did. In the context of the Portal Network, nodes must validate data and only gossip data that we can prove locally.

EVM execution engines should be able to handle the case of missing trie nodes, and return the information needed to collect the missing data. They must handle this case if they want to run against a partial trie database, as when running Beam Sync.

I am only familiar enough with py-evm to give the example:
https://github.com/ethereum/py-evm/blob/d751dc8c9c8199a16043a483b19c9f4d7a592202/eth/db/account.py#L605-L661

The MissingTrieNode exceptions are the relevant moments when the EVM realizes that it's missing some intermediate nodes that are required to prove the final state root, if you're only running with the state proof as defined in the current spec.

As a prototype, I suppose you could literally run py-evm with the unverified proofs, then retrieve the missing trie nodes over devp2p one by one (there usually aren't too many, from what I've seen).

**Contributor Author:**
> .. then retrieve the missing trie nodes over devp2p one by one

For the missing trie node(s), I don't see a clear mechanism to obtain that node. For removed nodes that require knowledge of a sibling node, how could that sibling node be obtained? As it is mid-traversal, the terminal keys of the affected sibling are not trivially known. So eth_getProof(block_number, key_involving_sibling_node) cannot be used. A new method get_trie_node_at_block_height(block_number, trie_node_hash) could be written for a node, but this is nontrivial. More context here:

**Contributor Author:**

> .. nodes must validate data and only gossip data that we can prove locally.

This is preserved (gossiped data is validated). The gossiped data is the merkle proofs of the state, so this is validated, and the block is also validated.

So the source of error here is a bug in the EVM implementation.

```mermaid
flowchart TD
    Block[Block, secured by header: Program to run] -- gossiped --> Pre
    State[State, secured by merkle proofs: Input to program] -- gossiped --> Pre
    Pre[Program with inputs] -- Load into EVM environment -->  EVM[EVM executes]
    EVM -- bug here --> Post[Post block state]
    Post -.-> PostRoot[Post-block state root useful for these bugs]
    EVM -- bug here --> Trace[debug_traceBlock output]
```

I agree EVM bugs are possible and having the post-state root would be nice.

However:

- EVM implementations are often shared across different client types, reducing bug risk.
- Even having the post-block state does not guarantee that `debug_traceBlock` output is correct. The output contains many details that are not covered by a post-block state root, so one still needs to check that the EVM doesn't have bugs.
- Test suites can be run to compare `debug_traceBlock` output against the same result from an archive node (or a different EVM implementation hooked into the portal network). This protects against EVM errors that result in bad state, and errors that result in bad traces.


I've proposed an idea here and believe our proposal on ZK proofs of the last state can indeed help address the challenge mentioned. ZK proofs, specifically Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge (ZK-SNARKs), have the potential to provide proofs of complex computations and statements: sogolmalek/EIP-x#6

**Contributor Author:**

> ... our proposal on ZK proofs of the last state can indeed help address the challenge mentioned

Let me see if I understand your proposal. The data that a peer receives could consist of:

- (existing) The block pre-state (state required for the block, secured by the merkle root in the prior block and a header accumulator)
- (existing) The block (to allow the user to replay the block for any purpose)
- (new) A proof of block execution (ZK-EVM) consisting of a ZK-SNARK proof for the set of block post-state values (values that were accessed by the block)

The user re-executes the block and arrives at post-block state values. Those are compared to the values in the ZK proof. If they are the same, it is unlikely that the EVM and the ZK-EVM both have a bug, and the state transition is likely sound.

So the ZK proof is equivalent to replaying the block using a different EVM implementation. That is, the same as getting the post-block state from a Rust implementation and comparing to the state produced by a Go implementation.

In this respect, the presence of the ZK proof does not seem to introduce additional safety. Perhaps I overlooked a component?


Thanks for your detailed arguments. I'm just trying to make sure I understand the points well, and I hope I haven't gone far wrong with the following arguments; I'd love to learn more here.

Our approach centers around the principle of minimizing data reliance while ensuring the accuracy and reliability of state transitions. While your understanding of our proposal is mostly accurate, there are a few points to consider when evaluating the introduced ZK proofs:
The process of re-executing the block on a different EVM implementation (Rust vs. Go, as you mentioned) can be cumbersome and resource-intensive.
Another point is attack vectors and determinism: while replaying the block using different EVM implementations is conceptually similar, it introduces the potential for inconsistencies between implementations due to subtle differences in execution logic or determinism. Furthermore, I believe ZK can ensure far more efficiency and scaling. Re-executing the block using different EVM implementations might be feasible on a small scale, but it becomes much more challenging as the network scales and the number of transactions increases. ZK proofs, on the other hand, can be generated and verified more efficiently, making them more suitable for large-scale systems.

As you've mentioned, the problem with key deletions is that sometimes sibling nodes in the trie are required to reconstruct the trie structure for verification purposes. These sibling nodes might not be available through the eth_getProof data, leading to incomplete trie reconstructions.

I think a ZK proof of the last state, which involves generating a cryptographic proof certifying the correctness of the entire state transition process (including deletions and modifications), can help. This proof can be designed to include information about sibling nodes that are required for verification. In essence, the ZK proof encapsulates the entire state transition and, by design, must account for all the necessary data, including sibling nodes, to be valid.

Also, by obtaining and validating a ZK proof of the last state, we can ensure that all data required for verifying the state transition, including missing sibling nodes, is included in the proof. This provides a complete and comprehensive validation mechanism that mitigates the challenge posed by missing nodes.

**Contributor Author (@perama-v, Aug 29, 2023):**
Some additional context is that eth_getRequiredBlockState is designed to provide the minimum information required to re-execute an old block. The goal is to be able to download individual blocks and re-execute them in order to inspect every EVM step in that block. That is, the goal is to run the EVM (to gain historical insight).

With a ZK EVM, one can demonstrate to a peer that the block post-state is valid with respect to a block. This means that they do not have to re-execute the EVM themselves. This is beneficial for a light client that wants to trustlessly keep up to date with the latest Ethereum state. That is, the goal is to not run the EVM (to save resources).

**Contributor (@s1na, Sep 6, 2023):**

> For any key that has an inclusion proof in n - 1, and an exclusion proof in block n, retain the proof for that key.
> 4. Store these additional exclusion proofs in the RequiredBlockState data structure.

I think the idea is a good direction, but after discussing with @gballet I have the feeling that sending the exclusion proofs themselves is not enough for the user to perform the deletion locally, because the sibling of the node being deleted could be a subtrie. So the server should use this approach (or another) to give the full prestate that is required for execution of the block; this means figuring out which extra nodes need to be returned for such a deletion.

Geth has the prestateTracer, which I noticed suffers from the same issue after reading up on this ticket. I'd be keen to fix it. Nevermind: the prestateTracer doesn't actually return intermediary nodes, only accounts.

**Contributor Author:**

At present this spec contains sufficient data to execute the block, but not update the proofs to get the post-block root.

For context, the main issue is that sometimes the trie depth is changed, which impacts cousin-nodes. More on this here: https://github.com/perama-v/archors/blob/main/crates/multiproof/README.md

To enable that, I have been prototyping a mechanism that allows this using only available JSON-RPC methods (`eth_getProof`) without hacking an execution client. The method calls `eth_getProof` on the subsequent block, gets these edge-case nodes, and tacks them into `RequiredBlockState` as "oracle data". The data allows the trie update to complete and the post-block root to be computed, verifying that the oracle data was in fact correct.

I learned that @s1na has taken a different approach, which is to modify Geth to record all trie nodes touched during a block execution (including post-root computation). That method is sufficient to get trie nodes required to compute the post-block root (but requires modifying the execution client). It is a nice approach.

**Contributor Author:**

Essentially the oracle approach says: look, some changes to the surrounding trie will happen, and they may be complex/extensive. One can "look into the future" at this exact point in the traversal and just get the nodes for the affected part (leafy end) of the traversal. The details of the surrounding trie don't need to be computed and aren't important, because we can check that what we got from the future was in fact correct.

Bit of a hacky approach. I can see that this might be difficult to implement if not using eth_getProof.



15 changes: 15 additions & 0 deletions src/eth/state.yaml
@@ -84,3 +84,18 @@
    name: Account
    schema:
      $ref: '#/components/schemas/AccountProof'
  - name: eth_getRequiredBlockState
    summary: Returns state required to re-execute a block.
    description: Returns the RequiredBlockState (SSZ- and snappy-encoded) which contains block prestate and proof data.
    params:
      - name: Block
        required: true
        schema:
          $ref: '#/components/schemas/BlockNumberOrTag'
    result:
      name: Required block state
      description: The RequiredBlockState (SSZ- and snappy-encoded) which contains block prestate and proof data.
      schema:
        oneOf:
          - $ref: '#/components/schemas/notFound'
          - $ref: '#/components/schemas/bytes'

Large diffs are not rendered by default.

Large diffs are not rendered by default.

@@ -0,0 +1,2 @@
>> {"jsonrpc":"2.0","id":1,"method":"eth_getRequiredBlockState","params":["0x5f5e100"]}
<< {"jsonrpc":"2.0","id":1,"result":null}
3 changes: 3 additions & 0 deletions wordlist.txt
@@ -6,11 +6,13 @@ bodiesbyhashv
bytecode
configurationv
crypto
dev
eip
endian
enum
eth
ethereum
EVM
interop
json
mempool
@@ -33,6 +35,7 @@ uint
updatedv
url
validator
verkle
wei
yaml
yParity