Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: remove the trace_decoder crate #659

Open
wants to merge 22 commits into
base: develop
Choose a base branch
from

Conversation

0xaatif
Copy link
Contributor

@0xaatif 0xaatif commented Sep 25, 2024

mostly file renames or plain transplants except:

  • ::trace_decoder::observer has just been folded into ::zero::intra_block_tries
  • ::trace_decoder::{type1,type2} have been made submodules of ::zero::wire_tries
  • redact trace_decoder from the README
  • stopped setting RUSTFLAGS in ci jobs so we have one place where they're defined, and I don't waste a day debugging esoteric linker issues

@github-actions github-actions bot added crate: trace_decoder Anything related to the trace_decoder crate. crate: zero_bin Anything related to the zero-bin subcrates. labels Sep 25, 2024
Copy link
Collaborator

@Nashtare Nashtare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the README needs an update as well (it's mentioning the trace_decoder + includes it in the mermaid graph)

.github/labeler.yml Show resolved Hide resolved
@0xaatif
Copy link
Contributor Author

0xaatif commented Sep 25, 2024

@Nashtare / @muursh we'll need to update the required checks (also why do we have those if someone can just [skip-ci]?)

@Nashtare
Copy link
Collaborator

we'll need to update the required checks

Will update them once this lands (@muursh and I got rights to bypass PR requirements to merge this)

(also why do we have those if someone can just [skip-ci]?)

Good question, I'm not even sure this ever got used as it aims to be, probably best to simply remove. I think this came from an initial (long ago) refactoring of the CI by Leo if I'm correct, from which it got kept around.

Copy link
Collaborator

@Nashtare Nashtare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't try to argue against this, but I'll just reiterate once that I don't believe merging literally everything in a single crate is an ideal approach.

I think it is meaningful to keep the distinction between:

  • higher application level
  • mid level library (trace_decoder) parsing payloads into prover inputs
  • backend library (evm_arithmetization) in charge of the actual proving behind the scene

There are pros and cons in both approaches, none is perfect, but for the time being, this feels to bring more downsides than upsides. This being said, if @muursh and @atanmarko want to approve this, then let's merge it.

Also as a side note, as it looks like this may be the following target, I want to clarify that I am strongly against merging evm_arithmetization into zero crate and as codeowner, will reject any attempt to do so for the time being.

README.md Show resolved Hide resolved
@0xaatif
Copy link
Contributor Author

0xaatif commented Sep 26, 2024

I'll just reiterate once that I don't believe merging literally everything in a single crate is an ideal approach.

I believe we're on similar pages since our extensive discussion in #zero-dev on Sep 13th.

To be clear, my motivation for this change is summarized in this excerpt from our conversation:

Trace decoder does two completely unrelated things:

  • generate intra-block tries
  • parse (two, soon to be three) wire formats of different RPC nodes into tries

zero knows more about the latter, so the wire parsing should live there, and the former then amounts to basically a single function, which I don't think deserves its own package.

This unblocks my forward progress on #93.

but for the time being, this feels to bring more downsides than upsides

Out of curiosity, could you elaborate on these?


I think trace_decoder as a meaningful "mid-level" library is a mirage.

Also as a side note, as it looks like this may be the following target, I want to clarify that I am strongly against merging evm_arithmetization into zero crate

Thank you for your clarification.
I'm not interested in this at this time.
I'm happy that the cryptographers on our team work mostly on one crate, and "rust" devs on the other.
I also think it's a sensible crate boundary for that reason.

@Nashtare
Copy link
Collaborator

To be clear, my motivation for this change is summarized in this excerpt from our conversation:

Trace decoder does two completely unrelated things:

  • generate intra-block tries
  • parse (two, soon to be three) wire formats of different RPC nodes into tries

zero knows more about the latter, so the wire parsing should live there, and the former then amounts to basically a single function, which I don't think deserves its own package.

On "zero knows more about the latter, so the wire parsing should live there":
I don't think zero really knows about the underlying trie structure, especially if we focus (as our main product target) about Jerigon payloads which are compact encodings, effectively only pure byte blobs until the decoder parses them, regardless of whether we target a type 1 or type 2.

but for the time being, this feels to bring more downsides than upsides

Out of curiosity, could you elaborate on these?

Sure, my main interests with respect to separate crate with the layout I mentioned above are:

  • modularization: we can dev / test / reuse code independently, with separation of concerns in a cleaner fashion that in a giant crate
  • compilation time: especially with incremental compilation, this is really hindering developer experience when switching between branches. As an example, evm_arithmetization currently has 192 dependencies, zero has close to 600, and I probably switch between 6 / 8 branches a day at the very least.
  • granularity in testing: we've already mentioned that current CI was poor and unsatisfactory, but in an ideal world, separate crates is better to this extent, so that each can have its own tests / CI. A single crate makes CI slower and harder to isolate failures.
  • versioning: fairly early as of today, but once we get closer to production, we will want a clear versioning process and dependency management. Having everything bundled together makes it harder, and if we respect Semver, PATCH / MINOR increments are seemless across the other crates of the workspace.
  • collabs: very little weight here, but if a team was to be interested in a piece of this stack (say prover / decode / mpt_trie, it's much easier / meaningful to have open-source collab on separate distinct crates than a single huge one).

@0xaatif 0xaatif mentioned this pull request Sep 26, 2024
@@ -34,7 +32,7 @@ fn criterion_benchmark(c: &mut Criterion) {
block_trace,
other_data,
}| {
trace_decoder::entrypoint(
zero::intra_block_tries::entrypoint(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeating the point from some previous PR that entrypoint is not a good API function name.

@@ -1,3 +1,42 @@
//! An _Ethereum Node_ executes _transactions_ in _blocks_.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me intra_block_tries is a bit misleading and out of context name for a module for somebody looking at the project for the first time. entrypoint is not just about intermediate tries, it processes node tracer input and metadata and prepares the whole input for the prover. Not sure what would be the good alternative - e.g. witness_processor, prover_input, preprocessor or something else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, for somebody looking at what the leader does, crate::prover_input::generate_inputs(block_trace, block_data) would be much more meaningful than crate::intra_block_tries::entrypoint()


/// Like `#[serde(with = "::hex")]`, but tolerates and emits leading `0x`
/// prefixes
mod hex {
Copy link
Member

@atanmarko atanmarko Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel there is a lot of potential for some zero/src/common folder/submodule. Looking at the moment at the 20 files (submodules) in the zero crate it is not obvious what is important, what is the core of the zero logic and purpose. I think we have gone too far in the "un-cratezation".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minimum fs.rs, env.rs, parsing.rs, tracing.rs, debug_utils.rs and this hex module should be moved to the zero/src/common/.

@atanmarko
Copy link
Member

Btw. to somehow make the compromise between the @Nashtare and @0xaatif viewpoint, I would keep the ex trace decoder a bit separated from the existing zero code, having all ex trace decoder related code in the zero::prover_input submodule. I am not against this PR, but I don't see the reason for wire_tries to be first level zero public submodule, since only intra_block_tries` is using it.

Literally, existing zero calls only one API function in the trace decoder (entrypoint), and uses exported BlockTrace and OtherBlockData. I don't see the reason for their tight coupling on the organizational level (what @Nashtare is referring as middle-ware separation).

@muursh
Copy link
Member

muursh commented Sep 26, 2024

Also as a side note, as it looks like this may be the following target, I want to clarify that I am strongly against merging evm_arithmetization into zero crate and as codeowner, will reject any attempt to do so for the time being.

100% agreed.

I don't see the reason for wire_tries to be first level zero public submodule

I agree with Marko here. I'm not sure I see the argument right now either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci crate: trace_decoder Anything related to the trace_decoder crate. crate: zero_bin Anything related to the zero-bin subcrates.
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

4 participants