The case for dropping JUMPF and non-returning functions #42

Open
gumb0 opened this issue Jan 16, 2024 · 13 comments

@gumb0
Contributor

gumb0 commented Jan 16, 2024

The main use case for JUMPF was being able to call a non-returning helper without requiring equal stack height at each call site and without needing to POP extra items before calling.

In the current spec:

  1. If such a helper is implemented inside the caller section (without CALLF/JUMPF), it may be called from different stack heights.

  2. If such a helper is in a separate section, we can make it work without the non-returning flag and JUMPF: non-returning functions are declared with 0 outputs, and the requirement to not POP extra items is met with a CALLF STOP or CALLF INVALID sequence.

Compared with JUMPF, this costs 4 bytes of code instead of 3 (CALLF plus a terminating opcode versus JUMPF alone) and pushes an item onto the call stack at run-time.
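To make the byte counts concrete, here is a minimal Python sketch of the two encodings. The opcode values are assumptions taken from the EOF EIPs (EIP-4750 and EIP-6206) and the section index is hypothetical; only the lengths matter for the argument.

```python
# Opcode values are assumptions based on EIP-4750 / EIP-6206.
STOP, INVALID = 0x00, 0xFE
CALLF, JUMPF = 0xE3, 0xE5


def with_section_immediate(opcode: int, section: int) -> bytes:
    """Encode an opcode followed by a 16-bit big-endian section index."""
    return bytes([opcode]) + section.to_bytes(2, "big")


def helper_via_jumpf(section: int) -> bytes:
    """JUMPF alone hands control to the helper section."""
    return with_section_immediate(JUMPF, section)


def helper_via_callf(section: int, terminator: int = STOP) -> bytes:
    """CALLF followed by a terminating opcode (STOP or INVALID)."""
    return with_section_immediate(CALLF, section) + bytes([terminator])


HELPER_SECTION = 2  # hypothetical index of the non-returning helper

assert len(helper_via_jumpf(HELPER_SECTION)) == 3
assert len(helper_via_callf(HELPER_SECTION)) == 4
assert len(helper_via_callf(HELPER_SECTION, INVALID)) == 4
```

The one-byte difference is paid at every call site, which is why it matters for small, frequently used revert helpers.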

cc @ekpyron @charles-cooper

@charles-cooper
Contributor

what is the current requirement for CALLF to pop extra items?

@gumb0
Contributor Author

gumb0 commented Jan 16, 2024

> what is the current requirement for CALLF to pop extra items?

There is no such requirement for CALLF, but there is one for JUMPF to returning functions: it doesn't allow an unbalanced stack, same as RETF.

So if we replaced JUMPF with a CALLF RETF sequence, that would require popping extra items.
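A rough Python model of these stack-height rules may help. It is a sketch under assumptions: the exact-height formula for JUMPF to a returning function reflects one reading of EIP-6206 and should be checked against the spec, other validation rules are omitted, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionType:
    inputs: int
    outputs: int
    non_returning: bool = False


def retf_allowed(stack_height: int, current: SectionType) -> bool:
    """RETF needs exactly the declared outputs on the stack."""
    return stack_height == current.outputs


def callf_allowed(stack_height: int, target: SectionType) -> bool:
    """CALLF only needs enough items for the target's inputs."""
    return stack_height >= target.inputs


def jumpf_allowed(stack_height: int, current: SectionType, target: SectionType) -> bool:
    """Assumed rule: exact height for returning targets, minimum height otherwise."""
    if target.non_returning:
        return stack_height >= target.inputs
    return stack_height == current.outputs + target.inputs - target.outputs


def extra_pops_for_callf_retf(stack_height: int, current: SectionType, target: SectionType) -> int:
    """Items to POP after CALLF so that the following RETF is balanced."""
    height_after_call = stack_height - target.inputs + target.outputs
    return height_after_call - current.outputs


caller = SectionType(inputs=0, outputs=1)
helper = SectionType(inputs=1, outputs=1)
reverter = SectionType(inputs=1, outputs=0, non_returning=True)

assert not jumpf_allowed(3, caller, helper)   # unbalanced stack is rejected
assert jumpf_allowed(3, caller, reverter)     # extra items are fine here
assert extra_pops_for_callf_retf(3, caller, helper) == 2
```

In this model the non-returning target is the only case where extra stack items can be left in place, which is the property the thread is debating.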

@charles-cooper
Contributor

hmm. it seems JUMPF is useful here for "stack-unwinding" type operations. just brainstorming here but maybe there could be another type (marked with a sentinel outputs value) of code section which when you CALLF it basically works like the currently spec'ed JUMPF?

@rakita
Contributor

rakita commented Jan 17, 2024

Asked this in Discord, but sharing it here for the archives:
"Hey I still don't understand the benefit of JUMPF. For a non-returning function, wouldn't that be just an ordinary CALLF with zero outputs? What are we optimizing with JUMPF, and what would the alternative without JUMPF look like?"

@ekpyron

ekpyron commented Jan 17, 2024

One of the main arguments for JUMPF was code deduplication. Smart contract functions involve an abundance of revert conditions, each of which is potentially supposed to emit one of a small number of revert messages indicating error conditions. Previous versions of the specification did not allow deduplicating jumps to such reverts within a function, at least not without stack cleanup (which itself significantly inflates code size, to a degree that should not be underestimated), due to the requirement to have equal stack heights (this seems to be mitigated in the current spec), but IIRC also required stack cleanup on termination, so even e.g. in a CALLF STOP scenario (I may be wrong about that, though).

In any case, it seems like the latest spec mitigates some of these very strong points in favour of JUMPF, but there is still a case to be made for JUMPF:

  • It shouldn't be underestimated how prevalent small non-returning code paths are in smart contract code - and the pre-EOF-EVM has an important property here: by simply collapsing jump targets, code deduplication in cases like this is free in the current EVM (zero gas difference, unconditional code size gain). We need to be careful for EOF not to become a regression here that both complicates code deduplication by requiring more complex tradeoffs and by generally inflating code size. Allowing cross-function deduplication of non-returning code paths with JUMPF, but also of returning code paths (functions with a similar return value structure tend to share cleanup code that can be deduplicated using JUMPF - I'm not sure, but that may be what @charles-cooper had in mind with "stack-unwinding" type operations; deduplication here pre-EOF is also "free" without involving any tradeoff) may help towards that. In this context the expectation was also that JUMPF would in fact be cheaper than CALLF (due to not having to manipulate the return stack).
  • JUMPF enables tail-call-optimizations, i.e. the unlimited continuation of a function by handing over execution to another function at minimal cost and without return stack depth limitation.
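The return-stack point in the second bullet can be illustrated with a small Python sketch. It assumes a return stack limit of 1024 entries (as in EIP-4750) and models only the depth bookkeeping, not real EVM execution; the function name is hypothetical.

```python
RETURN_STACK_LIMIT = 1024  # assumed limit, as in EIP-4750


def peak_return_stack_depth(handovers: int, use_jumpf: bool) -> int:
    """Peak return-stack depth when a chain of functions hands over control.

    Each handover is either a CALLF (pushes a return frame) or a JUMPF
    (reuses the current frame). Raises OverflowError past the limit.
    """
    depth = 0
    peak = 0
    for _ in range(handovers):
        if not use_jumpf:
            if depth == RETURN_STACK_LIMIT:
                raise OverflowError("return stack limit exceeded")
            depth += 1  # CALLF pushes a frame that is only popped by RETF
        peak = max(peak, depth)
    return peak


# A chain of 5000 tail handovers stays flat with JUMPF...
assert peak_return_stack_depth(5000, use_jumpf=True) == 0

# ...while the same chain with CALLF runs into the depth limit.
try:
    peak_return_stack_depth(5000, use_jumpf=False)
except OverflowError:
    pass
else:
    raise AssertionError("expected the CALLF chain to overflow")
```

Whether real contracts ever need chains this long is exactly what later comments in the thread question.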

That being said, I haven't had the time yet for a full analysis of the impact of JUMPF in the latest revision of the specification; it's definitely greater than zero, though.

@charles-cooper
Contributor

charles-cooper commented Jan 17, 2024

> but IIRC also required stack cleanup on termination, so even e.g. in a CALLF STOP scenario (I may be wrong about that, though).

i think this is mitigated by changing the validation rules for non-returning code sections. so if you have a code section foo <non returning> <code: ... REVERT>, this can be CALLFed and behaves like the current JUMPF to non-returning function.

> I'm not sure, but that may be what @charles-cooper had in mind with "stack-unwinding" type operations

yes -- one currently useful case for JUMPF is either shared cleanup blocks for different subroutines or exception handling. like an exception handling mechanism can be implemented by JUMPFing to a shared block which knows how to propagate any exception handling data structures or knows how to halt propagation (catch the exception).

> JUMPF enables tail-call-optimizations, i.e. the unlimited continuation of a function by handing over execution to another function at minimal cost and without return stack depth limitation.

this is one thing i'm not convinced is super useful about JUMPF. tail call optimization can be implemented in compilers using regular jumps (at least for functions which recurse into themselves -- for corecursive, i haven't analyzed it yet but i think it is the same).

> We need to be careful for EOF not to become a regression here that both complicates code deduplication by requiring more complex tradeoffs and by generally inflating code size.

i agree fully here. actually i think there is a strong case to be made for bringing back a single global code section a la EIP-2315. i have been told that EIP-2315 was infeasible or doesn't address certain use cases, but i don't fully understand what the issues are here.

@ekpyron

ekpyron commented Jan 17, 2024

> this is one thing i'm not convinced is super useful about JUMPF. tail call optimization can be implemented in compilers using regular jumps (at least for functions which recurse into themselves -- for corecursive, i haven't analyzed it yet but i think it is the same).

Well, that restricts you to a single function frame, so it would require to inline the entire graph of any corecursion. And tail calls are not necessarily recursive - in the end it's just the more general case of "shared cleanup blocks" for cases in which more logic is shared among functions.
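As a language-neutral illustration of the inlining point, here is a Python sketch (not EVM code, and all names are hypothetical): two mutually tail-recursive functions can only be turned into plain jumps by merging both bodies into a single frame with an explicit state variable.

```python
def is_even_corecursive(n: int) -> bool:
    """Tail-calls is_odd_corecursive; needs a new frame per step."""
    return True if n == 0 else is_odd_corecursive(n - 1)


def is_odd_corecursive(n: int) -> bool:
    """Tail-calls is_even_corecursive; needs a new frame per step."""
    return False if n == 0 else is_even_corecursive(n - 1)


def is_even_single_frame(n: int) -> bool:
    """Both bodies inlined into one loop; `state` replaces the call target."""
    state = "even"
    while True:
        if n == 0:
            # Base case of whichever function we are "inside".
            return state == "even"
        n -= 1
        state = "odd" if state == "even" else "even"


assert all(is_even_single_frame(n) == is_even_corecursive(n) for n in range(50))
```

With two functions the merge is trivial; the concern raised above is that a compiler restricted to jumps within one section would have to do this for the entire graph of functions that tail-call each other.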

In any case, just to be clear: at least for us in Solidity, assuming the weakened stack validation of the current version of the spec, our last implementation (of a previous specification that was very similar in these aspects) still resulted in a net win overall, not only in gas but also in code size, due to independent savings, e.g. of jumpdests. So I don't think there is a strong case for a radical change such as moving to a single global code section, especially since a clear split into function sections does have advantages in terms of simpler analyzability. As far as I'm concerned, what we're talking about here with JUMPF are minor tweaks to ensure the minimal amount of concessions on code deduplication compared to the current EVM. But also to be clear: if really need be, we could work without JUMPF. I'd just argue that it does remain useful despite the more relaxed stack validation.

@rakita
Contributor

rakita commented Jan 20, 2024

JUMPF is, IMO, the only EIP that is questionable; the other EIPs have a stronger purpose, and the reason for them is obvious because they solve a concrete problem. That does not seem to be the case for JUMPF.

Of course we don't want to introduce a regression, but on the other hand we don't want to include something that turns out not to be useful or that can be done in a different, maybe simpler, way.

What I would like to know is what we are optimizing for; it is not that clear to me. JUMPF acts like an ordinary function call, but if we are "knowing at validation time that a function will never return control", then that means it is forced to RETURN/REVERT/STOP. (I am concluding this from "It is particularly beneficial for small error handling helpers, that end execution with REVERT".)
Ref: EIP-6206

Few comments/questions:

Tail-call optimizations are mostly useful for recursion, at least on ordinary CPUs. How often is recursion found in Solidity/Vyper code?

Isn't JUMPF just a CALLF that does RETURN? Is code/stack validation the problem here?

@ekpyron @charles-cooper thank you for discussing this

@pdobacz
Member

pdobacz commented Jan 22, 2024

> then that means it is forced to RETURN/REVERT/STOP

It is not; it can also RETF, but since JUMPF didn't push to the return stack, it acts as if it RETFed from the JUMPF caller section. The "It is particularly beneficial" sentence only refers to the case where we JUMPF into a non-returning section.
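A toy interpreter shows the control flow described here. This is a simplified Python model with hypothetical section names that tracks only the return stack; it is not an EVM implementation.

```python
def run(program: dict, entry: str = "main") -> list:
    """Execute a toy program and return the (section, opcode) trace.

    `program` maps section names to lists of (opcode, argument) pairs.
    Only CALLF, JUMPF, RETF and STOP are modelled.
    """
    trace = []
    return_stack = []
    section, pc = entry, 0
    while True:
        op, arg = program[section][pc]
        trace.append((section, op))
        if op == "CALLF":
            return_stack.append((section, pc + 1))  # remember where to resume
            section, pc = arg, 0
        elif op == "JUMPF":
            section, pc = arg, 0  # no frame pushed: caller's frame is reused
        elif op == "RETF":
            section, pc = return_stack.pop()
        elif op == "STOP":
            return trace
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")


program = {
    "main": [("CALLF", "a"), ("STOP", None)],
    "a": [("JUMPF", "b")],
    "b": [("RETF", None)],
}

# RETF inside "b" returns straight to "main", as if "a" itself had returned.
assert run(program) == [
    ("main", "CALLF"),
    ("a", "JUMPF"),
    ("b", "RETF"),
    ("main", "STOP"),
]
```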

As an aside, I slipped an error into the megaspec's write-up of relaxed stack validation. That might have caused confusion; apologies if that was the case. See #44, which fixes the error.

@gumb0
Contributor Author

gumb0 commented Jan 22, 2024

I thought of another argument in favor of having the non-returning flag and JUMPF: without them, code after a CALLF to a non-returning function is unreachable in practice, but will be considered reachable by validation. In other words, the CALLF STOP sequence can be abused as CALLF <code> STOP, and then <code> has to be validated and can lead to validation errors, despite being unreachable in practice.

In this way the non-returning flag really makes the EOF Functions spec complete. Without it, functions that are non-returning in practice but not declared as such will cause quirks like this.
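The reachability quirk can be sketched in Python as follows. This is a deliberately simplified model: it handles straight-line code only, the set of terminating opcodes is an assumption for illustration, and the real validation algorithm is more involved.

```python
# Opcodes after which a validator stops following straight-line code
# (an assumed, simplified set for this sketch).
TERMINATING = {"STOP", "INVALID", "REVERT", "RETURN", "RETF", "JUMPF"}


def validated_prefix(code: list) -> list:
    """Return the instructions a linear validator treats as reachable.

    CALLF is assumed to fall through, because without a non-returning
    flag the validator cannot know that the callee never returns.
    """
    reachable = []
    for op, arg in code:
        reachable.append((op, arg))
        if op in TERMINATING:
            break
    return reachable


# Hypothetical helper section "fail" that always ends in REVERT.
with_callf = [("CALLF", "fail"), ("ADD", None), ("STOP", None)]
with_jumpf = [("JUMPF", "fail"), ("ADD", None), ("STOP", None)]

# After CALLF the validator still has to check ADD, although it never runs.
assert validated_prefix(with_callf) == with_callf

# JUMPF is terminating, so nothing after it counts as reachable.
assert validated_prefix(with_jumpf) == [("JUMPF", "fail")]
```

In this model the ADD after CALLF would be checked against a stack that may not have enough items for it, which is the kind of spurious validation error described above.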

@charles-cooper
Contributor

> Well, that restricts you to a single function frame, so it would require to inline the entire graph of any corecursion.

I mean this is actually kind of a strong argument for a global code section, right?

@charles-cooper
Contributor

> how much are recursions found inside solidity/vyper code?

Vyper actually disallows recursion entirely. Even if it allowed recursion, I think generally getting to a recursion depth where tail call optimization is needed is a code smell.

@axic
Member

axic commented Jan 30, 2024

On the EOF Implementers Call #31 we agreed to keep JUMPF and non-returning functions in the specification for now. They can easily be dropped later, should we decide so closer to deployment, but re-adding them would not be possible.

We do want to get actual measurements from the Solidity and/or Vyper implementation in the upcoming months to validate this question, and ultimately keep or remove this feature.

Given the above discussion the call felt there's a slight leaning towards JUMPF being useful, but it is to be validated via actual compiler feedback.

@axic axic removed their assignment Jan 30, 2024