feat(interop): L2toL2CrossDomainMessenger autorelayer #274

Open: wants to merge 18 commits into base: main

tremarkley
Contributor

- `Claim`: Emitted by `L2ToL2CrossDomainGasTank` when a refund is claimed.
- `Deposit`: Emitted by `L2ToL2CrossDomainGasTank` when funds are deposited.

### Nonce management

@tremarkley does this mean that we can only scale with the number of EOAs that we have funded? So theoretically only one transaction per block?

Contributor Author

There will be multiple EOAs per chain, operating in parallel. Each EOA can send at most one transaction per block because it waits for the tx receipt before processing the next message, so the maximum number of relayer txs per block equals the number of EOAs we have on that chain. For example, with 10 EOAs on OP Mainnet, we could have up to 10 relayer txs per block.
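
To make the throughput bound concrete, here is a minimal sketch of the model described in this comment. The function name `simulateRelay` and its parameters are hypothetical, and it assumes every submitted transaction is included in the next block, which will not always hold in practice.

```typescript
// Hypothetical model: each EOA relays at most one message per block,
// because it waits for the tx receipt before taking the next message.
function simulateRelay(pendingMessages: number, eoaCount: number): number[] {
  if (eoaCount <= 0) {
    throw new Error("at least one funded EOA is required");
  }
  const txsPerBlock: number[] = [];
  let remaining = pendingMessages;
  while (remaining > 0) {
    // Every EOA picks up one message; assumption: all receipts land in this block.
    const relayed = Math.min(remaining, eoaCount);
    txsPerBlock.push(relayed);
    remaining -= relayed;
  }
  return txsPerBlock;
}

// 25 queued messages with 10 EOAs need 3 blocks to drain.
const relayBlocks = simulateRelay(25, 10);
console.log(relayBlocks);
```

Under these assumptions, draining a backlog of M messages with N EOAs takes ceil(M / N) blocks, which is why the number of funded EOAs is the scaling knob.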

Contributor

I'm wondering if we could go for dynamic concurrency: use a Prometheus-driven autoscaler that spawns additional workers (I assume one EOA per worker) when the queue age exceeds some threshold, maybe 2x block time. Then we wouldn't need a fixed, high number of workers per chain but could still avoid the EOA bottleneck.

If it's not that big a deal to just fund enough wallets on each chain, then this may be overkill. What do you think?
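
A rough sketch of the scaling decision suggested here, assuming one EOA per worker. All names (`ScalingInput`, `desiredWorkers`) are hypothetical, the queue age would come from a metrics source such as Prometheus, and the scale-down rule is an added assumption that the comment does not specify.

```typescript
// Hypothetical inputs for one scaling decision on a single chain.
interface ScalingInput {
  currentWorkers: number;
  oldestQueueAgeMs: number; // age of the oldest unrelayed message
  blockTimeMs: number;
  minWorkers: number;
  maxWorkers: number; // bounded by the number of funded EOAs
}

function desiredWorkers(input: ScalingInput): number {
  // Threshold from the comment: scale up when queue age exceeds 2x block time.
  const scaleUpThresholdMs = 2 * input.blockTimeMs;
  let target = input.currentWorkers;
  if (input.oldestQueueAgeMs > scaleUpThresholdMs) {
    target += 1;
  } else if (input.oldestQueueAgeMs === 0) {
    // Assumption: release one worker when the queue is empty.
    target -= 1;
  }
  // Never go below the floor or above the funded-EOA ceiling.
  return Math.max(input.minWorkers, Math.min(input.maxWorkers, target));
}

const scaleUp = desiredWorkers({
  currentWorkers: 3,
  oldestQueueAgeMs: 5000,
  blockTimeMs: 2000,
  minWorkers: 1,
  maxWorkers: 10,
});
console.log(scaleUp);
```

Even with autoscaling, `maxWorkers` is still capped by how many wallets are funded, so this reduces idle workers rather than removing the funding requirement.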

Contributor Author

Oh nice, this could help reduce the complexity of keeping wallets funded. I'll explore this a bit, but I do think the Drippie approach is pretty simple for now. If we end up having to scale to a number of EOAs that is difficult to manage even with Drippie, then we should explore a more dynamic approach.

### Gas Price
**_Open question: do we need to allow a max gas price to be specified per message?_**

**_Open question: are we okay with the relayer paying the priority fee and not being refunded for it?_**

Can we include this in the API design for ad hoc requests? It feels like we would want to set it high so that transactions always get included immediately.
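
One possible shape for a per-request cap, purely as a sketch of the open question. The `AdHocRelayRequest` type, its `maxGasPriceGwei` field, and `canRelayNow` are hypothetical and not part of any existing relayer API. The only fact relied on is that, under EIP-1559, the price paid per gas is the base fee plus the priority fee. Values are plain numbers in gwei for readability; a real service would more likely use integer wei.

```typescript
// Hypothetical ad hoc relay request with an optional gas price cap.
interface AdHocRelayRequest {
  messageHash: string;
  maxGasPriceGwei?: number; // omitted means "no cap"
}

// Returns true when the current price per gas is within the caller's cap.
function canRelayNow(
  request: AdHocRelayRequest,
  baseFeeGwei: number,
  priorityFeeGwei: number
): boolean {
  if (request.maxGasPriceGwei === undefined) {
    return true;
  }
  // EIP-1559: price paid per gas = base fee + priority fee.
  return baseFeeGwei + priorityFeeGwei <= request.maxGasPriceGwei;
}

const cappedRequest: AdHocRelayRequest = { messageHash: "0x01", maxGasPriceGwei: 5 };
console.log(canRelayNow(cappedRequest, 3, 1));
```

Setting the cap high, as suggested in the comment, makes the check pass in almost all conditions, at the cost of paying more during fee spikes.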

@tremarkley changed the title from "L2toL2CrossDomainMessenger Autorelayer: Design Doc" to "feat(interop): L2toL2CrossDomainMessenger autorelayer" on Apr 28, 2025

### Testing
Since the relayer is a production-critical service, we need a high degree of confidence that changes pushed to it do not break the service. As part of this effort, we will need to run integration tests that spin up the relayer against [kurtosis interop devnets](https://github.com/ethereum-optimism/optimism/tree/develop/kurtosis-devnet).

## Failure Mode Analysis
Contributor

@fainashalts May 1, 2025

Some of this is covered above, like how stuck transactions are handled, but I wanted to share my list of what I think should be covered in the FMA in case it's scheduled while I'm out:

  • Redis outage
  • Postgres lag
  • stuck transactions
  • funded wallet is empty
  • chain reorg (we discussed earlier that Ponder has a nice way to handle this; we can probably use its reorg handling)
  • chain halt
  • making sure keys are stored properly and that we're rotating them

#### Database Resource Usage
The most significant source of resource usage for the database will be the indexing of all `SentMessage` and `RelayMessage` events from the `L2toL2CrossDomainMessenger`. Initially we will store all of these events without a retention period. However, as resource usage increases, we can make optimizations to remove messages that have expired or are no longer needed because they have been relayed.
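
The retention optimization described above could look roughly like the following filter. This is only a sketch: the `IndexedMessage` shape, including the `relayed` flag and the `expiresAt` field, is hypothetical and does not reflect the actual indexer schema.

```typescript
// Hypothetical row shape for an indexed message.
interface IndexedMessage {
  messageHash: string;
  relayed: boolean; // a matching relay event has been indexed
  expiresAt: number | null; // unix seconds, null if the message never expires
}

// Keeps only messages that still need to be relayed and have not expired.
function pruneMessages(messages: IndexedMessage[], nowSeconds: number): IndexedMessage[] {
  return messages.filter(function (message) {
    const expired = message.expiresAt !== null && message.expiresAt <= nowSeconds;
    return !message.relayed && !expired;
  });
}

const retained = pruneMessages(
  [
    { messageHash: "a", relayed: true, expiresAt: null },
    { messageHash: "b", relayed: false, expiresAt: 100 },
    { messageHash: "c", relayed: false, expiresAt: 300 },
    { messageHash: "d", relayed: false, expiresAt: null },
  ],
  200
);
console.log(retained.map(function (message) { return message.messageHash; }));
```

In a real deployment this would more likely be a periodic delete against the database than an in-memory filter; the predicate is the part the design text describes.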

### Transaction submission
Contributor

Could we have Ponder stream events directly into a Redis/BullMQ queue via its event hooks instead of persisting everything to Postgres first and then polling? I think that might cut relay latency and reduce write load on the DB.

Contributor Author

@tremarkley May 1, 2025

I like that. We should add event hooks to Ponder that put new events directly on the queue. If the balance in the fee vault can cover a message, the message can be relayed immediately, which gives optimal latency. Persisting is still useful when a message can't be relayed at emission because the fee vault balance is too low: once the fee vault receives a deposit later, the message can then be relayed.
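
The flow described in this comment can be sketched with in-memory stand-ins. Everything here is hypothetical: `RelayIntake`, `onSentMessage`, and `onDeposit` are illustrative names rather than Ponder or BullMQ APIs, the arrays stand in for the queue and the persisted table, and reserving balance at enqueue time is an added assumption.

```typescript
// Hypothetical message shape; cost is in arbitrary integer units.
interface SentMessage {
  id: string;
  sender: string;
  estimatedCost: number;
}

class RelayIntake {
  queue: SentMessage[] = []; // stand-in for the Redis/BullMQ queue
  pending: SentMessage[] = []; // stand-in for persisted, not yet fundable messages
  private balances: Record<string, number> = {};

  private tryEnqueue(message: SentMessage): boolean {
    const balance = this.balances[message.sender] || 0;
    if (balance < message.estimatedCost) {
      return false;
    }
    // Assumption: reserve the cost so one balance is not spent twice.
    this.balances[message.sender] = balance - message.estimatedCost;
    this.queue.push(message);
    return true;
  }

  // Called from the event hook when a new message is indexed.
  onSentMessage(message: SentMessage): void {
    if (!this.tryEnqueue(message)) {
      this.pending.push(message);
    }
  }

  // Called when a deposit is indexed; retries messages that were waiting.
  onDeposit(sender: string, amount: number): void {
    this.balances[sender] = (this.balances[sender] || 0) + amount;
    const waiting = this.pending;
    this.pending = [];
    for (const message of waiting) {
      if (!this.tryEnqueue(message)) {
        this.pending.push(message);
      }
    }
  }
}

const intake = new RelayIntake();
intake.onDeposit("0xabc", 100);
intake.onSentMessage({ id: "m1", sender: "0xabc", estimatedCost: 60 }); // queued immediately
intake.onSentMessage({ id: "m2", sender: "0xabc", estimatedCost: 60 }); // persisted, only 40 left
intake.onDeposit("0xabc", 50); // 90 available, so m2 is queued now
console.log(intake.queue.length, intake.pending.length);
```

The fast path skips the database poll entirely, while the pending list preserves the case where funding arrives after the message was emitted.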

Development

Successfully merging this pull request may close these issues.

Relayer design doc
3 participants