feat(interop): L2toL2CrossDomainMessenger autorelayer #274
- `Claim`: Emitted by `L2ToL2CrossDomainGasTank` when a refund is claimed.
- `Deposit`: Emitted by `L2ToL2CrossDomainGasTank` when funds are deposited.

### Nonce management
@tremarkley does this mean that we can only scale with the number of EOAs that we have funded? So theoretically only one transaction per block?
There will be multiple EOAs per chain, operating in parallel. Each EOA can do at most one transaction per block, since it waits for the tx receipt before processing the next message. The number of txs per block is therefore at most the number of EOAs we have on that chain. For example, with 10 EOAs on OP Mainnet we could have up to 10 relayer txs per block.
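To make the throughput bound concrete, here is a minimal best-case sketch in TypeScript. It assumes every submitted tx lands in the next block and that each EOA has exactly one tx in flight at a time; `scheduleByBlock` and the message IDs are hypothetical names for illustration, not part of the relayer.

```typescript
// Best-case model of the "one in-flight tx per EOA" rule described above.
// Each block can carry at most one relay tx per EOA, so queued messages are
// drained in chunks of `eoaCount`. Hypothetical helper, not relayer code.
function scheduleByBlock(messageIds: string[], eoaCount: number): string[][] {
  if (eoaCount < 1 || Math.floor(eoaCount) !== eoaCount) {
    throw new Error("eoaCount must be a positive integer");
  }
  const blocks: string[][] = [];
  for (let i = 0; i < messageIds.length; i += eoaCount) {
    // One message per EOA for this block; the last chunk may be smaller.
    blocks.push(messageIds.slice(i, i + eoaCount));
  }
  return blocks;
}
```

Under these assumptions, 25 queued messages with 10 EOAs drain over three blocks (10, 10, then 5 txs). Slow inclusion or stuck transactions would only lower the per-block count.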
I'm wondering if we could go for dynamic concurrency: use a Prometheus-driven autoscaler that spawns additional workers (I assume one EOA per worker) when the queue age exceeds some threshold, maybe 2x block time. Then we don't need a fixed, high number of workers per chain but can still avoid the EOA bottleneck.
If it's not that big a deal to just fund enough wallets on each chain then this may be overkill. What do you think?
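A rough sketch of the scaling rule, in TypeScript. The scale-up condition (oldest queued message older than 2x block time) comes from the suggestion above; everything else is an assumption for illustration: the `ScalingInput` shape, stepping by one worker at a time, the min/max bounds, and scaling down when the queue is empty. How the queue age would be read from Prometheus is left out.

```typescript
// Hypothetical inputs to one autoscaling decision; names are illustrative only.
interface ScalingInput {
  oldestQueuedAgeSec: number; // age of the oldest unrelayed message (0 if queue is empty)
  blockTimeSec: number;       // block time of the destination chain
  currentWorkers: number;     // running workers, assuming one funded EOA per worker
  minWorkers: number;         // assumed lower bound
  maxWorkers: number;         // assumed upper bound, e.g. number of funded EOAs
}

// Returns the worker count to converge to after this evaluation.
function desiredWorkers(input: ScalingInput): number {
  const thresholdSec = 2 * input.blockTimeSec;
  let target = input.currentWorkers;
  if (input.oldestQueuedAgeSec > thresholdSec) {
    target = input.currentWorkers + 1; // queue is falling behind: add a worker
  } else if (input.oldestQueuedAgeSec === 0) {
    target = input.currentWorkers - 1; // assumption: shrink when the queue is empty
  }
  // Clamp so we never exceed the funded EOAs or drop below the floor.
  return Math.max(input.minWorkers, Math.min(input.maxWorkers, target));
}
```

Stepping one worker per evaluation keeps the rule easy to reason about; a production autoscaler would likely also want a cooldown to avoid flapping.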
Oh nice, this could help reduce the complexity of keeping wallets funded. I'll explore this a bit, but I do think the drippie approach is pretty simple for now. If we have to scale to a number of EOAs that is difficult to manage even with drippie, then we should explore a more dynamic approach.
### Gas Price
**_Open question: do we need to allow a max gas price to be specified per message?_**

**_Open question: are we okay with the relayer paying the priority fee and not being refunded for it?_**
Can we include this in the API design for ad hoc requests? Feels like we would want to set it high so that transactions always get filled immediately.
### Testing
Since the relayer is a production critical service, it is important that we have a high degree of confidence that the changes pushed to it do not break the service. As a part of this effort we will need to run integration tests that spin up the relayer against [kurtosis interop devnets](https://github.com/ethereum-optimism/optimism/tree/develop/kurtosis-devnet).

## Failure Mode Analysis
Some of this is covered above, like how stuck transactions are handled. But I just wanted to share my list of what I think should be covered in the FMA in case it's scheduled while I'm out:
- Redis outage
- Postgres lag
- stuck transactions
- funded wallet is empty
- chain reorg (we discussed earlier that Ponder has a nice way to handle this; we can probably use its reorg handling)
- chain halt
- making sure keys are stored properly and we're rotating them
#### Database Resource Usage
The most significant source of resource usage for the database will be the indexing of all `SentMessage` and `RelayMessage` events from the `L2toL2CrossDomainMessenger`. Initially we will store all of these events without a retention period. However, as resource usage increases we can make optimizations to remove messages that have expired or are no longer needed because they have been relayed.

### Transaction submission
Could we have Ponder stream events directly into a Redis/BullMQ queue via its event hooks instead of persisting everything to Postgres first and then polling? I think that might cut relay latency and reduce write load on the DB.
I like that. We should add event hooks to Ponder to put new events directly on the queue. If the balance in the fee vault can cover a message, the message can be relayed immediately, which gives the best latency. Persisting is still useful when a message can't be relayed at emission because the fee vault balance is too low; once the fee vault receives a deposit later, the message can then be relayed.
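A minimal sketch of that decision, in TypeScript. All names here (`PendingMessage`, `decideOnEmit`, `selectRelayableAfterDeposit`) are hypothetical, and amounts are plain numbers in gwei only to keep the example short; a real implementation would presumably use `bigint` wei and read the balance from the fee vault contract.

```typescript
// Hypothetical shape of a message awaiting relay; not taken from the relayer code.
interface PendingMessage {
  id: string;
  estimatedCostGwei: number; // assumed up-front estimate of the relay cost
}

type RelayDecision = "relay_now" | "persist_pending";

// Fast path, called from the event hook when a message is emitted:
// relay immediately if the fee vault balance already covers the cost.
function decideOnEmit(vaultBalanceGwei: number, msg: PendingMessage): RelayDecision {
  return vaultBalanceGwei >= msg.estimatedCostGwei ? "relay_now" : "persist_pending";
}

// Slow path, called after a deposit: walk persisted messages oldest-first and
// pick those the new balance can cover, deducting each cost as we go.
function selectRelayableAfterDeposit(
  vaultBalanceGwei: number,
  pending: PendingMessage[],
): PendingMessage[] {
  const relayable: PendingMessage[] = [];
  let remaining = vaultBalanceGwei;
  for (const msg of pending) {
    if (msg.estimatedCostGwei <= remaining) {
      relayable.push(msg);
      remaining -= msg.estimatedCostGwei;
    }
  }
  return relayable;
}
```

Note that the slow path skips a message it cannot afford and keeps going, so a cheap later message can be relayed ahead of an expensive earlier one. Whether that ordering is acceptable is a design choice this sketch does not settle.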
Closes ethereum-optimism/ecosystem#750