SIMD 0195 - tpu vote using QUIC #195
Open: lijunwangs wants to merge 6 commits into solana-foundation:main from lijunwangs:support_tpu_vote_with_quic
Changes from 5 commits
Commits:
- 4a98389 Added SIMD for tpu vote using QUIC (lijunwangs)
- f4378d5 updated for md formatting (lijunwangs)
- 30d05b9 updated for md formatting (lijunwangs)
- c78b399 updated for md formatting (lijunwangs)
- 8fe3640 change simd number (lijunwangs)
- 3928ca9 changes some wording and removed implementation specifics (lijunwangs)

@@ -0,0 +1,112 @@
---
simd: '0195'
title: TPU Vote using QUIC
authors:
  - Lijun Wang <[email protected]>
category: Standard
type: Core
status: Review
created: 2024-11-13
development:
  - Anza - WIP
  - Firedancer - Not started
---

## Summary

Use QUIC to transport TPU votes among Solana validators. This requires
supporting the receipt of QUIC-based TPU vote packets on the server side and
the sending of QUIC-based TPU vote packets on the client side.

## Motivation

Because timely vote credits are awarded to validators, validators may be
incentivized to increase their TPU vote traffic to ensure their votes are
received in a timely manner. This could cause congestion and reduce overall
TPU vote processing effectiveness. The current UDP-based TPU vote path does
not have any flow control mechanism.

We propose applying the pattern used for TPU transaction processing to TPU
vote processing: reuse the flow control mechanisms already developed there,
including built-in QUIC protocol-level flow control and application-level rate
limiting on connections and packets.

## Alternatives Considered

There is no readily available alternative to QUIC that addresses requirements
such as security (reliability when applying QOS), low latency, and flow
control. TLS over TCP could address security and flow control, but it raises
concerns about latency and head-of-line blocking. We could also build our own
rate limiting mechanism directly on UDP, but this is non-trivial and cannot
solve the security problem without also relying on some form of cryptographic
handshake.

## New Terminology

None

## Detailed Design

On the server side, the validator binds to a new QUIC endpoint. The
corresponding port is published to the network in the ContactInfo via Gossip.
The client side uses the TPU vote QUIC port published by the server to connect
to it.

TPU vote uses the same QUIC implementation as regular transaction transport.
The client and server each use their validator's identity key to sign the
certificate, which is used to validate the validator's identity. This matters
especially on the server side, which provides stake-weighted QOS by checking
the stake associated with the client's Pubkey.

Once a QUIC connection is established, the client can send vote transactions
using QUIC unidirectional (UNI) streams. In this design, each stream carries a
single vote transaction, after which the stream is closed.

The server only accepts connections from staked nodes, since only those nodes
can vote. Connections from unstaked nodes are rejected with the `disallowed`
code.

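The admission rule above can be sketched as follows. This is an illustrative Python sketch, not the Anza or Firedancer implementation: `admit_vote_connection`, `Admission`, and the `staked_nodes` map are hypothetical names, and the string `"disallowed"` stands in for whatever close code an implementation actually sends.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Placeholder for the close code; the proposal only names it `disallowed`.
DISALLOWED = "disallowed"


@dataclass(frozen=True)
class Admission:
    accepted: bool
    stake: int = 0
    close_code: Optional[str] = None


def admit_vote_connection(client_pubkey: str,
                          staked_nodes: Mapping[str, int]) -> Admission:
    """Accept a peer only if its identity pubkey carries non-zero stake.

    `client_pubkey` is assumed to have been taken from the peer's
    certificate, which is signed with the validator identity key.
    """
    stake = staked_nodes.get(client_pubkey, 0)
    if stake == 0:
        return Admission(accepted=False, close_code=DISALLOWED)
    return Admission(accepted=True, stake=stake)


# Hypothetical stake table: pubkey -> stake.
staked_nodes = {"validator-A": 5_000, "validator-B": 0}
print(admit_vote_connection("validator-A", staked_nodes))
print(admit_vote_connection("unknown-node", staked_nodes))
```

Returning the stake together with the decision lets the caller feed it straight into the stake-weighted limits without a second lookup.
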
The following QOS mechanisms are employed:

* Connection rate limiting across all clients
* Connection rate limiting per IP address
* Total concurrent connections from all clients -- this is set to 2500
* Max concurrent connections from a client Pubkey -- this is set to 1 for votes
* Max concurrent streams per connection -- this is allocated based on the ratio
  of the validator's stake to the total stake of the network
* Maximum number of vote transactions per unit time, which is also stake
  weighted

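Two of the mechanisms listed above, stake-weighted stream allocation and rate limiting, can be sketched in a few lines. This is a hedged illustration only: `max_streams_for_stake`, `TokenBucket`, the `stream_budget` value, and the floor of one stream for any staked peer are assumptions made for the example, not values or APIs defined by this proposal.

```python
def max_streams_for_stake(stake: int, total_stake: int,
                          stream_budget: int, floor: int = 1) -> int:
    """Split `stream_budget` concurrent streams in proportion to stake.

    Unstaked peers get 0. Staked peers get at least `floor` streams so that
    a very small stake can still deliver a vote (an assumption of this sketch).
    """
    if stake <= 0 or total_stake <= 0:
        return 0
    return max(floor, stream_budget * stake // total_stake)


class TokenBucket:
    """Token bucket usable for connection or vote rate limiting.

    The caller passes the current time in seconds explicitly so the example
    stays deterministic; real code would pass time.monotonic().
    """

    def __init__(self, rate_per_sec: float, burst: float, now: float = 0.0) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = burst
        self.last = now

    def try_acquire(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical numbers: a budget of 2000 streams over 1000 units of total stake.
print(max_streams_for_stake(10, 1000, 2000))

# One bucket per source IP gives per-IP connection rate limiting.
per_ip = {}
bucket = per_ip.setdefault("203.0.113.7", TokenBucket(rate_per_sec=2, burst=2))
print([bucket.try_acquire(now=0.0) for _ in range(3)])
```

The same `TokenBucket` can back the per-validator vote rate limit by scaling `rate_per_sec` with the peer's stake share.
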
While the server is reading a stream and its chunks, it may time out and close
the stream if it does not receive the data within the configured timeout
window (2s).

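A minimal sketch of the stream timeout, using an `asyncio.Queue` as a stand-in for a QUIC receive stream. The text does not say whether the window applies per chunk or per stream; this sketch assumes it applies to each wait for the next chunk. `read_vote_stream` and the `None` end-of-stream marker are hypothetical.

```python
import asyncio

STREAM_TIMEOUT_SECS = 2.0  # the window quoted in the text


async def read_vote_stream(chunks: asyncio.Queue,
                           timeout: float = STREAM_TIMEOUT_SECS):
    """Assemble one vote transaction from a stream of chunks.

    Returns the bytes once the end-of-stream marker (None) arrives, or None
    if a chunk does not arrive within `timeout`, in which case the caller
    is expected to close the stream.
    """
    data = bytearray()
    while True:
        try:
            chunk = await asyncio.wait_for(chunks.get(), timeout)
        except asyncio.TimeoutError:
            return None
        if chunk is None:
            return bytes(data)
        data.extend(chunk)


async def demo():
    # A stream that finishes normally.
    complete = asyncio.Queue()
    for item in (b"vote-", b"bytes", None):
        complete.put_nowait(item)
    # A stream whose sender stalls after the first chunk.
    stalled = asyncio.Queue()
    stalled.put_nowait(b"partial")
    # A short timeout keeps the demo fast.
    finished = await read_vote_stream(complete, timeout=0.05)
    timed_out = await read_vote_stream(stalled, timeout=0.05)
    return finished, timed_out


print(asyncio.run(demo()))
```
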
The validator also uses gossip to pull votes from other validators. This
proposal does not change the transport for that path, which remains UDP-based.
Because gossip-based votes are pulled by the validator, the concern about
increased vote traffic is lessened.

## Impact

Unlike UDP, QUIC is connection-based, so there is extra overhead to establish
a connection before sending a vote. To minimize this, the client side can
employ connection caching and a cache warmer driven by the leader schedule.

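The caching and warming idea can be sketched as a small LRU cache keyed by the leader's TPU vote address. This is an illustration under stated assumptions: `ConnectionCache`, `warm`, the `connect` callback, the capacity, and the addresses and port are all hypothetical, and a real client would hold QUIC connection handles rather than strings.

```python
from collections import OrderedDict
from typing import Callable, Iterable, List, Tuple

Addr = Tuple[str, int]


class ConnectionCache:
    """Tiny LRU cache of open connections keyed by TPU vote address."""

    def __init__(self, connect: Callable[[Addr], object], capacity: int = 8) -> None:
        self._connect = connect
        self._capacity = capacity
        self._conns = OrderedDict()

    def get(self, addr: Addr) -> object:
        conn = self._conns.get(addr)
        if conn is None:
            conn = self._connect(addr)  # pays the handshake cost once
            self._conns[addr] = conn
            if len(self._conns) > self._capacity:
                self._conns.popitem(last=False)  # evict least recently used
        else:
            self._conns.move_to_end(addr)  # mark as most recently used
        return conn

    def warm(self, upcoming_leaders: Iterable[Addr]) -> None:
        """Connect ahead of time to leaders taken from the leader schedule."""
        for addr in upcoming_leaders:
            self.get(addr)


handshakes: List[Addr] = []


def fake_connect(addr: Addr) -> str:
    # Stand-in for a QUIC handshake; records each one so they can be counted.
    handshakes.append(addr)
    return f"conn-to-{addr[0]}:{addr[1]}"


cache = ConnectionCache(fake_connect, capacity=2)
cache.warm([("10.0.0.1", 8009), ("10.0.0.2", 8009)])  # before their slots
cache.get(("10.0.0.1", 8009))  # vote time: served from cache, no handshake
print(len(handshakes))
```

Warming moves the handshake off the critical path: by the time a vote has to be sent to a leader, the connection already exists.
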
## Security Considerations

There are no net new security vulnerabilities, as QUIC for TPU transactions is
already in place. Similar DoS attacks can be targeted against the new QUIC
port used by TPU vote. Connection rate limiting is one tool to fend off such
attacks.

## Backwards Compatibility

Care needs to be taken to ensure a smooth transition from UDP to QUIC for TPU
votes.

Phase 1. The server side supports both UDP and QUIC for TPU votes. No clients
send TPU votes via QUIC.

Phase 2. After all staked nodes are upgraded to support receiving TPU votes
via QUIC, restart the validators with configuration to send TPU votes via
QUIC.

Phase 3. Turn off the UDP-based TPU vote listener on the server side once all
staked nodes complete phase 2.

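The phase ordering can be checked with a small compatibility model. This is a sketch only: `Phase`, `server_listeners`, `client_transport`, and `can_deliver` are hypothetical names, and the model covers only nodes that have already been upgraded to at least phase 1.

```python
from enum import Enum


class Phase(Enum):
    ONE = 1    # servers accept UDP and QUIC; clients still send over UDP
    TWO = 2    # servers accept UDP and QUIC; clients send over QUIC
    THREE = 3  # servers accept QUIC only


def server_listeners(phase: Phase) -> set:
    """Transports a node in `phase` accepts TPU votes on."""
    return {"quic"} if phase is Phase.THREE else {"udp", "quic"}


def client_transport(phase: Phase) -> str:
    """Transport a node in `phase` uses to send TPU votes."""
    return "udp" if phase is Phase.ONE else "quic"


def can_deliver(client_phase: Phase, server_phase: Phase) -> bool:
    return client_transport(client_phase) in server_listeners(server_phase)


for client in Phase:
    for server in Phase:
        print(f"client phase {client.value} -> server phase {server.value}: "
              f"{can_deliver(client, server)}")
```

The only failing pair in this model is a phase 1 client sending to a phase 3 server, which is why phase 3 waits until all staked nodes have completed phase 2.
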
Firedancer won't enforce any of these limits server-side.
These seem more like implementation-specific behavior, so I suggest removing them from the SIMD.
Removed specific numbers