[Bridge] Retry on finalized transaction not observed (MystenLabs#19882)
## Description 

In some cases, the client observes a finalized transaction before most
bridge authorities do. Today the code returns an error without retrying,
so any validator that has not yet seen the transaction is skipped during
signature aggregation. If bridge validators use slow Ethereum fullnode
providers, for example, this can affect a large share of the committee.
The result is either a failed signature aggregation attempt (quorum is
not reached) or higher gas costs (quorum is reached with many
lower-staked signatures instead of fewer higher-staked ones).

The solution here is to add retry logic in the map function.
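
As a rough illustration of the retry logic described above, here is a minimal synchronous sketch. It is not the code in this commit: `SketchError` and `retry_until_finalized` are hypothetical names, and `std::thread::sleep` stands in for the async `tokio::time::sleep` so the example runs with the standard library alone.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

// Hypothetical stand-in for the bridge error type (assumption, not the real enum).
#[derive(Debug, PartialEq)]
enum SketchError {
    TxNotFinalized,
    Timeout(String),
    Other(String),
}

/// Calls `request` until it succeeds, fails with a non-retryable error,
/// or `timeout` has elapsed. Only `TxNotFinalized` is treated as retryable.
fn retry_until_finalized<T>(
    mut request: impl FnMut() -> Result<T, SketchError>,
    timeout: Duration,
    retry_interval: Duration,
) -> Result<T, SketchError> {
    let start = Instant::now();
    while start.elapsed() < timeout {
        match request() {
            Ok(value) => return Ok(value),
            // Retryable: the authority has not observed finality yet.
            Err(SketchError::TxNotFinalized) => sleep(retry_interval),
            // Non-retryable: give up immediately.
            Err(e) => return Err(e),
        }
    }
    Err(SketchError::Timeout(format!(
        "transaction not observed as finalized after {:?}",
        timeout
    )))
}

fn main() {
    // The fake authority reports "not finalized" twice, then signs.
    let mut calls = 0;
    let signed = retry_until_finalized(
        || {
            calls += 1;
            if calls < 3 {
                Err(SketchError::TxNotFinalized)
            } else {
                Ok("signature")
            }
        },
        Duration::from_millis(500),
        Duration::from_millis(5),
    );
    println!("{:?} after {} calls", signed, calls);

    // Any other error is returned on the first attempt.
    let failed: Result<(), _> = retry_until_finalized(
        || Err(SketchError::Other("bad request".to_string())),
        Duration::from_millis(500),
        Duration::from_millis(5),
    );
    println!("{:?}", failed);
}
```

Bounding the loop by elapsed time rather than by attempt count keeps the per-authority retry budget aligned with the overall aggregation timeout.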

## Test plan 

Will run on a testnet node and see what happens.

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:
williampsmith authored Oct 18, 2024
1 parent 799591b commit 1f30f8c
Showing 2 changed files with 37 additions and 10 deletions.
32 changes: 27 additions & 5 deletions crates/sui-bridge/src/client/bridge_authority_aggregator.rs
@@ -25,8 +25,9 @@ use sui_types::committee::StakeUnit;
 use sui_types::committee::TOTAL_VOTING_POWER;
 use tracing::{error, info, warn};

-const TOTAL_TIMEOUT_MS: u64 = 5000;
-const PREFETCH_TIMEOUT_MS: u64 = 1500;
+const TOTAL_TIMEOUT_MS: u64 = 5_000;
+const PREFETCH_TIMEOUT_MS: u64 = 1_500;
+const RETRY_INTERVAL_MS: u64 = 500;

 pub struct BridgeAuthorityAggregator {
     pub committee: Arc<BridgeCommittee>,
@@ -249,8 +250,29 @@ async fn request_sign_bridge_action_into_certification(
         clients,
         preference,
         state,
-        |_name, client| {
-            Box::pin(async move { client.request_sign_bridge_action(action.clone()).await })
+        |name, client| {
+            Box::pin(async move {
+                let start = std::time::Instant::now();
+                let timeout = Duration::from_millis(TOTAL_TIMEOUT_MS);
+                let retry_interval = Duration::from_millis(RETRY_INTERVAL_MS);
+                while start.elapsed() < timeout {
+                    match client.request_sign_bridge_action(action.clone()).await {
+                        Ok(result) => {
+                            return Ok(result);
+                        }
+                        // retryable errors
+                        Err(BridgeError::TxNotFinalized) => {
+                            warn!("Bridge authority {} observing transaction not yet finalized, retrying in {:?}", name.concise(), retry_interval);
+                            tokio::time::sleep(retry_interval).await;
+                        }
+                        // non-retryable errors
+                        Err(e) => {
+                            return Err(e);
+                        }
+                    }
+                }
+                Err(BridgeError::TransientProviderError(format!("Bridge authority {} did not observe finalized transaction after {:?}", name.concise(), timeout)))
+            })
         },
         |mut state, name, stake, result| {
             Box::pin(async move {
@@ -293,7 +315,7 @@ async fn request_sign_bridge_action_into_certification(
                 }
             })
         },
-        Duration::from_secs(TOTAL_TIMEOUT_MS),
+        Duration::from_millis(TOTAL_TIMEOUT_MS),
     )
     .await
     .map_err(|state| {
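
The last hunk of this file also changes the outer timeout from `Duration::from_secs(TOTAL_TIMEOUT_MS)` to `Duration::from_millis(TOTAL_TIMEOUT_MS)`. The standalone sketch below (not part of the commit) shows why the constructor matters when a constant is named in milliseconds: the two calls differ by a factor of 1000.

```rust
use std::time::Duration;

// Same value as the constant in the diff; its name says the unit is milliseconds.
const TOTAL_TIMEOUT_MS: u64 = 5_000;

fn main() {
    // Interpreting the millisecond constant as seconds gives 5000 seconds.
    let as_secs = Duration::from_secs(TOTAL_TIMEOUT_MS);
    // Interpreting it as milliseconds gives the intended 5 seconds.
    let as_millis = Duration::from_millis(TOTAL_TIMEOUT_MS);

    assert_eq!(as_secs.as_secs(), 5_000);
    assert_eq!(as_millis.as_secs(), 5);
    println!("from_secs: {:?}, from_millis: {:?}", as_secs, as_millis);
}
```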
15 changes: 10 additions & 5 deletions crates/sui-bridge/src/client/bridge_client.rs
@@ -207,11 +207,16 @@ impl BridgeClient {
             .await?;
         if !resp.status().is_success() {
             let error_status = format!("{:?}", resp.error_for_status_ref());
-            return Err(BridgeError::RestAPIError(format!(
-                "request_sign_bridge_action failed with status {:?}: {:?}",
-                error_status,
-                resp.text().await?
-            )));
+            let resp_text = resp.text().await?;
+            return match resp_text {
+                text if text.contains(&format!("{:?}", BridgeError::TxNotFinalized)) => {
+                    Err(BridgeError::TxNotFinalized)
+                }
+                _ => Err(BridgeError::RestAPIError(format!(
+                    "request_sign_bridge_action failed with status {:?}: {:?}",
+                    error_status, resp_text
+                ))),
+            };
         }
         let signed_bridge_action = resp.json().await?;
         verify_signed_bridge_action(
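
The `bridge_client.rs` change recovers a typed error from the response body by searching for the `Debug` rendering of `BridgeError::TxNotFinalized`. A minimal standalone sketch of that idea follows; `SketchError` and `classify_error_body` are hypothetical names for illustration and are not part of the bridge crate.

```rust
// Hypothetical stand-in for the bridge error type (assumption, not the real enum).
#[derive(Debug, PartialEq)]
enum SketchError {
    TxNotFinalized,
    RestApiError(String),
}

/// Maps a failed HTTP response (status and body text) to an error value.
/// A body that mentions the `Debug` name of `TxNotFinalized` becomes the
/// typed, retryable variant; everything else stays a generic REST error.
fn classify_error_body(status: &str, body: &str) -> SketchError {
    // `{:?}` on a unit variant yields its name, here "TxNotFinalized".
    let marker = format!("{:?}", SketchError::TxNotFinalized);
    if body.contains(&marker) {
        SketchError::TxNotFinalized
    } else {
        SketchError::RestApiError(format!(
            "request failed with status {:?}: {:?}",
            status, body
        ))
    }
}

fn main() {
    println!("{:?}", classify_error_body("500", "Error: TxNotFinalized"));
    println!("{:?}", classify_error_body("400", "invalid action"));
}
```

Substring matching keeps the wire format unchanged, but it ties the client to the server's error text: if that text changed, the response would fall through to the generic REST error and no longer be retried.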
