Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sequencer): add transaction status ABCI query #1955

Closed
wants to merge 14 commits into from

Conversation

ethanoroshiba
Copy link
Contributor

@ethanoroshiba ethanoroshiba commented Feb 6, 2025

Summary

Added a transaction status ABCI query to the sequencer and changed the Sequencer Client to poll that instead of CometBFT's tx RPC.

Background

The Sequencer Client's method wait_for_tx_inclusion previously polled CometBFT's tx endpoint, which would return an error no matter whether the transaction was pending, failed, or never submitted in the first place. Because of this lack of distinction, the method would poll indefinitely if a transaction failed for some reason. This is similar to the issue mentioned in #1936.

Changes

  • Add transaction status ABCI query to the sequencer.
  • Moved logic of mempool into new MempoolInner, removing the Arc<RwLock<>> around the different caches.
  • Changed Mempool to be a wrapper for Arc<RwLock<MempoolInner>> so that we don't end up with lock-step issues where, for instance, a transaction status call might fail in between when the transaction is removed from Pending and placed in RemovalCache.
  • Added get_transaction_status method to mempool, returning either Pending, Parked, RemovalCache, or Unknown.
  • Added protos and native types for TransactionStatus and TransactionStatusResponse.
  • Renamed wait_for_tx_inclusion to confirm_tx_inclusion, and changed the logic to continue looping after receiving a Pending or Parked response. If Unknown, it will poll until MAX_WAIT_TIME_UNKNOWN has elapsed, and will then return an error. If it receives RemovalCache, it will try CometBFT's tx RPC for a short time before erroring if there is no response.
  • Changed reliant bits in the Bridge Withdrawer and Sequencer CLI to use the updated method.

Testing

  • Added unit tests for mempool's get_transaction_status method.
  • Added unit tests for Sequencer transaction status ABCI query.
  • Added unit tests for Sequencer Client's confirm_tx_inclusion method.

Changelogs

Changelod updated

Breaking Changelist

  • Not breaking, but worth noting this adds to sequencer's public API.

Related Issues

closes #1949

@github-actions github-actions bot added proto pertaining to the Astria Protobuf spec sequencer pertaining to the astria-sequencer crate labels Feb 6, 2025
@ethanoroshiba ethanoroshiba added bug Something isn't working sequencer-client labels Feb 6, 2025
@ethanoroshiba ethanoroshiba changed the title fix(sequencer-cli): poll TX status to avoid looping indefinitely fix(sequencer-client): poll TX status to avoid looping indefinitely Feb 6, 2025
@ethanoroshiba ethanoroshiba changed the title fix(sequencer-client): poll TX status to avoid looping indefinitely fix(sequencer): add transaction status ABCI query Feb 7, 2025
@ethanoroshiba ethanoroshiba changed the title fix(sequencer): add transaction status ABCI query feat(sequencer): add transaction status ABCI query Feb 7, 2025
@ethanoroshiba ethanoroshiba removed the bug Something isn't working label Feb 7, 2025
Comment on lines 27 to 28
- Change sequencer client to poll transaction status ABCI query so that it doesn't
loop indefinitely [#1955](https://github.com/astriaorg/astria/pull/1955).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure if this is the correct place for this? There's no changelog for Sequencer Client.

Comment on lines 10 to 16
enum TransactionStatus {
TRANSACTION_STATUS_UNSPECIFIED = 0;
TRANSACTION_STATUS_PARKED = 1;
TRANSACTION_STATUS_PENDING = 2;
TRANSACTION_STATUS_REMOVAL_CACHE = 3;
TRANSACTION_STATUS_UNKNOWN = 4;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether this is useful. It's represented as an i32 in Rust anyways, and the domain TransactionStatus conversion is done directly from i32. Could be nice for the API, though.

@ethanoroshiba ethanoroshiba marked this pull request as ready for review February 7, 2025 15:34
RemovalReason::NonceStale => write!(f, "transaction nonce is lower than current nonce"),
RemovalReason::LowerNonceInvalidated => write!(
f,
"previous transaction was not executed, to this transaction's nonce has become \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"previous transaction was not executed, to this transaction's nonce has become \
"previous transaction was not executed, so this transaction's nonce has become \

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in cdb152e. Thanks for reviewing!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this got missed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not handled by the looks of it :D

Copy link
Collaborator

@noot noot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, only have a few minor suggestions!

Copy link
Contributor

@Fraser999 Fraser999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, I think we need to make the locking in the mempool more coarse-grained in order to allow the new query to give reliable results. Basically, the four collections comprising the mempool need to be updated atomically.

RemovalReason::NonceStale => write!(f, "transaction nonce is lower than current nonce"),
RemovalReason::LowerNonceInvalidated => write!(
f,
"previous transaction was not executed, to this transaction's nonce has become \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this got missed?

Comment on lines +38 to +61
let height = match storage.latest_snapshot().get_block_height().await {
Ok(height) => height,
Err(err) => {
return response::Query {
code: Code::Err(AbciErrorCode::INTERNAL_ERROR.value()),
info: "failed getting block height".into(),
log: format!("{err:?}"),
..response::Query::default()
};
}
};

let height = match height.try_into() {
Ok(height) => height,
Err(err) => {
return response::Query {
code: Code::Err(AbciErrorCode::INTERNAL_ERROR.value()),
info: "failed converting `u64` block height to type `tendermint::block::Height`"
.into(),
log: format!("{err:?}"),
..response::Query::default()
};
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need to provide this or if we can just use height 0 or 1 (the default).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty non-opinionated on this, but I lean slightly the opposite way just because of the fact that if the height is a part of the message, we should be providing the accurate height for which the response was for, even if unnecessary or unrelated. This might be the wrong mindset, though, so happy to change if you think we should

@ethanoroshiba
Copy link
Contributor Author

After offline discussion, the decision was made to close this in favor of a timeout approach in #2048. This is because we firstly want to move towards gRPC instead of ABCI queries, and secondly since this approach will pretty much only return correct transaction statuses when querying a validator node.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proto pertaining to the Astria Protobuf spec sequencer pertaining to the astria-sequencer crate sequencer-client
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug(sequencer-client): client loops forever if transaction execution fails in sequencer
3 participants