Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

indexer: convert IndexerRead to using async connections #19213

Merged
merged 27 commits into from
Sep 5, 2024

Conversation

bmwill
Copy link
Contributor

@bmwill bmwill commented Sep 4, 2024

Description

Describe the changes or additions included in this PR.

Test plan

How did you test the new or updated feature?


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

Copy link

vercel bot commented Sep 4, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 5, 2024 3:20pm
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Sep 5, 2024 3:20pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Sep 5, 2024 3:20pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Sep 5, 2024 3:20pm

@bmwill bmwill marked this pull request as ready for review September 5, 2024 15:19
@bmwill bmwill changed the title indexer: convert to using async connections indexer: convert IndexerRead to using async connections Sep 5, 2024
@bmwill bmwill enabled auto-merge (rebase) September 5, 2024 15:42
Copy link
Contributor

@wlmyng wlmyng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iiuc, by using AsyncConnection, use diesel_async::RunQueryDsl, we can get rid of a lot of the existing code around handling blocking connections? pretty rad

to play well with the ObjectStore trait though we need to make the connection blocking

&self,
filter: EventFilter,
cursor: Option<EventID>,
limit: usize,
descending_order: bool,
) -> IndexerResult<Vec<SuiEvent>> {
let pool = self.get_pool();
use diesel_async::RunQueryDsl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how come we cant just import this once at the top of the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the two traits conflict with each other so i did this to make it easier to convert things

Comment on lines +315 to +320
let object_store = ConnectionAsObjectStore::from_pool(&self.pool)
.await
.map_err(|e| IndexerError::PgPoolConnectionError(e.to_string()))?;

let system_state = tokio::task::spawn_blocking(move || {
sui_types::sui_system_state::get_sui_system_state(&object_store)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use self.get_pool (the blocking pool) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, we're trying to eliminate the blocking pool entirely


let stored_epoch = epochs::table
.into_boxed()
.pipe(|query| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL about pipe for keeping the method chaining style, pretty dope

Comment on lines +136 to 139
pub fn get_pool(&self) -> BlockingConnectionPool {
self.blocking_pool.clone()
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this actually still used anywhere? seems like we just do self.pool.get().await now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its used in graphql to get connections, in a follow up PR i'll work to convert graphql codebase which will then let us remove the blocking pool entirely

@bmwill bmwill requested a review from wlmyng September 5, 2024 17:37
@bmwill
Copy link
Contributor Author

bmwill commented Sep 5, 2024

to play well with the ObjectStore trait though we need to make the connection blocking

Yeah so to handle that case we take the async connection and wrap it in a sync layer (that uses the async connection under the hood) which lets us still remove a dependency on the sync connection pool

Copy link
Contributor

@wlmyng wlmyng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@bmwill bmwill merged commit 28feff4 into MystenLabs:main Sep 5, 2024
68 of 78 checks passed
@bmwill bmwill deleted the indexer-async branch September 5, 2024 17:46
Copy link
Contributor

@gegaowp gegaowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great move overall, with a comment regarding singleton object read perf.

let mut connection = self.pool.get().await?;

let object = match objects::table
.filter(objects::object_type.eq(type_.to_canonical_string(/* with_prefix */ true)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

object_type does not have its own index, this query will be a full-scan and very slow, shall we keep the logic of get_single_obj_id_from_package_publish & package_obj_type_cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graphql does have the ability to do a singleton lookup based on type. I was under the impression we did have sufficient indexes for this because this is essentially how graphql does this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that GraphQL makes this query on objects_history which does have an index on just type:

CREATE INDEX objects_history_type ON objects_history (checkpoint_sequence_number, object_type);

I think @gegaowp is right that today, this query will result in a full table scan. objects does have an index that you can leverage:

CREATE INDEX objects_package_module_name_full_type ON objects (object_type_package, object_type_module, object_type_name, object_type);

But it needs additional filters -- I'll put up a PR for this. It was created this way so that the same index could be used to support filtering by just the type's package, module, name and then the full type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #19247

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants