
Merge bulk lookup proofs #163

Merged: 8 commits into facebook:main, Mar 9, 2022
Conversation

@eozturk1 (Contributor) commented Mar 7, 2022

This PR adds support for bulk lookup proofs (See Issue #103).

First, for each requested lookup proof, the necessary labels are calculated. Then, the nodes corresponding to their self- and sibling-prefixes are preloaded before the lookup operations (see the comments above the function build_lookup_prefixes_set for my understanding of why this is necessary).
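As a rough illustration of the preloading idea (a hypothetical sketch, not the actual akd implementation: the function name build_lookup_prefixes_set comes from the PR, but the signature and the string-based label representation here are invented for illustration), collecting every prefix of each looked-up label plus each prefix's sibling covers all the nodes a Merkle-style lookup proof touches, so they can be fetched in one batch:

```rust
use std::collections::HashSet;

// Hypothetical sketch: labels are binary strings; the real akd code
// works with NodeLabel and bit-level prefixes. For each label we
// collect every prefix (the path from the root down to the leaf) and
// each prefix's sibling (last bit flipped), since a lookup proof
// needs the hashes of the siblings along the path.
fn build_lookup_prefixes_set(labels: &[&str]) -> HashSet<String> {
    let mut prefixes = HashSet::new();
    for label in labels {
        prefixes.insert(String::new()); // root is shared by every proof
        for end in 1..=label.len() {
            let prefix = &label[..end];
            prefixes.insert(prefix.to_string());
            // Sibling prefix: same path with the final bit flipped.
            let mut sibling = prefix[..end - 1].to_string();
            sibling.push(if prefix.ends_with('0') { '1' } else { '0' });
            prefixes.insert(sibling);
        }
    }
    prefixes
}

fn main() {
    // For one 3-bit label we preload the root, 3 path prefixes, and
    // their 3 siblings: 7 nodes in one batch instead of 7 point gets.
    let set = build_lookup_prefixes_set(&["010"]);
    assert_eq!(set.len(), 7);
    assert!(set.contains("011")); // sibling of the leaf "010"
    println!("preload set size: {}", set.len());
}
```

Overlapping prefixes across labels are deduplicated by the set, which is part of why the bulk path issues far fewer storage calls than per-user lookups.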

To test it, I added a new feature to the MySQL implementation where each storage call is tracked by access type (read/write) and data type (e.g., HistoryTreeNode). Here is its output with the following parameters:

num_users: 50
num_epochs: 5
num_lookups: 100

running 1 test
Metrics after publish(es).
MySQL writes: 3279, MySQL reads: 6, Time read: 0.071272952 s, Time write: 1.13520928 s
        Tree size: 899
        Node state count: 1386
        Value state count: 250
Read call stats: [("get_direct:~Azks", 1), ("get_user_state_versions~", 5)]
Write call stats: [("internal_batch_set~", 20), ("internal_set~", 3)]

- 100 individual lookups took 3103 ms.
Metrics after individual lookups:
MySQL writes: 0, MySQL reads: 969, Time read: 2.041950956 s, Time write: 0 s
        Tree size: 899
        Node state count: 1386
        Value state count: 250
- Read call stats: [("get_direct:~Azks", 1), ("get_direct:~HistoryNodeState", 459), ("get_direct:~HistoryTreeNode", 459), ("get_user_state~", 50)]
Write call stats: []

+ 100 bulk lookups took 1732 ms.
Metrics after lookup proofs: 
MySQL writes: 0, MySQL reads: 75, Time read: 0.59272097 s, Time write: 0 s
        Tree size: 899
        Node state count: 1386
        Value state count: 250
+ Read call stats: [("batch_get~HistoryNodeState", 12), ("batch_get~HistoryTreeNode", 12), ("get_direct:~Azks", 1), ("get_user_state~", 50)]
Write call stats: []

Note the reduction in completion time (about 40%, though this may vary depending on the parameters) and that the get_direct calls are replaced with batch_get calls in bulk lookups.

This code was a bit tricky to test, since the MySQL metrics are reset when printed and nodes in the cache may change the resulting numbers. In addition, the specific users looked up might produce different time measurements. I tried to account for all of that, but if you notice any issues, please let me know!
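The per-call tracking described above can be sketched roughly as follows (a minimal sketch under my own assumptions; the real PR instruments the MySQL storage layer, and the type and method names here are illustrative only):

```rust
use std::collections::HashMap;

// Illustrative sketch of per-call metrics: each storage call is
// bucketed by an access kind (read/write) plus a call/data-type tag,
// mirroring entries like ("batch_get~HistoryTreeNode", 12).
#[derive(Default)]
struct CallStats {
    reads: HashMap<String, u64>,
    writes: HashMap<String, u64>,
}

impl CallStats {
    fn record_read(&mut self, tag: &str) {
        *self.reads.entry(tag.to_string()).or_insert(0) += 1;
    }
    fn record_write(&mut self, tag: &str) {
        *self.writes.entry(tag.to_string()).or_insert(0) += 1;
    }
    // As in the PR's output, stats are consumed (reset) when reported,
    // which is part of what made the numbers tricky to compare.
    fn flush(&mut self) -> (Vec<(String, u64)>, Vec<(String, u64)>) {
        let mut r: Vec<_> = self.reads.drain().collect();
        let mut w: Vec<_> = self.writes.drain().collect();
        r.sort();
        w.sort();
        (r, w)
    }
}

fn main() {
    let mut stats = CallStats::default();
    stats.record_read("get_direct:~Azks");
    stats.record_read("batch_get~HistoryTreeNode");
    stats.record_read("batch_get~HistoryTreeNode");
    let (reads, writes) = stats.flush();
    assert_eq!(reads[0], ("batch_get~HistoryTreeNode".to_string(), 2));
    assert!(writes.is_empty());
    // A second flush is empty: metrics reset on report.
    assert!(stats.flush().0.is_empty());
    println!("Read call stats: {:?}", reads);
}
```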

@facebook-github-bot added the CLA Signed label on Mar 7, 2022. (This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)
@slawlor (Contributor) left a comment
It's close, but we need to fix the MySQL metrics gathering, and I think there are more optimizations possible for batch lookups. The second part could be pushed to a future PR, though.

akd/src/directory.rs: review threads (resolved)
let mut lookup_proofs = Vec::new();
for i in 0..unames.len() {
    lookup_proofs.push(
        self.lookup_with_info::<H>(
@slawlor (Contributor):
Why do we need the info? I think if we've done the proper BFS loading, we should only need to batch-get the value states to load them into the cache; then we can do the regular proof-generation operations, since we'll only be accessing local memory.

@eozturk1 (Contributor, Author) replied Mar 9, 2022:
This is to eliminate duplicate get_user_state queries: one to identify which labels are needed, and (previously) one inside lookup for the same purpose. This way the preloaded info is passed to the lookup function, so there are no re-loads.

Looking at the code for MySQL's get_user_state, are we missing looking up the queried state in cache?
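The deduplication being described might look something like this (a hypothetical sketch: lookup_with_info is the PR's function name, but the types and signature here are invented for illustration). The caller fetches the user state once while computing labels and hands it to the lookup, which only falls back to storage when nothing was passed in:

```rust
// Hypothetical sketch of passing preloaded user state into a lookup
// so storage is not queried a second time. The counter stands in for
// a real get_user_state round trip to MySQL.
#[derive(Clone)]
struct UserState {
    version: u64,
}

struct Storage {
    get_user_state_calls: u32,
}

impl Storage {
    fn get_user_state(&mut self, _uname: &str) -> UserState {
        self.get_user_state_calls += 1;
        UserState { version: 3 }
    }
}

// `info` carries the state already fetched while computing labels;
// only when it is None does the lookup hit storage itself.
fn lookup_with_info(
    storage: &mut Storage,
    uname: &str,
    info: Option<UserState>,
) -> UserState {
    info.unwrap_or_else(|| storage.get_user_state(uname))
}

fn main() {
    let mut storage = Storage { get_user_state_calls: 0 };
    // Bulk path: state fetched once up front, then passed in.
    let prefetched = storage.get_user_state("alice");
    let s = lookup_with_info(&mut storage, "alice", Some(prefetched));
    assert_eq!(s.version, 3);
    assert_eq!(storage.get_user_state_calls, 1); // no re-load
    // Old path: no info passed, so the lookup queries storage itself.
    lookup_with_info(&mut storage, "alice", None);
    assert_eq!(storage.get_user_state_calls, 2);
}
```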

@slawlor (Contributor):

Ah I see, yes, we've excluded get_user_state from the caching logic because of the filter parameters. How would you know a cached value matches "<= epoch" or some other filter without going to the backing store? It sounds like we might need a new query at the storage layer to retrieve the max versions for a collection of users that are <= a given epoch (if I'm remembering right). We could move forward with this, however, and simply open an issue for it, but we'll want to do it properly for batch proof generation. Here we will hit O(N) queries just to gather the user states; while not awful, that's still not ideal when we can do a better query.
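The semantics of the suggested storage query (entirely hypothetical; no such query exists in this PR, and the record layout below is invented for illustration) would be: for a set of users, return each user's maximum version whose epoch is <= the given epoch, in one round trip rather than N. Expressed over in-memory rows, that looks like:

```rust
use std::collections::HashMap;

// (username, version, epoch) tuples, standing in for rows of the
// user-state table in MySQL.
type Record = (&'static str, u64, u64);

// Hypothetical single-pass equivalent of the proposed batch query:
// for each requested user, the max version with epoch <= `epoch`.
// A real implementation would push this into one SQL statement
// (GROUP BY user with a MAX over the epoch-filtered rows).
fn max_versions_at_epoch(
    rows: &[Record],
    users: &[&str],
    epoch: u64,
) -> HashMap<String, u64> {
    let mut out = HashMap::new();
    for &(uname, version, ep) in rows {
        if ep <= epoch && users.contains(&uname) {
            let entry = out.entry(uname.to_string()).or_insert(version);
            if version > *entry {
                *entry = version;
            }
        }
    }
    out
}

fn main() {
    let rows = [
        ("alice", 1, 1),
        ("alice", 2, 3),
        ("alice", 3, 5),
        ("bob", 1, 2),
    ];
    let res = max_versions_at_epoch(&rows, &["alice", "bob"], 3);
    assert_eq!(res["alice"], 2); // version 3 is at epoch 5 > 3
    assert_eq!(res["bob"], 1);
    println!("{:?}", res);
}
```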

@eozturk1 (Contributor, Author):
Ah I see! Tracked in #169. I can take a stab at #166, #167 and #169 in a new PR. Does that sound good?

@slawlor (Contributor):
Sounds great, let's merge this in!

akd/src/node_state.rs: review thread (resolved)
akd_mysql/src/mysql.rs: review thread (resolved)
@eozturk1 (Contributor, Author) commented Mar 9, 2022
I'll work on #166, #167 and #169 in a different PR. Is there anything I should update in this PR to proceed with merging?

@eozturk1 eozturk1 merged commit 01e3e00 into facebook:main Mar 9, 2022
@eozturk1 eozturk1 deleted the merge-bulk-lookup-proofs branch March 10, 2022 23:50
@eozturk1 eozturk1 mentioned this pull request Aug 1, 2022