Benchmark storage arg to read fraction of kv pairs. #1233
Conversation
…github.com:adelarja/polkadot-sdk into benchmark-storage-arg-to-read-fraction-of-kv-pairs
The CI pipeline was cancelled due to the failure of one of the required jobs.
```rust
info!("Reading {} keys of {} keys", number_of_keys, keys.len());
let keys = &keys[0..number_of_keys];
let mut child_nodes = Vec::new();
```
I don't know if we should also apply this to the child nodes. Otherwise there could be cases where one large top key pulls in a huge child tree.
WDYT @crystalin?
I guess we should also do it for the `child_nodes` below.
I agree. Capping the vec size to db_fraction can at most double the size of potential key reads (root + child) but I presume that's still a major improvement from what it is right now.
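To make the suggestion concrete, a helper that truncates any shuffled key list to the configured fraction could be applied to both the top-level keys and the child keys. This is only a hedged sketch, not the PR's actual code; `cap_to_fraction` is a hypothetical name.

```rust
// Hypothetical helper: keep only `fraction` percent of a shuffled key list.
// Applying it to both top-level and child keys caps the total reads at
// roughly twice the configured fraction (root + child), as noted above.
fn cap_to_fraction<T>(keys: &mut Vec<T>, fraction: usize) {
    let number_of_keys = (keys.len() * fraction) / 100;
    keys.truncate(number_of_keys);
}

fn main() {
    let mut top_keys: Vec<u32> = (0..1000).collect();
    let mut child_keys: Vec<u32> = (0..400).collect();
    cap_to_fraction(&mut top_keys, 10);
    cap_to_fraction(&mut child_keys, 10);
    assert_eq!(top_keys.len(), 100);
    assert_eq!(child_keys.len(), 40);
}
```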
@@ -44,12 +44,15 @@ impl StorageCmd {
```rust
let mut keys: Vec<_> = client.storage_keys(best_hash, None, None)?.collect();
let (mut rng, _) = new_rng(None);
keys.shuffle(&mut rng);
let number_of_keys = (keys.len() * self.params.db_fraction as usize) / 100;
```
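The new computation can be sketched in isolation (the function name here is illustrative, not from the PR). Note that for small key sets the integer division rounds down, so the result can be zero.

```rust
// Sketch of the fraction computation from the diff: an integer percentage
// of the total key count. `db_fraction` is a percent in 1..=100.
fn keys_to_read(total_keys: usize, db_fraction: u8) -> usize {
    (total_keys * db_fraction as usize) / 100
}

fn main() {
    assert_eq!(keys_to_read(1_000, 10), 100); // 10% of 1000 keys
    assert_eq!(keys_to_read(5, 10), 0);       // rounds down to zero keys
}
```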
It should probably error if the result is zero.
We could do something like:
```rust
let number_of_keys = if number_of_keys != 0 { number_of_keys } else { keys.len() };
```
If we have `number_of_keys = 0`, we use all the keys by default. Another option would be to set the minimum limit to 1. WDYT?
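The two options in this thread can be sketched side by side (function names are illustrative, not the PR's code): fall back to reading all keys, or clamp the count to a minimum of one.

```rust
// Option A (the suggestion above): treat 0 as "read everything".
fn fallback_to_all(number_of_keys: usize, total_keys: usize) -> usize {
    if number_of_keys != 0 { number_of_keys } else { total_keys }
}

// Option B: enforce a minimum of one key.
fn at_least_one(number_of_keys: usize) -> usize {
    number_of_keys.max(1)
}

fn main() {
    assert_eq!(fallback_to_all(0, 5), 5); // zero falls back to all keys
    assert_eq!(fallback_to_all(3, 5), 3); // non-zero passes through
    assert_eq!(at_least_one(0), 1);       // zero is clamped to one
    assert_eq!(at_least_one(3), 3);
}
```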
But all keys could also be empty, so afaik the check makes sense either way. Just returning an error is no problem.
If the `number_of_keys` is `0`, that can mean either (or both) `keys.len()` or `self.params.db_fraction` was `0`. But `self.params.db_fraction` is restricted via clap arguments to `1..=100`, so it cannot be `0`; in the remaining case, `keys.iter().take(0)` is equivalent to `keys.iter()`, thereby keeping the old behavior intact. I believe there's no special case to be handled here unless we're worried about someone/something overriding `db_fraction` to `0`, in which case that should be the thing being checked here.
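The equivalence claimed here is easy to check: on an empty key set, `take(0)` and a plain iterator both yield nothing. A minimal sketch:

```rust
fn main() {
    let keys: Vec<u32> = Vec::new();
    // On an empty vec, take(0) and a plain iterator yield the same (empty) sequence.
    let taken: Vec<&u32> = keys.iter().take(0).collect();
    let all: Vec<&u32> = keys.iter().collect();
    assert_eq!(taken, all);
}
```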
What I am saying is that the old behaviour was probably also wrong.
Okay, approving now. You can still add the `!= 0` check if you want, but it's an edge-case anyway so I don't want to block it for that.
I'd like to test it before getting it merged. One change that I could think of is to have the fraction be a per-1000th instead of a percent. I'll verify it today.
Yes, that could be good for very large states. We can use a float percent or unit.
I tried on Moonbeam:
I'm not sure what is limiting it, but it doesn't seem to use much CPU (2%). IOPS is at 1500 (6 MB/s disk read), but nothing appears in the logs.
Okay, so it's not working as intended. Probably because we first collect all keys with …
Do we still want this PR @ggwpez?
It did not actually solve the issue, since the slow time comes from iterating through all keys to pick a random subset.
The purpose of this pull request is to introduce an argument to the benchmarking storage CLI, enabling the selection of the percentage of database keys used for benchmark execution.

The argument is named `db-fraction` and accepts values between 1 and 100 (1% to 100%). Example usage (using 10% of keys):

Closes #146.