Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmark storage arg to read fraction of kv pairs. #1233

Closed
1 change: 1 addition & 0 deletions substrate/bin/node/cli/tests/benchmark_storage_works.rs
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ fn benchmark_storage(db: &str, base_path: &Path) -> ExitStatus {
.args(["--warmups", "0"])
.args(["--add", "100", "--mul", "1.2", "--metric", "p75"])
.arg("--include-child-trees")
.args(["--db-fraction", "10"])
.status()
.unwrap()
}
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,7 @@ write: 71_347 * constants::WEIGHT_REF_TIME_PER_NANOS,
- `--json-read-path` Write the raw 'read' results to this file or directory.
- `--json-write-path` Write the raw 'write' results to this file or directory.
- [`--header`](../shared/README.md#arguments)
- `--db-fraction` Indicate the percentage of database keys to utilize, ranging from `1` to `100` (default 100).

License: Apache-2.0

Expand Down
4 changes: 4 additions & 0 deletions substrate/utils/frame/benchmarking-cli/src/storage/cmd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,10 @@ pub struct StorageParams {
/// Include child trees in benchmark.
#[arg(long)]
pub include_child_trees: bool,

/// Access only a percentage of KV pairs.
#[arg(long, value_parser = clap::value_parser!(u8).range(1..=100), default_value_t = 100)]
ggwpez marked this conversation as resolved.
Show resolved Hide resolved
pub db_fraction: u8,
}

impl StorageCmd {
Expand Down
16 changes: 12 additions & 4 deletions substrate/utils/frame/benchmarking-cli/src/storage/read.rs
Original file line number Diff line number Diff line change
Expand Up @@ -44,12 +44,15 @@ impl StorageCmd {
let mut keys: Vec<_> = client.storage_keys(best_hash, None, None)?.collect();
let (mut rng, _) = new_rng(None);
keys.shuffle(&mut rng);
let number_of_keys = (keys.len() * self.params.db_fraction as usize) / 100;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should probably error if the result is zero.

Copy link
Contributor Author

@adelarja adelarja Sep 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do something like:

let number_of_keys = if number_of_keys != 0 {number_of_keys} else {keys.len()};

If we have number_of_keys = 0, we use all the keys by default. Another option would be to set the minimum limit to 1. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But all keys could also be empty afaik the check makes sense either way. Just returning an error is no problem.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the number_of_keys is 0 that can mean either (or both) keys.len() or self.params.db_fraction was 0 , but self.params.db_fraction is restricted via clap arguments 1..=100 so it cannot be 0, in which case keys.iter().take(0) is equivalent to keys.iter() thereby keeping the old behavior in tact. I believe there's no special case to be handled here unless we're worried someone/something overriding db_fraction to 0 in which case that should be the thing being checked here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What i am saying is that the old behaviour was probably also wrong.

let number_of_keys = if number_of_keys != 0 { number_of_keys } else { 1 };

info!("Reading {} keys of {} keys", number_of_keys, keys.len());
let mut child_nodes = Vec::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know if we should also apply this to the child nodes. Otherwise there could be cases where one large top key pulls in a huge child tree.

WDYT @crystalin?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should also do it for the child_nodes below.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Capping the vec size to db_fraction can at most double the size of potential key reads (root + child) but I presume that's still a major improvement from what it is right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nbaztec @ggwpez I applied it to the child nodes here and here. WDYT?


// Interesting part here:
// Read all the keys in the database and measure the time it takes to access each.
info!("Reading {} keys", keys.len());
for key in keys.as_slice() {
for key in keys.iter().take(number_of_keys) {
match (self.params.include_child_trees, self.is_child_key(key.clone().0)) {
(true, Some(info)) => {
// child tree key
Expand All @@ -71,9 +74,14 @@ impl StorageCmd {

if self.params.include_child_trees {
child_nodes.shuffle(&mut rng);
let number_of_child_keys = (child_nodes.len() * self.params.db_fraction as usize) / 100;
info!(
"Reading {} child keys of {} child keys",
number_of_child_keys,
child_nodes.len()
);

info!("Reading {} child keys", child_nodes.len());
for (key, info) in child_nodes.as_slice() {
for (key, info) in child_nodes.iter().take(number_of_child_keys) {
let start = Instant::now();
let v = client
.child_storage(best_hash, info, key)
Expand Down
17 changes: 12 additions & 5 deletions substrate/utils/frame/benchmarking-cli/src/storage/write.rs
Original file line number Diff line number Diff line change
Expand Up @@ -64,13 +64,15 @@ impl StorageCmd {
let mut kvs: Vec<_> = trie.pairs(Default::default())?.collect();
let (mut rng, _) = new_rng(None);
kvs.shuffle(&mut rng);
info!("Writing {} keys", kvs.len());
let number_of_keys = (kvs.len() * self.params.db_fraction as usize) / 100;
let number_of_keys = if number_of_keys != 0 { number_of_keys } else { 1 };

info!("Writing {} keys of {} keys", number_of_keys, kvs.len());
let mut child_nodes = Vec::new();

// Generate all random values first; Make sure there are no collisions with existing
// db entries, so we can rollback all additions without corrupting existing entries.
for key_value in kvs {
for key_value in kvs.into_iter().take(number_of_keys) {
let (k, original_v) = key_value?;
match (self.params.include_child_trees, self.is_child_key(k.to_vec())) {
(true, Some(info)) => {
Expand Down Expand Up @@ -118,9 +120,14 @@ impl StorageCmd {

if self.params.include_child_trees {
child_nodes.shuffle(&mut rng);
info!("Writing {} child keys", child_nodes.len());

for (key, info) in child_nodes {
let number_of_child_keys = (child_nodes.len() * self.params.db_fraction as usize) / 100;
info!(
"Writing {} child keys of {} child keys.",
number_of_child_keys,
child_nodes.len()
);

for (key, info) in child_nodes.into_iter().take(number_of_child_keys) {
if let Some(original_v) = client
.child_storage(best_hash, &info.clone(), &key)
.expect("Checked above to exist")
Expand Down
Loading