Have one thread per chain on benchmark #3374
Conversation
linera-service/src/linera/main.rs (Outdated):

```rust
async move {
    context
        .run_benchmark(
            bps_share,
```
Do we really need this in addition to the BPS control task? Can't each chain task go as fast as possible because it will be blocked by the channel anyway if the desired BPS is reached?
If I do it like that, it means that the number of messages the BPS control task has to process is directly proportional to the BPS value, which is not ideal, especially when we start reaching really high BPS values. If each task has a BPS share it is supposed to achieve, and they only send a message when that share is achieved, then we have at most `num_chains` messages per second, always, and that can't really become a bottleneck so easily.
I can't imagine that a bunch of `()` messages could become a bottleneck… but I've been proven wrong before. 😅
```rust
    .await?;
let epoch = epoch.expect("default chain should have an epoch");
let committee = committees
    .get(&epoch)
```
If you remove here, you don't have to clone below.
> I can't imagine that a bunch of `()` messages could become a bottleneck… but I've been proven wrong before. 😅

If we don't do any mutable operations (don't update the wallet/state/db), is it an honest benchmark?
```rust
let start = Instant::now();
// Below all block proposals are supposed to succeed without retries, we
// must make sure that all incoming payments have been accepted on-chain
// and that no validator is missing user certificates.
self.process_inboxes_and_force_validator_updates().await;
info!(
    "Processed inboxes and forced validator updates in {} ms",
    start.elapsed().as_millis()
);
```
As a reader I'm slightly confused: the method is called `prepare_for_benchmark`, but here we force the processing of inboxes and create blocks?
That's what we need to do to prepare for running the benchmark :)
```rust
info!(
    "Got {} chains in {} ms",
    key_pairs.len(),
    start.elapsed().as_millis()
);
```
Why not `trace!`?
```rust
info!(
    "Processed inboxes and forced validator updates in {} ms",
    start.elapsed().as_millis()
);
```
Looks like a TRACE-level log.
This is not production code; it's a benchmarking tool. I want to know how long things are running for without having to turn on trace or debug logs.
```rust
    fungible_application_id,
);

Ok((chain_clients, epoch, blocks_infos, committee.clone()))
```
This method is doing at least four things:
- clearing out inboxes, creating blocks
- creating new chains for the benchmark
- supplying a test application with some tokens
- calling `make_benchmark_block_info` on a default chain
```diff
@@ -1123,7 +1129,7 @@ where
     /// Broadcasts certified blocks to validators.
     #[instrument(level = "trace", skip(committee, delivery))]
-    async fn communicate_chain_updates(
+    pub async fn communicate_chain_updates(
```
If we add `pub`, then they become part of the API contract and we can no longer change them.
```rust
// the desired BPS, the tasks would continue sending block proposals until the channel's
// buffer is filled, which would cause us to not properly control the BPS rate.
let (sender, receiver) = crossbeam_channel::bounded(0);
let bps_control_task = tokio::spawn(async move {
```
It looks like `bps_control_task` is spawned on the same logical "level" as the tasks that send the blocks. Personally, I think it'd make more sense if the "control task" (supervisor) were spawning those tasks.
The main task deals with awaiting the different spawned tasks. I don't see what the issue is. The control task controls the BPS, it is not supposed to create other things.
It's the BPS task that is controlling when the job "is done" though. But OK, it doesn't necessarily mean it has to spawn the workers.
```rust
let mut start = time::Instant::now();
while let Ok(()) = receiver.recv() {
    recv_count += 1;
    if recv_count == num_chains {
```
It's hard to see that the code here is called just before finishing the benchmark. If the BPS control task spawned the "worker tasks" but itself stayed in the "main thread" it'd be more linear.
What you're describing only happens when we don't specify `--bps`. All senders will be closed, and this will exit.
```rust
let default_chain_client = context.make_chain_client(default_chain_id)?;
let (epoch, committees) = default_chain_client
    .epoch_and_committees(default_chain_id)
let (chain_clients, epoch, blocks_infos, committee) = context
```
I find it weird that the benchmark load (`operations` etc.) is generated by a client context. Why is it not the benchmark process itself that decides what the benchmark should look like, and how?
Makes sense, I can move the stuff that doesn't need the client context into `Benchmark` instead.
@deuszx I think I didn't express myself well there. All I meant to say was that we used to alter the wallet, and now we don't. The only resource the different threads could race for is the wallet, but since we don't save the benchmark chains to the wallet anymore, that's no longer a concern. So using the same wallet for multiple threads is feasible, because the wallet is only used to create the chains at the start using its default chain; after that, the threads don't use the wallet anymore.
I see, thanks. That was my misunderstanding, then: I totally forgot that by "wallet" we mean the local wallet of the client that tracks the chains.
Motivation

Right now, `linera benchmark` does everything from a single thread. That doesn't scale much.

Proposal
Spawn one task per chain, and split the desired BPS across the number of chains. For example, if the desired BPS is 10, and you have 10 chains, each task will need to reach 1 BPS.
Every task just tries to send as many block proposals as fast as possible. We don't control the BPS inside the tasks; a dedicated BPS control task does that. Once a task has successfully sent its BPS share of blocks, it sends a message to the BPS control task. It uses a `crossbeam` channel for that, as it allows bounded channels with a buffer size of 0. With a buffer size of 0, if the BPS control task is sleeping or processing another message, the sender task will be blocked. This is exactly what we want in order to properly control the BPS behavior. These channels are faster than `std` channels and comparable in speed to `tokio` channels, but `tokio` channels don't allow a buffer size of 0 and also don't block the sender task, since sending is async.

The BPS control task keeps a timer going. Once it receives `num_chains` messages, the total number of blocks to be sent has been reached. If more than a second elapsed, we failed to achieve the desired BPS; otherwise, we achieved it.

This all runs sharing the same wallet and `ClientContext`. The only resource the different threads could race for is the wallet, but since we don't save the benchmark chains to the wallet anymore, that's no longer a concern. So using the same wallet for multiple threads is feasible: the wallet is only used to create the chains at the start, using its default chain, and after that the threads don't touch it.

Shutdown signals also continue to work (all threads are gracefully shut down), and we continue to close all the chains on exit.
Test Plan
Ran with a few different BPS/TPB variations, seems to work. Shutdown also seems to work correctly.
Release Plan