Have one thread per chain on benchmark #3374
Conversation
linera-service/src/linera/main.rs (Outdated):

```rust
async move {
    context
        .run_benchmark(
            bps_share,
```
Do we really need this in addition to the BPS control task? Can't each chain task go as fast as possible because it will be blocked by the channel anyway if the desired BPS is reached?
If I do it like that, it means that the number of messages the BPS control task has to process is directly proportional to the BPS value, which is not ideal, especially when we start reaching really high BPS values. If each task has a BPS share it is supposed to achieve, and they only send a message when that share is achieved, then we have at most `num_chains` messages per second, always, and that can't really become a bottleneck so easily.
I can't imagine that a bunch of `()` messages could become a bottleneck… but I've been proven wrong before. 😅
```rust
    .await?;
let epoch = epoch.expect("default chain should have an epoch");
let committee = committees
    .get(&epoch)
```
If you remove here, you don't have to clone below.
> I can't imagine that a bunch of `()` messages could become a bottleneck… but I've been proven wrong before. 😅

If we don't do any mutable operations (don't update the wallet/state/db), is it an honest benchmark?
```rust
let start = Instant::now();
// Below all block proposals are supposed to succeed without retries, we
// must make sure that all incoming payments have been accepted on-chain
// and that no validator is missing user certificates.
self.process_inboxes_and_force_validator_updates().await;
info!(
    "Processed inboxes and forced validator updates in {} ms",
    start.elapsed().as_millis()
);
```
As a reader I'm slightly confused: the method is called `prepare_for_benchmark`, but here we force the processing of inboxes and create blocks?
That's what we need to do to prepare for running the benchmark :)
```rust
info!(
    "Got {} chains in {} ms",
    key_pairs.len(),
    start.elapsed().as_millis()
);
```
Why not `trace!`?
```rust
info!(
    "Processed inboxes and forced validator updates in {} ms",
    start.elapsed().as_millis()
);
```
Looks like a TRACE-level log.
This is not production code; it's a benchmarking tool. I want to know how long things are running for without having to turn on trace or debug logs.
```rust
    fungible_application_id,
);

Ok((chain_clients, epoch, blocks_infos, committee.clone()))
```
This method is doing at least four things:
- clearing out inboxes, creating blocks
- creating new chains for the benchmark
- supplying a test application with some tokens
- calling `make_benchmark_block_info` on a default chain
```diff
@@ -1123,7 +1129,7 @@ where
     /// Broadcasts certified blocks to validators.
     #[instrument(level = "trace", skip(committee, delivery))]
-    async fn communicate_chain_updates(
+    pub async fn communicate_chain_updates(
```
If we add `pub`, then they become part of the API contract and we can no longer change them.
```rust
// the desired BPS, the tasks would continue sending block proposals until the channel's
// buffer is filled, which would cause us to not properly control the BPS rate.
let (sender, receiver) = crossbeam_channel::bounded(0);
let bps_control_task = tokio::spawn(async move {
```
It looks like `bps_control_task` is spawned on the same logical "level" as the tasks that send the blocks. Personally, I think it'd make more sense if the "control task" (supervisor) were spawning those tasks.
The main task deals with awaiting the different spawned tasks. I don't see what the issue is. The control task controls the BPS, it is not supposed to create other things.
It's the BPS task that is controlling when the job "is done" though. But OK, it doesn't necessarily mean it has to spawn the workers.
```rust
let mut start = time::Instant::now();
while let Ok(()) = receiver.recv() {
    recv_count += 1;
    if recv_count == num_chains {
```
It's hard to see that the code here is called just before finishing the benchmark. If the BPS control task spawned the "worker tasks" but itself stayed in the "main thread" it'd be more linear.
What you're describing only happens when we don't specify `--bps`. All senders will be closed, and this will exit.
```rust
let default_chain_client = context.make_chain_client(default_chain_id)?;
let (epoch, committees) = default_chain_client
    .epoch_and_committees(default_chain_id)
let (chain_clients, epoch, blocks_infos, committee) = context
```
I find it weird that the benchmark load (`operations` etc.) is generated by a client context. Why is it not the benchmark process itself that decides what the benchmark should look like, and how?
Makes sense, I can move the stuff that doesn't need the client context into `Benchmark` instead.
@deuszx I think I didn't express myself well there. All I meant to say was that we used to alter the wallet, and now we don't. The only resource the different threads could race for is the wallet, but since we don't save the benchmark chains to the wallet anymore, that's no longer a concern. So using the same wallet for multiple threads is feasible, because the wallet is only used to create the chains at the start using its default chain; after that, the threads don't use the wallet anymore.
I see, thanks. That was my misunderstanding, then: I totally forgot that by "wallet" we mean the local wallet of the client that tracks the chains.
Motivation

Right now, `linera benchmark` does everything from a single thread. That doesn't scale much.

Proposal
Spawn one task per chain, and split the desired BPS across the number of chains. For example, if the desired BPS is 10, and you have 10 chains, each task will need to reach 1 BPS.
Every task just tries to send as many block proposals as fast as possible. We don't control the BPS inside the tasks; a dedicated BPS control task does that. Once a task has successfully sent its BPS share of blocks, it sends a message to the BPS control task. It uses a `crossbeam` channel for that, as it allows bounded channels with a buffer size of 0. With a buffer size of 0, if the BPS control task is sleeping or processing another message, the sender task will be blocked. This is exactly what we want in order to properly control the BPS behavior. These channels are faster than `std` channels and comparable in speed to `tokio` channels, but `tokio` channels don't allow a buffer size of 0 and also don't block the sender task, since sending is async.

The BPS control task keeps a timer going. Once it receives `num_chains` messages, the total number of blocks to be sent has been reached. If more than a second elapsed, we failed to achieve the desired BPS; otherwise, we achieved it.

This all runs sharing the same wallet and `ClientContext`. The only resource the different threads could race for is the wallet, but since we don't save the benchmark chains to the wallet anymore, that's no longer a concern. So using the same wallet for multiple threads is feasible: the wallet is only used to create the chains at the start, using its default chain, and after that the threads don't touch it.

Shutdown signals also continue to work (all threads are gracefully shut down), and we continue to close all the chains on exit.
Test Plan
Ran with a few different BPS/TPB variations, seems to work. Shutdown also seems to work correctly.
Release Plan