Improve beacon processor scheduling #6291

michaelsproul · 2024-08-22T00:03:54Z

Description

Currently Lighthouse uses a priority-based scheduling algorithm to assign work to threads in its BeaconProcessor: https://github.com/sigp/lighthouse/blob/stable/beacon_node/beacon_processor/src/lib.rs

We implemented the BeaconProcessor primarily as a way to avoid overloading the machine with too many jobs running in parallel. Its priority system mostly works fine to ensure that high-priority work gets completed ahead of low-priority work, it sometimes struggles to ever complete low priority work (starvation). If high-prio work keeps coming in, it can indefinitely postpone the processing of low-priority work.

Another issue is that the priorities are currently determined per task and follow a strict ordering. E.g. processing blocks is higher prio than processing attestations. In some cases it's unclear exactly what the task priority ordering should be and we end up having to choose somewhat arbitrarily (e.g. when testing #5481 I obtained some benefit from prioritising status messages higher in the ordering). For API requests this is a particular issue because we have two priorities for requests: P0 which are processed ahead of almost everything, and P1 which are basically bottom of the barrel and won't be run until the processor has some spare cycles.

Version

Lighthouse v5.3.0

Steps to resolve

I would like to abstract the actual scheduling logic out so we can experiment with different algorithms. This will require untangling all of the concerns that are currently mixed into the beacon processor, e.g. backfill rate limiting which gets its own special treatment here:

lighthouse/beacon_node/beacon_processor/src/lib.rs

Lines 863 to 903 in d6ba8c3

    
           Some(InboundEvent::WorkEvent(event)) if enable_backfill_rate_limiting => { 
        
               match QueuedBackfillBatch::try_from(event) { 
        
                   Ok(backfill_batch) => { 
        
                       match work_reprocessing_tx 
        
                           .try_send(ReprocessQueueMessage::BackfillSync(backfill_batch)) 
        
                       { 
        
                           Err(e) => { 
        
                               warn!( 
        
                                   self.log, 
        
                                   "Unable to queue backfill work event. Will try to process now."; 
        
                                   "error" => %e 
        
                               ); 
        
                               match e { 
        
                                   TrySendError::Full(reprocess_queue_message) 
        
                                   | TrySendError::Closed(reprocess_queue_message) => { 
        
                                       match reprocess_queue_message { 
        
                                           ReprocessQueueMessage::BackfillSync( 
        
                                               backfill_batch, 
        
                                           ) => Some(backfill_batch.into()), 
        
                                           other => { 
        
                                               crit!( 
        
                                                   self.log, 
        
                                                   "Unexpected queue message type"; 
        
                                                   "message_type" => other.as_ref() 
        
                                               ); 
        
                                               // This is an unhandled exception, drop the message. 
        
                                               continue; 
        
                                           } 
        
                                       } 
        
                                   } 
        
                               } 
        
                           } 
        
                           Ok(..) => { 
        
                               // backfill work sent to "reprocessing" queue. Process the next event. 
        
                               continue; 
        
                           } 
        
                       } 
        
                   } 
        
                   Err(event) => Some(event), 
        
               } 
        
           }

It might be prudent to start by just refactoring the current scheduling algorithm into a cleaner and more modular form, and then we can go about tweaking it.

I think the properties we desire from a scheduling algorithm are:

Prioritisation based on task type (retain some of what we have now).
Prioritisation based on how long a task has been waiting to run, and/or how close it is to its deadline (see https://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling).
High throughput? Ideally we should be able to max out the CPU if we are running N CPU-bound tasks.
Improved fairness.

Some of these desires are slightly in conflict, so having the ability to try multiple different backends is advantageous.

Something I'm not sure if we can handle is pre-emption (the ability to stop a task midway through execution). For blocking tasks, I don't think this is possible unless we define some sort of yield primitive which returns from the task to the scheduler. This seems like it would be hard to implement. If we could piggyback off Tokio's pre-emption (probably only for async tasks) or the OS's pre-emption (for blocking tasks), maybe we could get some interesting results.

We can start with ideas from:

https://en.wikipedia.org/wiki/Scheduling_(computing)

Additional Info

Regarding throughput, I wrote a sample program to test the current scheduler, and we actually are capable of maxing out the CPU so long as each job runs for more than a few microseconds! I'll push this on a branch somewhere shortly.

The text was updated successfully, but these errors were encountered:

eserilev · 2024-08-29T17:28:48Z

I'd like to take a stab at abstracting away the beacon processor scheduling logic if no one has started on this yet

michaelsproul · 2024-08-30T01:48:58Z

@eserilev That would be great! Happy to pair on this a bit and throw ideas around

michaelsproul added optimization Something to make Lighthouse run more efficiently. HTTP-API labels Aug 22, 2024

michaelsproul mentioned this issue Aug 22, 2024

Increase priority for validator HTTP requests #6292

Merged

michaelsproul assigned eserilev Aug 30, 2024

michaelsproul mentioned this issue Sep 6, 2024

Register processor queue length as histogram #6012

Merged

eserilev mentioned this issue Sep 30, 2024

Modularize beacon processor scheduler #6448

Open

michaelsproul mentioned this issue Oct 18, 2024

Frequent "Attestation queue full" error. #4130

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve beacon processor scheduling #6291

Improve beacon processor scheduling #6291

michaelsproul commented Aug 22, 2024 •

edited

Loading

eserilev commented Aug 29, 2024 •

edited

Loading

michaelsproul commented Aug 30, 2024

Improve beacon processor scheduling #6291

Improve beacon processor scheduling #6291

Comments

michaelsproul commented Aug 22, 2024 • edited Loading

Description

Version

Steps to resolve

Additional Info

eserilev commented Aug 29, 2024 • edited Loading

michaelsproul commented Aug 30, 2024

michaelsproul commented Aug 22, 2024 •

edited

Loading

eserilev commented Aug 29, 2024 •

edited

Loading