This diagram shows how we might handle the execution of a configuration, given that the model outputs different batches of leadtimes at different times (hence the clock triggers).
We see three forms of execution for a configuration:
Executing a subset of a configuration (each subset corresponding to a single leadtime batch).
Positives:
Each batch can execute on a separate node if desired.
Executing a subset of the configuration results in shorter queue times.
Little benefit in making dask aware of the memory footprint of each processing module (plugin).
Negatives:
The workflow is still fixed, so we request a set amount of resources and some of it is wasted.
Any link between different leadtime batches needs explicit handling (saving data from one batch and loading it in another).
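The explicit hand-off between leadtime batches could look roughly like the sketch below. It is a minimal illustration only: the function names, the pickle format and the batch naming are all assumptions, not part of any existing workflow.

```python
import pickle
from pathlib import Path


def save_batch_output(data, handoff_dir: Path, batch_name: str) -> Path:
    """Persist the output of one leadtime batch so a later batch can load it.

    `handoff_dir` and `batch_name` are hypothetical; any shared location
    visible to both tasks would do.
    """
    handoff_dir.mkdir(parents=True, exist_ok=True)
    path = handoff_dir / f"{batch_name}.pkl"
    # Write under a temporary name, then rename, so that a later batch
    # never reads a half-written file.
    tmp_path = path.with_suffix(".tmp")
    with tmp_path.open("wb") as handle:
        pickle.dump(data, handle)
    tmp_path.replace(path)
    return path


def load_batch_output(handoff_dir: Path, batch_name: str):
    """Load what an earlier leadtime batch saved, failing loudly if absent."""
    path = handoff_dir / f"{batch_name}.pkl"
    if not path.exists():
        raise FileNotFoundError(f"No saved output for batch {batch_name!r}")
    with path.open("rb") as handle:
        return pickle.load(handle)
```

The later batch fails fast when the earlier output is missing, which makes the dependency between batches visible rather than implicit.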
1.2 Single rose task per configuration.
Executing our configuration as a single static workflow, single task.
Positives:
Ideal for basic workflows.
Handling leadtime batches is straightforward.
Negatives:
This is the least efficient use of the resources requested on a platform, since for much of the time large parts of the execution are waiting on the polling clock trigger. That is, we underutilise the resources we requested and waste money.
Using these schedulers means being stuck with single-node execution.
A configuration that needs more resources than we request will likely require making dask memory-footprint aware. Doing so reduces the likelihood of dask reaching its memory thresholds and spilling to disk, which would make computing the configuration less efficient.
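For reference, the memory thresholds mentioned above are fractions of a worker's memory limit and are set through dask's `distributed.worker.memory.*` configuration keys. The sketch below builds and checks such a set of thresholds; the helper names are hypothetical, and the default values shown are the ones documented by dask.distributed at the time of writing, which may change.

```python
def memory_thresholds(target=0.6, spill=0.7, pause=0.8, terminate=0.95) -> dict:
    """Build dask worker memory thresholds as fractions of the memory limit.

    target:    start spilling based on dask's own estimate of managed memory
    spill:     start spilling based on process memory reported by the OS
    pause:     stop the worker from starting new tasks
    terminate: the nanny restarts the worker
    """
    fractions = [target, spill, pause, terminate]
    if not all(0 < fraction <= 1 for fraction in fractions):
        raise ValueError("thresholds must be fractions in (0, 1]")
    if fractions != sorted(fractions):
        raise ValueError("expected target <= spill <= pause <= terminate")
    return {
        "distributed.worker.memory.target": target,
        "distributed.worker.memory.spill": spill,
        "distributed.worker.memory.pause": pause,
        "distributed.worker.memory.terminate": terminate,
    }


def spill_starts_at_bytes(memory_limit_bytes: int, thresholds: dict) -> int:
    """Managed-memory level at which a worker would begin spilling to disk."""
    return int(memory_limit_bytes * thresholds["distributed.worker.memory.target"])


def apply_thresholds(thresholds: dict):
    """Apply the thresholds; assumed to run before any workers are created."""
    import dask  # imported lazily so the helpers above work without dask

    return dask.config.set(thresholds)
```

Raising the thresholds only delays spilling; it does not remove the need to keep the working set within the memory that was requested.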
Dynamic workloads (dask-jobqueue)
Positives:
The most efficient form of execution, where we can dynamically scale our cluster based on the workload.
Simplest and most flexible configurations (executing everything within a single execution).
Easily scale to multiple nodes.
Shortest queue times (each worker creation becomes a PBS/SLURM submission).
Utilising dask memory footprint awareness gives total flexibility to use as few or as many resources as we want.
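As a rough sketch of what dynamic scaling might look like with dask-jobqueue: each worker is described once as a job specification, and the cluster is then allowed to adapt between zero jobs and some maximum. The helper names, the resource figures and the job limit are illustrative assumptions; a real site would also need settings such as queue or account that are not shown here.

```python
def worker_job_spec(cores=4, memory_gib=16, walltime="01:00:00") -> dict:
    """Describe one worker job (one PBS/SLURM submission).

    The numbers are placeholders, not recommendations.
    """
    return {"cores": cores, "memory": f"{memory_gib}GiB", "walltime": walltime}


def start_adaptive_cluster(scheduler="slurm", max_jobs=10, **spec_overrides):
    """Start a cluster that grows and shrinks with the workload.

    Requires dask.distributed and dask-jobqueue; both are imported lazily
    so that worker_job_spec can be used without them installed.
    """
    from dask.distributed import Client
    from dask_jobqueue import PBSCluster, SLURMCluster

    cluster_classes = {"pbs": PBSCluster, "slurm": SLURMCluster}
    cluster = cluster_classes[scheduler](**worker_job_spec(**spec_overrides))
    # With no work pending the cluster holds no jobs; jobs are submitted
    # only when tasks are waiting, up to max_jobs.
    cluster.adapt(minimum_jobs=0, maximum_jobs=max_jobs)
    return Client(cluster)
```

Because the minimum is zero jobs, nothing is held on the platform while the configuration is waiting on a clock trigger.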
Negatives:
Exploratory work is required to understand how best to tell dask the anticipated memory footprint of each processing module execution, and potentially the memory footprint of the input data too (spill-to-disk capability).
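One avenue for that exploratory work is dask's abstract worker resources: a task can declare how much of a named resource it needs, and the scheduler only places it on a worker with that much available. The sketch below assumes workers advertise a resource labelled `MEMORY` in bytes (the label is arbitrary) and that a crude shape-based estimate is good enough; both the estimate and the helper names are hypothetical.

```python
import math

# Bytes per element for a few common dtypes (illustrative, not exhaustive).
DTYPE_BYTES = {"float32": 4, "float64": 8, "int32": 4}


def estimate_footprint_bytes(shape, dtype="float32", n_copies=2) -> int:
    """Crude estimate of a processing module's peak memory footprint.

    n_copies is an assumed allowance for the input plus working copies.
    """
    return math.prod(shape) * DTYPE_BYTES[dtype] * n_copies


def submit_with_footprint(client, func, *args, footprint_bytes):
    """Submit a task that reserves its estimated footprint on a worker.

    `client` is expected to be a dask.distributed Client whose workers were
    started advertising a "MEMORY" resource; otherwise the task never runs.
    """
    return client.submit(func, *args, resources={"MEMORY": footprint_bytes})
```

Whether a static estimate like this is accurate enough, or whether footprints need to be measured per plugin, is exactly the open question noted above.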
Proposed plan/roadmap
Related issues