From a32d6358b7aebfcc1c1d333ccc2faa53c350881b Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Thu, 5 Sep 2024 16:40:01 +0300
Subject: [PATCH] Update batch_job.md

---
 materials/batch_job.md | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/materials/batch_job.md b/materials/batch_job.md
index 7fa6f7f6..87478f7d 100644
--- a/materials/batch_job.md
+++ b/materials/batch_job.md
@@ -43,20 +43,17 @@ When we submit a batch job script, the job is not started directly, but is sent
 :::{admonition} How many resources to request?
 :class: seealso
 
-* You can use your workstation / laptop as a base measuring stick: If the code runs on your machine, as a first guess you can reserve the same amount of CPUs & memory as your machine has.
+* If you have run the code runs on some other machine (your laptop?), as a first guess you can reserve the same amount of CPUs and memory as your machine has.
+* You can also check more closely what resources are used with `top` on Mac and Linux or `task manager` on Windows when running on your machine.
+* If your program does the same or similar thing more than once, you can estimate that the total run time is number of steps times time taken by each step.
+* The first resource reservation is often a guess, that can be later adjusted.
 * Before reserving multiple CPUs, check if your code can make use them.
 * Before reserving multiple nodes, check if your code can make use them. Most GIS tools can not.
-* You can also check more closely what resources are used with `top` on Mac and Linux or `task manager` on Windows when running on your machine
-* Similarly for running time: if you have run it on your machine, you should reserve similar time in the supercomputer.
-* If your program does the same thing more than once, you can estimate that the `total run time is number of steps times time taken by each step`.
-* Likewise, if your program runs multiple parameters, the `total time needed is number of parameters times the time needed to run the program with one/some parameters`.
-* You can also run a smaller version of the problem and try to estimate how the program will scale when you make the problem bigger.
-* You should always monitor jobs to find out what were the actual resources you requested.
 * When you double the number of cores, the job should run at least 1.5x faster.
 * Some tools run both on CPU and GPU, if unsure which to use, a good rule of thumb is to compare the billing unit (BU) usage and select the one using less. A GPU uses 60 times more billing units than a single CPU core.
+* You should always monitor jobs to find out what were the actual resources you requested.
 
-
-Adapted from [Aalto Scientific Computing](https://scicomp.aalto.fi/triton/usage/program-size/)
+Partly adapted from [Aalto Scientific Computing](https://scicomp.aalto.fi/triton/usage/program-size/)
 :::
 
 ## Partitions
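
Review note: the rules of thumb in the admonition (run time as steps times time per step, at least 1.5x speed-up when doubling cores, one GPU costing 60 times the billing units of a CPU core) can be sketched as a few lines of arithmetic. This is only an illustration, not part of the patch. All function and constant names are hypothetical, and the billing comparison assumes that CPU cores and GPUs are the only things billed, which may not match the actual BU calculation.

```python
# Hypothetical helper names; a rough sketch of the rules of thumb in the text.

GPU_BU_FACTOR = 60  # from the text: a GPU uses 60x the BUs of one CPU core


def estimate_total_seconds(n_steps: int, seconds_per_step: float) -> float:
    """Total run time as number of steps times time taken by each step."""
    return n_steps * seconds_per_step


def doubling_is_worthwhile(seconds_before: float, seconds_after: float) -> bool:
    """True if doubling the core count made the job at least 1.5x faster."""
    return seconds_before / seconds_after >= 1.5


def cheaper_device(cpu_cores: int, cpu_hours: float, gpu_hours: float) -> str:
    """Compare relative billing, counting one CPU core hour as one unit.

    Assumption: only cores and GPUs are billed; memory and storage are ignored.
    """
    cpu_cost = cpu_cores * cpu_hours
    gpu_cost = GPU_BU_FACTOR * gpu_hours
    return "gpu" if gpu_cost < cpu_cost else "cpu"


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(estimate_total_seconds(100, 30.0))   # 100 steps of 30 s each
    print(doubling_is_worthwhile(100.0, 60.0))  # 100 s on N cores, 60 s on 2N
    print(cheaper_device(cpu_cores=40, cpu_hours=10.0, gpu_hours=5.0))
```

The first reservation is still a guess, so the estimates are best checked against the resources a finished job actually used.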