Submit multiple jobs to a single node, rather than scheduling one job per node #2616
---
A few clarifications:
With the right combination of partition settings, Slurm can pack multiple jobs onto a single node; see the sketch below. Last thing, you could also consider using a different machine shape that better matches your jobs.
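A minimal sketch of the kind of partition configuration this points at, assuming the community schedmd-slurm-gcp-v5-partition module (the module IDs and exact setting names here are illustrative and vary between Toolkit versions):

```yaml
# Blueprint fragment: a partition that allows node sharing instead of
# one-job-per-node. "network1" and "compute_node_group" are placeholder IDs.
- id: compute_partition
  source: community/modules/compute/schedmd-slurm-gcp-v5-partition
  use: [network1, compute_node_group]
  settings:
    partition_name: compute
    enable_placement: false   # placement groups require exclusive nodes
    exclusive: false          # let Slurm schedule several jobs per node
```

With `exclusive: false`, Slurm can co-schedule jobs on a node instead of reserving the whole machine for each job.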
---
That submission is quite a bit more complex, so I wouldn't be surprised if something in it is killing the node once the first job finishes. I did run a simple reproduction on an n2d-standard-8 (4 physical cores):
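A minimal sketch of such a reproduction, assuming three 1-CPU batch jobs (the job payload and timing below are illustrative, not the original commands):

```shell
# Submit three 1-CPU jobs; on a shared (non-exclusive) partition they
# should all land on the same n2d-standard-8 node.
for i in 1 2 3; do
  sbatch --cpus-per-task=1 --wrap="hostname; sleep 120"
done

# While the jobs run, NODELIST should show one node for all three.
squeue --me --format="%.10i %.10P %.8T %N"
```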
All 3 jobs fit on the same machine, and the machine stayed up until all 3 jobs were complete. I will admit that this was on a Slurm GCP v6 cluster, but I would not expect a difference in the exclusive behavior.
---
Hello,
I am running Snakemake on the HPC Toolkit, using the default deployment (https://github.com/GoogleCloudPlatform/hpc-toolkit/blob/main/examples/README.md#hpc-slurmyaml-).
When I submit jobs, it schedules one job per node, so each of my 1-2 CPU jobs spins up (and then spins down) an entire n2-standard-60, which is very slow. I would like to submit multiple jobs to a single compute node until it is maxed out, and only then spin up a new compute node. This is how our on-prem Slurm system works, and I would like to replicate it in GCP. Any idea what I need to change in the config to enable this?
Thanks,
Kyle