Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intro to slurm #183

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 50 additions & 44 deletions Intro_to_slurm.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,58 +6,69 @@ Slurm is a resource manager and job scheduler designed for scheduling and alloca
`sinfo` provides information regarding resources that are available from server.
Example :

user@taos:~$ sinfo
[rdscher@hopper ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal* up 2-00:00:00 2 mix taos[01,09]
normal* up 2-00:00:00 1 alloc taos02
normal* up 2-00:00:00 6 idle taos[03-08]
general* up 2-00:00:00 2 alloc hopper[002-003]
general* up 2-00:00:00 8 idle hopper[001,004-010]
debug up 4:00:00 2 idle hopper[011-012]
condo up 2-00:00:00 4 idle* hopper[064-067]
condo up 2-00:00:00 2 down* hopper[033,051]
condo up 2-00:00:00 23 mix hopper[017-025,028,032,036,038,045,055-063]
condo up 2-00:00:00 14 alloc hopper[013-016,029-030,035,042,047,049-050,052-054]
condo up 2-00:00:00 12 idle hopper[026-027,031,034,037,039-041,043-044,046,048]
geodef up 7-00:00:00 1 idle* hopper065
geodef up 7-00:00:00 1 alloc hopper047
geodef up 7-00:00:00 2 idle hopper[046,048]
biocomp up 7-00:00:00 1 alloc hopper052


From the output above we can see that one node (taos02) is allocated under a normal partition. Similarly, we can see that two nodes (taos01 and taos09) are in a mixed state meaning multiple users have resources allocated on the same node. The final line in the output shows that all other nodes (taos03-08) are currently idle.

`sinfo –N –l` provides more detailed information about individuals nodes including CPU count, memory, temporary disk space and so on.
From the output above, we can see that 2 nodes on the general partition are allocated, and 8 are idle. The corresponding node id's are also listed. In this case, hopper002 and hopper003 specifically are the ones which are allocated on the general partition.

`sinfo –N –l` provides more detailed information about each individual node, including CPU count, memory, temporary disk space and so on.

```
user@taos:~$ sinfo -N -l
Tue Feb 19 19:08:16 2019
[rdscher@hopper ~]$ sinfo -N -l
Wed Jun 05 05:58:34 2024
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
taos01 1 normal* mixed 80 2:20:2 386868 690861 10 (null) none
taos02 1 normal* allocated 40 2:10:2 64181 309479 1 (null) none
taos03 1 normal* idle 40 2:10:2 64181 309479 1 (null) none
taos04 1 normal* idle 40 2:10:2 64181 358607 1 (null) none
taos05 1 normal* idle 40 2:10:2 64181 309479 1 (null) none
taos06 1 normal* idle 40 2:10:2 64181 309479 1 (null) none
taos07 1 normal* idle 40 2:10:2 64181 309479 1 (null) none
taos08 1 normal* idle 40 2:10:2 64181 309479 1 (null) none
taos09 1 normal* mixed 40 2:10:2 103198 6864803 20 (null) none
```

More information regarding `sinfo` can be found by typing `man sinfo` at the command prompt while logged in to Taos.
hopper001 1 general* idle 32 2:16:1 95027 0 1 (null) none
hopper002 1 general* allocated 32 2:16:1 95027 0 1 (null) none
hopper003 1 general* allocated 32 2:16:1 95027 0 1 (null) none
hopper004 1 general* idle 32 2:16:1 95027 0 1 (null) none
hopper005 1 general* idle 32 2:16:1 95027 0 1 (null) none
hopper006 1 general* idle 32 2:16:1 95027 0 1 (null) none
hopper007 1 general* idle 32 2:16:1 95027 0 1 (null) none
hopper008 1 general* idle 32 2:16:1 95027 0 1 (null) none


More information regarding `sinfo` can be found by typing `man sinfo` at the command prompt while logged in to Hopper.

`squeue` provides information regarding currently running jobs and the resources allocated to those jobs.

```
user@taos:~$ squeue
[rdscher@hopper ~]$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
22632 normal BinPacke username PD 0:00 1 (Resources)
22548 normal minia username R 1-07:30:18 1 taos09
22562 normal unicycle username R 22:34:59 1 taos02
22567 normal sspace username R 22:00:50 1 taos01
22576 normal megahit username R 7:22:53 1 taos09
```
2554110 condo 200-7.00 lzhang R 4:52:39 1 hopper050
2462342 debug ethylben eattah PD 0:00 8 (PartitionNodeLimit)
2548555 general Ag-freq yinrr R 1-20:31:48 2 hopper[002-003]


The output from `squeue`shows you the JobID, the type of partition, the name of the job, which user owns the job, the total elapsed time of the job, how many nodes are allocated to that job, and which nodes those are.

There are additional flags you can add to the squeue command to help parse the squeue output for the information you need.
- `squeue --me` will only show you your jobs in the queue.
- `squeue -p general` will only show you jobs in the general partition
A list of all flags can be found with the `man squeue` command.

To cancel a job, you can use `scancel <JOBID>` where `<JOBID>` refers to the JobID assigned to your job by Slurm.
To cancel a job, you can use `scancel <JOBID>` where `<JOBID>` refers to the JobID assigned to your job by Slurm. Similar to the squeue command, you can also use the `scancel --me` to cancel all jobs you have scheduled.

## Slurm Job Submission

To submit a job in slurm you do so by submitting a shell script that outlines the resources you are requesting from the scheduler, the software needed for your job, and the commands you wish to run. The beginning of your submission scrip usually contains the #Hashbang specifying which interpreter should be used for the rest of the script, in this case we are using a `bash` shell as indicated by the code `#!/bin/bash`. The next portion of your submission script tells Slurm what resources you are requesting and is always preceeded by `#SBATCH` followed by flags for various parameters detailed below.


Example of a Slurm submission script : `slurm_submission.sh`
Example of a Slurm submission script : `slurm_submission.slurm`


```
#!/bin/bash
#
#SBATCH --job-name=demo
Expand All @@ -66,27 +77,23 @@ Example of a Slurm submission script : `slurm_submission.sh`
#SBATCH --ntasks=4
#SBATCH --time=00:10:00
#SBATCH --mem-per-cpu=100
#SBATCH --partition=partition_name
#SBATCH --partition=general

srun hostname
srun sleep
```

The above script is requesting from the scheduler an allocation of 4 nodes for 10 minutes with 100MB of ram per CPU. Note that we are requesting resources for four tasks, `--ntasks=4`, but not four nodes specifically. The default behavior of the scheduler is to provide one node per task, but this can be changed with the `--cpus-per-task` flag. Once the scheduler allocates the requested resources the job starts to run and the commands not preceeded by `#SBATCH` are interpreted and executed. The script first executes the `srun hostname` followed by `srun sleep` command.
The above script will request 4 cpu cores with 100MB of memory per cpu core. It will also choose the general partition. When we request in this way, the scheduler will give us the first 4 cpu cores it can find, whether that ends up being 1 cpu core per node across 4 nodes, or all 4 cores on the same node. You can add additional constraints, such as `ntasks-per-node`. Keep in mind that the more constraints you add, the longer your queue times may be. It might be the case that 4 cores spread across nodes becomes free instantly, while asking for all of those cores to be on the same node could take significantly longer.

The arguments `–-job-name` and `–-output` correspond to name of the job you are submitting and the name of the output file where the any output not defined by the program being executed is saved. For example, anything printed to `stdout` will be saved in your `--output` file.

Of note here is the `--partition=partition_name` (or `-p partition_name`) command. This command specifies which partition, or queue, to submit your job to. If you are a member of a specific partition you likely are aware of the name of your partition, however you can see which partition you have access to with the `sinfo` command. If you leave this blank you will be submitted to the default or community partition.
Of note here is the `--partition=general` (or `-p general`) command. This command specifies which partition, or queue, to submit your job to. If you are a member of a specific partition you likely are aware of the name of your partition, however you can see which partition you have access to with the `sinfo` command. If you leave this blank you will be submitted to the default or community partition.

To submit the job you execute the `sbatch` command followed by the name of your submission script, for example:

`sbatch submission.sh`
`sbatch submission.slurm`

Once you execute the above command the job is queued until the requested resources are available for to be allocated to your job.

Once your job is submitted you can use `sstat` command to see information about memory usage, CPU usage, and other metrics related to the jobs you own.


Below is an example of a Slurm submission script that runs a small python program that takes an integer as an argument, creates a random number matrix with the dimensions defined by the integer you provided, then inverts that matrix and writes it to a CSV file.

Below is our small python program named `demo.py` that we will be invoking.
Expand All @@ -113,7 +120,7 @@ Below is our small python program named `demo.py` that we will be invoking.
numpy.savetxt("%d.csv" % args.matrix, out, delimiter=",")


Below is the Slurm submission script to submit our python program named `submission_python.sh`. This job can be submitted by typing `sbatch submission_python.sh` at the command prompt. Note the `module load` command that loads the software environment that contains the `numpy` package necessary to run our program.
Below is the Slurm submission script to submit our python program named `submission_python.slurm`. This job can be submitted by typing `sbatch submission_python.slurm` at the command prompt. Note the `module load` command that loads the software environment that contains the `numpy` package necessary to run our program.

#!/bin/bash
#
Expand All @@ -123,12 +130,11 @@ Below is the Slurm submission script to submit our python program named `submiss
#SBATCH --ntasks=4
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH --partition=ceti
#SBATCH --partition=debug

module load anaconda3
module load miniconda3
python demo.py 34

This brief tutorial should provide the basics necessary for submitting jobs to the Slurm Workload Manager on CARC machines.



*This quickbyte was validated on 6/5/2024*