This page has information on how to use Slurm to submit, manage, and analyze jobs.
...
This page itself is modeled after the excellent CÉCI Slurm tutorial.
Gathering Information
Slurm offers a variety of commands to query the nodes, providing a snapshot of the overall computational ecosystem, a list of jobs in progress or queued, and more.
sinfo
The sinfo command lists available partitions and some basic information about each. A partition is a logical grouping of physical compute nodes. Running sinfo produces output similar to the following; the list is dynamic, representing a current snapshot of which partitions are available, which systems comprise each partition, and the availability of those systems:
```
[jsimms1@strelka ~]$ sinfo
PARTITION    AVAIL  TIMELIMIT  NODES  STATE  NODELIST
compute*     up     infinite       4  alloc  himem[02-03],node[01,06]
compute*     up     infinite       5  resv   himem01,node[02-03,05,07]
compute*     up     infinite       1  mix    node04
himem        up     infinite       2  alloc  himem[02-03]
himem        up     infinite       1  resv   himem01
hicpu        up     infinite       1  mix    hicpu01
gpu          up     infinite       2  mix    gpu[01-02]
interactive  up     5:00:00        2  mix    gpu[01-02]
```
squeue
The squeue command displays a list of jobs that are currently running (denoted with R) or pending (denoted with PD). Here is example output:
...
Creating and Submitting Jobs
Slurm offers two primary ways to submit jobs to the compute nodes: interactive and batch. Interactive is the simpler method, but its usefulness is somewhat limited; it is generally used for working with software interactively. Batch is more complex and requires greater planning, but it is by far the most common way to use Slurm and provides a great deal of flexibility and power.
Interactive
Command line
The simplest way to connect to a set of resources is to request an interactive shell, which can be accomplished with the salloc command. Here is a basic example:
...
If a virtual desktop is preferred, or is required to run a GUI program, a second option is to request an interactive session through Open OnDemand.
Batch
The most common way to work with Slurm is to submit batch jobs and allow the scheduler to manage which resources are used, and at which times. So, what then, exactly, is a job? A job has two separate parts:
- a resource request, which requests things like required cores, memory, GPUs, etc.
- a list of one or more job steps, which are basically the individual commands to be run sequentially to perform the actual tasks of the job.
The best way to manage these two parts is within a single submission script that Slurm uses to allocate resources and process your job steps. Here is an extremely basic sample submission script (we’ll name it sample.sh):
```
#!/bin/bash
#SBATCH --job-name=sample
#SBATCH --output=/home/username/samplejob/output/output_%j.txt
#SBATCH --partition=unowned
#SBATCH --time=1:00:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=100mb
#SBATCH --mail-type=BEGIN,END,FAIL,REQUEUE
#SBATCH --mail-user=username@swarthmore.edu

cd $HOME/samplejob
srun my_code.sh
```
Following the first (shebang) line are any number of #SBATCH directives, which handle the resource request and other data (e.g., job name, output file location, and many other options) associated with your job. These must all appear at the top of the file, before any job steps. In this file, the following #SBATCH directives define the job:
Setting | Meaning | Value |
---|---|---|
#SBATCH --job-name=sample | Provide a short-ish descriptive name for your job | sample |
#SBATCH --output=/home/username/samplejob/output/output_%j.txt | Where to save output from the job; note that any content that normally would be output to the terminal will be saved in the file. | /home/username/samplejob/output/output_%j.txt (%j will be replaced by the job number assigned by Slurm; note that Slurm will default to producing an output file in the directory from which the job is submitted) |
#SBATCH --partition=unowned | Which partition to use | unowned |
#SBATCH --time=1:00:00 | Time limit of the job | 1:00:00 (one hour) |
#SBATCH --ntasks=1 | Number of tasks (one CPU core each, by default) to request | 1 (this can be increased if your code can leverage additional cores) |
#SBATCH --mem-per-cpu=100mb | How much memory to request | 100mb (this is per-core and can be expressed in gb, etc.) |
#SBATCH --mail-type=BEGIN,END,FAIL,REQUEUE | Decide when to receive an email | BEGIN,END,FAIL,REQUEUE (this will send an email when the job actually starts running, when it ends, if it fails, and if the job is requeued) |
#SBATCH --mail-user=username@swarthmore.edu | Email address | username@swarthmore.edu |
After the parameters are set, the commands to run the code are added. Note that this is effectively a modified shell script, so any commands that work in such scripts will typically work here. It is important, however, to precede any actual commands with srun.
Parallel batch
While serial or interactive jobs are incredibly useful, one of the greatest advantages that working on a cluster provides is relatively simple management of parallel tasks. You can, with a comparatively simple script, manage thousands of runs of a program over several hours or days, or run the same program, using a different input file each time, automatically. Many types of parallel jobs are possible, such as parameter sweeps, MPI, and OpenMP. Please note that the examples below are, indeed, merely examples, and are meant to be as basic as possible to demonstrate what is possible. Actual scripts could be much more complex, depending on a particular use case.
You can use the --array flag in the submission script to generate a job array and to populate a special variable ($SLURM_ARRAY_TASK_ID) that passes an integer to your program, which might control various options. The following code would run eight distinct jobs: my_script 1, my_script 2, etc., but note that they could execute in any order (e.g., 2, 3, 1, 4, 5, 6, 8, 7).
```
#!/bin/bash
#
#SBATCH --job-name=param_sweep
#SBATCH --output=param_sweep_output.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100mb
#
#SBATCH --array=1-8

srun ./my_script $SLURM_ARRAY_TASK_ID
```
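Since each array task simply receives a different value of $SLURM_ARRAY_TASK_ID, you can preview what the eight tasks would run with plain bash (no Slurm required; the loop below merely stands in for the eight independent tasks):

```shell
#!/bin/bash
# Simulate the eight array tasks: Slurm sets SLURM_ARRAY_TASK_ID to one
# value from the --array=1-8 range in each task's environment.
for SLURM_ARRAY_TASK_ID in $(seq 1 8)
do
    echo "task $SLURM_ARRAY_TASK_ID runs: ./my_script $SLURM_ARRAY_TASK_ID"
done
```

On the cluster, of course, the tasks run concurrently rather than in a fixed sequence.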
This approach can also be used to process multiple data files. The FILES= assignment creates a bash array of all files in a particular directory, whose elements you can then pass to my_script one at a time by indexing with the current integer value of $SLURM_ARRAY_TASK_ID:
```
#!/bin/bash
#
#SBATCH --job-name=param_sweep
#SBATCH --output=param_sweep_output.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100mb
#
#SBATCH --array=0-7

FILES=(/path/to/data/*)
srun ./my_script ${FILES[$SLURM_ARRAY_TASK_ID]}
```
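Note that bash arrays are zero-indexed, which is why eight files call for --array=0-7 rather than 1-8. A quick local illustration, using a few hypothetical filenames rather than a real data directory:

```shell
#!/bin/bash
# Bash arrays are zero-indexed: element 0 is the first file.
FILES=(a.dat b.dat c.dat)
echo "count: ${#FILES[@]}"   # number of elements
echo "first: ${FILES[0]}"
echo "third: ${FILES[2]}"
```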
A similar approach allows you to pass non-integer values to your program; non-numeric arguments can be passed simply by populating them within an ARGS array: ARGS=("red" "blue" "green"). In addition, it is possible to use non-sequential integers: --array=0,3,4,9-22 or --array=0-12:4 (equivalent to --array=0,4,8,12):
```
#!/bin/bash
#
#SBATCH --job-name=param_sweep
#SBATCH --output=param_sweep_output.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100mb
#
#SBATCH --array=0-2

ARGS=(0.05 0.76 1.28)
srun ./my_script ${ARGS[$SLURM_ARRAY_TASK_ID]}
```
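The step syntax is easy to sanity-check locally: --array=0-12:4 means "from 0 to 12 in steps of 4," the same sequence that seq produces with a stride (this is just an illustration in plain bash, not a Slurm command):

```shell
#!/bin/bash
# The task IDs generated by --array=0-12:4 are 0, 4, 8, 12,
# i.e., the same sequence as:
seq 0 4 12
```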
If the running time of an individual job is about 10 minutes or less, however, using a job array may introduce unnecessary overhead; instead, you can loop through files manually. This code will loop through all files within a given directory, processing up to eight at a time:
```
#!/bin/bash
#
#SBATCH --ntasks=8

for file in /path/to/data/*
do
    srun -n1 --exclusive ./my_script $file &
done
wait
```
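The &/wait pattern itself is plain bash rather than anything Slurm-specific; here is the same structure with echo standing in for srun -n1 --exclusive, which you can run anywhere (the filenames are placeholders):

```shell
#!/bin/bash
# Fan out background jobs, then block until all of them finish.
for file in alpha.dat beta.dat gamma.dat
do
    echo "processing $file" &   # stand-in for: srun -n1 --exclusive ./my_script $file &
done
wait                            # do not exit until every background job is done
echo "all files processed"
```

With real srun, the --ntasks=8 allocation is what limits concurrency: job steps beyond eight are queued until a task slot frees up.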
A variant allows you to send in, e.g., integers to your program, again as an example here 8 at a time:
...
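One plausible shape for such a variant (a sketch only; the script name and the integer range are placeholders) mirrors the file loop above, substituting a sequence of integers for the list of files:

```shell
#!/bin/bash
#
#SBATCH --ntasks=8

# Process the integers 1..64, up to eight job steps at a time;
# the --ntasks=8 allocation queues any steps beyond eight.
for i in $(seq 1 64)
do
    srun -n1 --exclusive ./my_script $i &
done
wait
```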
...
Please see this page for information about submitting batch or array jobs.
Submit a job to the queue
...