SLURM Job Management (Queueing) System
The new system that will manage user jobs on CARL and EDDY is SLURM (formerly known as Simple Linux Utility for Resource Management). SLURM is a free and open-source job scheduler for Linux and Unix-like operating systems and is used on about 60% of the world's supercomputers and computer clusters. If you have used the job scheduler of FLOW and HERO (Sun Grid Engine, or SGE), it will be easy to get used to SLURM because its concepts are quite similar.
SLURM provides three key functions:
- it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work
- it provides a framework for starting, executing and monitoring work (typically a parallel job) on a set of allocated nodes
- it arbitrates contention for resources by managing a queue of pending work
Submitting Jobs
The following lines are an example job script. It generates random numbers, saves them in random_numbers.txt, and then sorts them.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
#SBATCH --output=slurm.%j.out
#SBATCH --error=slurm.%j.err
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=your.name@uol.de

# Append 100000 random numbers to random_numbers.txt
for i in {1..100000}; do
    echo $RANDOM >> random_numbers.txt
done

# Sort the numbers (the sorted list goes to standard output)
sort random_numbers.txt
This batch script (or "job script") is used to set general options for sbatch. The options must be given on lines starting with "#SBATCH", and these lines must appear before any executable command.
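The rule that "#SBATCH" lines must precede any executable command can be illustrated with a small shell function. This is only a sketch: list_sbatch_options is a hypothetical helper and not part of Slurm, and it assumes that every directive starts with "#SBATCH" followed by a single space.

```shell
# Print the #SBATCH options of a job script, reading directives only
# until the first executable command, as the rule above describes.
# list_sbatch_options is a hypothetical helper, not a Slurm command.
list_sbatch_options() {
    local line prefix
    prefix='#SBATCH '
    while IFS= read -r line; do
        case "$line" in
            "$prefix"*) printf '%s\n' "${line#"$prefix"}" ;;
            '#'*)       ;;        # shebang or ordinary comment: keep scanning
            '')         ;;        # empty line: keep scanning
            *)          break ;;  # first executable command: stop
                                  # (simplification: a whitespace-only line
                                  # also ends the scan here)
        esac
    done < "$1"
}
```

Calling list_sbatch_options myscript.job on a script prints one option per line. An "#SBATCH" line placed after the first command is not printed, which is a quick way to spot directives that ended up too far down in the script.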
To submit your job, use the following command:
sbatch myscript.job (if your script is named "myscript.job", of course)
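When a job is accepted, sbatch normally replies with a line of the form "Submitted batch job <id>". The sketch below extracts the job ID from such a reply so that it can be reused in later commands. parse_job_id is a hypothetical helper, and the exact reply format is an assumption that may differ on your installation.

```shell
# Extract the numeric job ID from an sbatch reply such as
# "Submitted batch job 12345". parse_job_id is a hypothetical helper.
parse_job_id() {
    local id
    case "$1" in
        'Submitted batch job '[0-9]*)
            id=${1#'Submitted batch job '}      # drop the fixed prefix
            printf '%s\n' "${id%%[!0-9]*}" ;;   # keep only the leading digits
        *)
            return 1 ;;                         # not a submission message
    esac
}

# On the cluster this could be used as follows (not run here):
#   jobid=$(parse_job_id "$(sbatch myscript.job)")
#   squeue -j "$jobid"
```

If you only need the ID, sbatch also offers the --parsable option, which prints the job ID without the surrounding text.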
Information on sbatch-options
The options in the example script shown above are common and should be used in all of your scripts (except the mail option).
- --nodes=<minnodes[-maxnodes]> or -N
- With this option you request the nodes needed for your job. You can specify a minimum and a maximum number of nodes. If you specify only one number, it is used as both the minimum and the maximum node count. If the node limit defined in the job script is outside the range permitted for its associated partition, the job will be left in a PENDING state. If it exceeds the number of nodes configured in the partition, the job will be rejected.
- --ntasks=<number> or -n
- Instead of launching tasks, sbatch requests an allocation of resources and submits a batch script. This option advises the Slurm controller that job steps run within the allocation will launch a maximum of <number> tasks, so that it can provide sufficient resources. The default value for ntasks is 1.
- --mem=<MB>
- You can specify the real memory required per node. The default unit is megabytes; to keep the numbers small you can use the suffixes K (KB), M (MB), G (GB) and T (TB), e.g. --mem=2G for 2 GB of memory.
- --time=<time> or -t
- Use this to set a limit on the total runtime of the job allocation. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
- --output=<filename pattern> or -o and --error=<filename pattern> or -e
- By default, both standard output and standard error are directed to the same file, named "slurm-%j.out", where %j is replaced by the job ID. With these options, you instruct Slurm to connect the batch script's standard output (--output) and standard error (--error) directly to the files named by the respective "filename pattern".
- --mail-type=<type>
- It is possible to notify the user by email when certain event types occur. Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL, STAGE_OUT, TIME_LIMIT, TIME_LIMIT_90 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80 percent of time limit), TIME_LIMIT_50 (reached 50 percent of time limit) and ARRAY_TASKS (send emails for each array task). Multiple type values may be specified by separating them with commas.
- --mail-user=<user>
- Define a user to receive email notification of state changes as defined by --mail-type.
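The value formats described in the list above (memory suffixes for --mem, time formats for --time, and the %j placeholder in file names) can be sketched as small shell helpers. None of these functions is part of Slurm; the names are hypothetical, the memory conversion assumes that each suffix is a factor of 1024, and only the %j placeholder is handled.

```shell
# Remove leading zeros so that "08" is not read as an octal number.
strip_zeros() {
    local n
    n=${1#"${1%%[!0]*}"}
    printf '%s\n' "${n:-0}"
}

# Convert a --mem value such as 2G to megabytes (no suffix means MB).
# Assumption: K, M, G and T differ by factors of 1024.
mem_to_mb() {
    local number suffix
    number=${1%[KMGT]}
    suffix=${1#"$number"}
    case "$number" in ''|*[!0-9]*) return 1 ;; esac
    number=$(strip_zeros "$number")
    case "${suffix:-M}" in
        K) echo $(( number / 1024 )) ;;
        M) echo "$number" ;;
        G) echo $(( number * 1024 )) ;;
        T) echo $(( number * 1024 * 1024 )) ;;
    esac
}

# Convert a --time value in any of the listed formats to seconds.
slurm_time_to_seconds() {
    local spec days hours minutes seconds rest
    spec=$1; days=0; hours=0; minutes=0; seconds=0
    case "$spec" in
        *-*)    # days-hours[:minutes[:seconds]]
            days=${spec%%-*}; rest=${spec#*-}
            case "$rest" in
                *:*:*) hours=${rest%%:*}; rest=${rest#*:}
                       minutes=${rest%%:*}; seconds=${rest#*:} ;;
                *:*)   hours=${rest%%:*}; minutes=${rest#*:} ;;
                *)     hours=$rest ;;
            esac ;;
        *:*:*)  # hours:minutes:seconds
            hours=${spec%%:*}; rest=${spec#*:}
            minutes=${rest%%:*}; seconds=${rest#*:} ;;
        *:*)    # minutes:seconds
            minutes=${spec%%:*}; seconds=${spec#*:} ;;
        *)      # minutes
            minutes=$spec ;;
    esac
    days=$(strip_zeros "$days");       hours=$(strip_zeros "$hours")
    minutes=$(strip_zeros "$minutes"); seconds=$(strip_zeros "$seconds")
    echo $(( days * 86400 + hours * 3600 + minutes * 60 + seconds ))
}

# Replace every %j in a filename pattern with the given job ID.
expand_job_pattern() {
    printf '%s\n' "$1" | sed "s/%j/$2/g"
}
```

For the values used in the example job script, mem_to_mb 2G gives 2048, slurm_time_to_seconds 0-2:00 gives 7200 (two hours), and expand_job_pattern 'slurm.%j.out' 12345 gives slurm.12345.out.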
This is just a small part of all possible options. A complete list with explanations can be found on the Slurm homepage: sbatch