Difference between revisions of "SLURM Job Management (Queueing) System"


Revision as of 14:29, 27 February 2017

The new system that will manage user jobs on CARL and EDDY is SLURM (formerly known as the Simple Linux Utility for Resource Management).


SLURM is a free and open-source job scheduler for Linux and Unix-like kernels and is used on about 60% of the world's supercomputers and computer clusters. If you used the job scheduler of FLOW and HERO (Sun Grid Engine or SGE), it will be easy to get used to SLURM because the concept of SLURM is quite similar.

SLURM provides three key functions:

  • it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work
  • it provides a framework for starting, executing and monitoring work (typically a parallel job) on a set of allocated nodes
  • it arbitrates contention for resources by managing a queue of pending work


Submitting Jobs

The following lines of code are an example job script. All it does is generate random numbers, save them in random_numbers.txt and sort them afterwards.

#!/bin/bash

#SBATCH --nodes=1                    
#SBATCH --ntasks=1                  
#SBATCH --mem=2G                  
#SBATCH --time=0-2:00                
#SBATCH --output=slurm.%j.out        
#SBATCH --error=slurm.%j.err          
#SBATCH --mail-type=END,FAIL         
#SBATCH --mail-user=your.name@uol.de 

for i in {1..100000}; do
    echo $RANDOM >> random_numbers.txt
done

sort random_numbers.txt

This sbatch script (or "job script") sets general options for sbatch. All options must be given on lines starting with "#SBATCH", and these lines must come before any executable commands.
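The ordering rule matters because sbatch stops reading "#SBATCH" lines once it reaches the first executable command, so a directive placed further down has no effect. The following is a minimal sketch of a check for this, not part of Slurm itself; the function name check_sbatch_order is hypothetical, and it simply treats every line that is neither blank nor a comment as an executable command.

```shell
# Hypothetical helper (not a Slurm tool): report #SBATCH lines that appear
# after the first executable command in a job script.
# Returns 0 if all directives are in place, 1 otherwise.
check_sbatch_order() {
    awk '
        /^#SBATCH/ {
            if (seen_cmd) {
                printf "line %d: directive after first command: %s\n", NR, $0
                bad = 1
            }
            next
        }
        /^[ \t]*(#|$)/ { next }   # shebang, other comments and blank lines
        { seen_cmd = 1 }          # anything else counts as a command
        END { exit bad + 0 }
    ' "$1"
}
```

Running check_sbatch_order myscript.job before submitting prints nothing when the script is laid out like the example above.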

To submit your job, use the following command:

sbatch -p carl.p myscript.job (if your script is named "myscript.job", of course)
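On success, sbatch confirms the submission with a line of the form "Submitted batch job <jobid>". If you want to reuse the job ID, for example to find the matching output file, it can be picked out of that message. This is a minimal sketch; the function name job_id_from_sbatch is hypothetical and the job ID 12345 is made up for illustration.

```shell
# Hypothetical helper: read the confirmation message of sbatch on stdin
# and print only the job ID (the fourth word of the message).
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Stand-alone demonstration with a made-up message:
echo "Submitted batch job 12345" | job_id_from_sbatch

# Usage on the cluster (needs Slurm, therefore commented out here):
# jobid=$(sbatch -p carl.p myscript.job | job_id_from_sbatch)
# echo "Output of job $jobid will be written to slurm.$jobid.out"
```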

You have to add a partition to the sbatch command (with "-p"). For tutorial purposes we are using the "carl.p" partition; when submitting real jobs, you should always specify a partition that fits your needs. You can see all available partitions with the command

sinfo

Further information about the command "sinfo" can be found here: sinfo

Information on sbatch-options

The options in the example script shown above are common and should be used in all of your scripts (the mail options are optional).

--nodes=<minnodes[-maxnodes]> or -N
With this option you request the nodes needed for your job. You can specify a minimum and a maximum number of nodes. If you specify only one number, it is used as both the minimum and the maximum node count. If the node limit defined in the job script is outside the range permitted for its associated partition, the job will be left in a PENDING state. If it exceeds the actual number of nodes configured in the partition, the job will be rejected.
--ntasks=<number> or -n
Instead of launching tasks, sbatch requests an allocation of resources and submits a batch script. This option advises the Slurm controller that job steps run within the allocation will launch a maximum of <number> tasks, so that it can provide sufficient resources. The default value for ntasks is 1.
--mem=<MB>
You can specify the real memory required per node in megabytes. To keep the numbers small you can use the suffixes K (KB), M (MB), G (GB) and T (TB), e.g. --mem=2G for 2 GB of memory.
Important Note: it is no longer possible to use fractional values such as --mem=6.4G in your job script. Convert such values to MB instead: --mem=6400M.
--mem-per-cpu=<MB>
Use this to specify the minimum memory required per allocated CPU in megabytes.
--time=<time> or -t
Use this to set a limit on the total runtime of the job allocation. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
--output=<filename pattern> or -o and --error=<filename pattern> or -e
By default, both standard output and standard error are directed to the same file, named "slurm-%j.out", where %j is replaced by the job ID. With these options, you instruct Slurm to connect the batch script's standard output and standard error directly to the files named by the respective "filename pattern".
--mail-type=<type>
It is possible to notify the user by email when certain event types occur. Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL, STAGE_OUT, TIME_LIMIT, TIME_LIMIT_90 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80 percent of time limit), TIME_LIMIT_50 (reached 50 percent of time limit) and ARRAY_TASKS (send emails for each array task). Multiple type values may be specified by separating them with commas.
--mail-user=<user>
Define a user to receive email notification of state changes as defined by --mail-type.
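The values for --mem and --time are easy to get wrong, so it can help to compute them. The sketch below shows two small helpers; both function names are hypothetical and neither is part of Slurm. gb_to_mem turns a fractional number of gigabytes into a whole number of megabytes, using the factor 1000 from the example above by default (this factor is an assumption; pass 1024 as the second argument if you want binary units). time_to_seconds reads a value in any of the time formats listed above and prints the number of seconds it stands for.

```shell
# Hypothetical helper: convert (possibly fractional) GB into a --mem value in MB.
# $1: amount in GB, $2: MB per GB (assumed default: 1000, as in 6.4 GB -> 6400M)
gb_to_mem() {
    awk -v gb="$1" -v factor="${2:-1000}" \
        'BEGIN { printf "%dM\n", gb * factor + 0.5 }'
}

# Hypothetical helper: convert a --time value into seconds.
time_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0; h = 0; m = 0; s = 0
        spec = $0
        dash = index(spec, "-")
        if (dash > 0) {
            # days-hours, days-hours:minutes, days-hours:minutes:seconds
            days = substr(spec, 1, dash - 1)
            n = split(substr(spec, dash + 1), f, ":")
            h = f[1]
            if (n >= 2) m = f[2]
            if (n >= 3) s = f[3]
        } else {
            # minutes, minutes:seconds, hours:minutes:seconds
            n = split(spec, f, ":")
            if (n == 1)      { m = f[1] }
            else if (n == 2) { m = f[1]; s = f[2] }
            else             { h = f[1]; m = f[2]; s = f[3] }
        }
        print days * 86400 + h * 3600 + m * 60 + s
    }'
}

gb_to_mem 6.4            # value to use with --mem
time_to_seconds 0-2:00   # run time requested by the example script
```

Note that a value such as 2:00 without a day prefix is read as 2 minutes, whereas 0-2:00 in the example script means 2 hours.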

This is just a small part of all possible options. A complete list with explanations can be found on the Slurm homepage.

Documentation

If you want to learn more about the SLURM Management System you can visit the documentation page on the official homepage of SLURM.