Large Job Arrays


Large Job Arrays in SLURM

SLURM restricts both the size and the index range of job arrays: neither the number of tasks nor the highest index in the range may exceed 1000. As a result, both

#SBATCH --array=1-1001

and

#SBATCH --array=10001-10100

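will cause an error when the job script is submitted. The first requests 1001 tasks; the second requests only 100 tasks, but its indices are larger than 1000.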

There are several possible workarounds:

  1. allocate resources to run several tasks in parallel using srun
  2. run several tasks one after another in a single job array instance
  3. submit a chain job as a task array that repeatedly calls itself
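As an illustration of the second workaround, the following is a minimal sketch in which each array instance works through a block of consecutive task numbers. The totals (10000 tasks, 10 per instance), the helper name task_range and the echo placeholder are assumptions made for this example, not part of any SLURM interface.

```bash
#!/bin/bash
#SBATCH --array=1-1000

# Hypothetical values: 10000 tasks in total, 10 per array instance.
TOTAL_TASKS=10000
TASKS_PER_INSTANCE=10

# Print "first last": the task numbers handled by array instance $1,
# given $2 tasks per instance and $3 tasks in total.
task_range() {
    local instance=$1 per=$2 total=$3
    local first=$(( (instance - 1) * per + 1 ))
    local last=$(( instance * per ))
    if (( last > total )); then
        last=$total          # the final instance may get a shorter block
    fi
    echo "$first $last"
}

# Fall back to instance 1 outside SLURM so the script can be dry-run.
read -r first last < <(task_range "${SLURM_ARRAY_TASK_ID:-1}" \
                                  "$TASKS_PER_INSTANCE" "$TOTAL_TASKS")

for (( task = first; task <= last; task++ )); do
    echo "running task $task"    # placeholder for the real command
done
```

With these values, array instance 1 handles tasks 1 to 10 and instance 1000 handles tasks 9991 to 10000, so the array itself stays within the limit of 1000.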

All of these methods require some extra work, but each produces the desired outcome.
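The chain job idea might look roughly like the sketch below: each submission covers up to 1000 tasks starting at an offset, and one array instance submits the next link of the chain. The script name chain_job.sh, the variable OFFSET, the helper next_offset and the totals are all hypothetical; the sketch also assumes the script is resubmitted from the directory that contains it.

```bash
#!/bin/bash
#SBATCH --array=1-1000

# Hypothetical values: 2500 tasks in total, at most 1000 per submission.
TOTAL_TASKS=2500
CHUNK=1000
SCRIPT=chain_job.sh          # assumed name of this script
OFFSET=${OFFSET:-0}          # tasks already covered by earlier links

# Print the offset for the next link, or nothing if the chain is complete.
next_offset() {
    local offset=$1 chunk=$2 total=$3
    local next=$(( offset + chunk ))
    if (( next < total )); then
        echo "$next"
    fi
}

# Map the array index to the real task number; default to 1 for a dry run.
task=$(( OFFSET + ${SLURM_ARRAY_TASK_ID:-1} ))
if (( task <= TOTAL_TASKS )); then
    echo "running task $task"    # placeholder for the real command
fi

# Only array instance 1 resubmits, so each link is submitted exactly once.
next=$(next_offset "$OFFSET" "$CHUNK" "$TOTAL_TASKS")
if [[ -n $next && ${SLURM_ARRAY_TASK_ID:-} == 1 ]]; then
    sbatch --export=ALL,OFFSET="$next" "$SCRIPT"
fi
```

In this sketch the last link still requests 1000 array instances, and instances whose task number exceeds the total simply do nothing. Run outside SLURM, the script only prints the placeholder line and never calls sbatch.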