Interactive Jobs

Interactive jobs should not be used unless needed for a specific reason. If you have to use an interactive session on the compute nodes, please keep it short (do not request more than 8h per session) and log out as soon as you are done with the interactive work!

Interactive Login

Method 1: Start a bash shell on a compute node and reserve four CPUs/cores (with default run time and memory):

hpcl001$ srun --pty -p carl.p --ntasks=1 --cpus-per-task=4 bash
mpcs001$ <execute commands on the compute node>

Of course, you can change the requested resources with the usual command-line options, e.g. to request more memory or even GPUs (with --gres=gpu:1 and the appropriate partition). A typical use case could be the extensive testing of a thread-parallel program.
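For example, a session with more memory, a longer run time, and an explicit thread count could look like this (a sketch only; the memory and time values and the program name are illustrative placeholders, not recommendations):

hpcl001$ srun --pty -p carl.p --ntasks=1 --cpus-per-task=4 --mem=8G --time=2:00:00 bash
mpcs001$ export OMP_NUM_THREADS=4    # match the number of requested cores
mpcs001$ ./my_threaded_program       # placeholder for your own test program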

Method 2: Allocate resources on the compute nodes and then run applications interactively with srun:

hpcl001$ salloc -p carl.p --nodes=2 --ntasks-per-node=24
hpcl001$ srun ./mpi_program [options]

With the combination of salloc and srun, you can launch MPI-parallel applications interactively, which again can be useful for testing. As before, you can change the requested resources, such as memory and run time, in the usual way. srun replaces the perhaps more familiar mpirun here and ensures that the MPI program runs on the allocated resources. Please note that Intel MPI requires you to set

hpcl001$ export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/current/lib64/libpmi.so

to work properly (see also Parallel Jobs).
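Putting it together, an interactive Intel MPI test session could look like this (a sketch only; the module name and the program name are placeholders and will depend on your environment):

hpcl001$ salloc -p carl.p --nodes=2 --ntasks-per-node=24
hpcl001$ module load intel/impi        # load your Intel MPI environment (module name may differ)
hpcl001$ export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/current/lib64/libpmi.so
hpcl001$ srun ./mpi_program [options]
hpcl001$ exit                          # exiting the salloc shell releases the allocation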

Interactive Login with X11-forwarding

If you need graphical output during an interactive session, things become more complicated. SLURM does not natively support X11-forwarding for jobs running on the compute nodes. Here is a workaround that can be used:

First, log in to the cluster with X11-forwarding enabled:

ssh -X abcd1234@carl.hpc.uni-oldenburg.de

Next, get a copy of srun.x11 with the command

git clone https://github.com/jbornschein/srun.x11.git

which will create a directory srun.x11 with some files in it (you may want to cd to a preferred directory before using git clone). One of the files is named srun.x11, and it is recommended to modify it as follows ((-) old line, (+) new line):

(-) trap "{ /usr/bin/scancel -Q $JOB; exit; }" SIGINT SIGTERM EXIT
(+) trap "{ scancel -Q $JOB; exit; }" SIGINT SIGTERM EXIT

and

(-)    sleep 1s
(+)    sleep 2s

After that you can create an interactive session using e.g. the command

/path/to/srun.x11 -p carl.p -n 1

You can add any of the sbatch options you would like to use (e.g. --gres=gpu:1 if you also use the partition mpcg.p; see the example after the plot below). In the interactive session, this

$ module load gnuplot
$ gnuplot
> plot sin(x)

should open a window showing the plot of sin(x) on your machine (if you have problems loading the module, a module restore may help).
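As mentioned above, additional sbatch options can be appended to the srun.x11 call. For instance (a sketch, assuming you have access to the GPU partition mpcg.p), an interactive X11 session with one GPU could be requested with

/path/to/srun.x11 -p mpcg.p -n 1 --gres=gpu:1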