Partitions

From HPC users
Revision as of 13:57, 25 May 2020

To manage the large number of nodes on CARL and EDDY, it is important to work with partitions. Partitions optimize the way resources are allocated to users and ensure that everyone can submit jobs and receive results as fast as possible.

Since we are using SLURM as our cluster manager and job scheduling system, information about partitions can be displayed with the command:

sinfo

The command "sinfo" has many possible options. Some important ones are:

* -a, --all: Display information about all partitions, including partitions that are not available to your group and hidden partitions.
* -l, --long: Display more detailed information about the available partitions.
* -N, --Node: Display a list of every available node.
* -n <nodes>, --nodes=<nodes>: Display information about specific nodes. Multiple nodes may be comma-separated, and you can specify a range of nodes, e.g. mpcs[100-120].
* -O <output_format>, --Format=<output_format>: Specify the information you want to be displayed.
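These options can be combined on one command line. A minimal sketch (the node range mpcs[100-120] is taken from the example above and is illustrative; adjust it to the nodes you care about):

```shell
# Long format (-l), one line per node (-N), restricted to a node range (-n).
# Quote the range so the shell does not treat the brackets as a glob pattern.
sinfo -l -N -n "mpcs[100-120]"
```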
If, for example, you want to display the node hostname, the number of CPUs, the CPU load, the amount of free memory, the size of the temporary disk, and the size of memory per node (in megabytes), you could use the following command:
sinfo -O nodehost,cpus,cpusload,freemem,disk,memory
HOSTNAMES           CPUS                CPU_LOAD            FREE_MEM            TMP_DISK            MEMORY
cfdh076             24                  1.01                97568               115658              128509
...
The size of each field can be modified (syntax: "type[:[.]size]") to match your needs, for example:
sinfo -O nodehost:8,cpus:5,cpusload:8,freemem:10,disk:10,memory:8
HOSTNAMECPUS CPU_LOADFREE_MEM  TMP_DISK  MEMORY
cfdh076 24   1.01    97568     115658    128509
...


The full list and further information about the command "sinfo" can be found here: sinfo

=== Usage of the Partitions on CARL/EDDY ===

To optimize the submission, the runtime, and the overall usage for everybody using the cluster, you should always specify the right partition for your jobs. Using either the carl.p or the eddy.p partition is always a good choice. Try to avoid using the all_nodes.p partition; only use it if the other partitions don't have enough nodes and the runtime of your job doesn't exceed 1 day.
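In a job script, the partition is selected with sbatch's --partition (or -p) option. A minimal sketch, assuming the carl.p partition and the default runtime and memory values listed in the table below on this page (the workload line is a placeholder):

```shell
#!/bin/bash
#SBATCH --partition=carl.p      # combined CARL partition (mpcl.p + mpcs.p)
#SBATCH --ntasks=1
#SBATCH --time=0-02:00          # stays within the 2h default runtime
#SBATCH --mem=5000M             # matches the mpcl.p default memory

# Replace this placeholder with your actual workload.
hostname
```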

=== Partitions on CARL/EDDY ===

Using sinfo on CARL/EDDY will display an output like this:

sinfo

The columns of the sinfo output contain the following information:

# PARTITION: Name of a partition, e.g. carl.p or eddy.p.
# AVAIL: State of the partition (up or down).
# TIMELIMIT: Maximum time limit for any user job in days-hours:minutes:seconds.
# NODES: Count of nodes with this particular configuration.
# STATE: Current state of the nodes. Possible states are: allocated, completing, down, drained, draining, fail, failing, future, idle, maint, mixed, perfctrs, power_down, power_up, reserved and unknown.
# NODELIST: Names of nodes associated with this configuration/partition.

The full description of the output fields can be found here: output field

=== Partitions in Summary ===

If you just need a quick summary of the most important values, then this table might be sufficient.

{| class="wikitable"
|-
!colspan="7" style="background-color:#6B8E23;"| CARL
|-
! Partition            !! NodeType        !!style="text-align:center"| Count !!style="text-align:center"| CPUs  !!Default RunTime      !! Default Memory      !! Misc
|-
| mpcs.p              ||MPC-STD          ||style="text-align:center"| 158 ||style="text-align:center"|24      || style="text-align:center" rowspan="5"|2h    || style="text-align:center"|10 375M            ||
|-
| mpcl.p              ||MPC-LOM          ||style="text-align:center"| 128 ||style="text-align:center"|24                              ||style="text-align:center"|5 000M              ||
|-
| mpcb.p              ||MPC-BIG          ||style="text-align:center"| 30 ||style="text-align:center"|16                              ||style="text-align:center"|30G                  ||GTX 1080 (4 nodes á 2 GPUs)
|-
|mpcp.p                ||MPC-PP          ||style="text-align:center"| 2 ||style="text-align:center"|40                              ||style="text-align:center"|50G                  ||
|-
|mpcg.p                ||MPC-GPU          ||style="text-align:center"| 9 ||style="text-align:center"|24                                ||style="text-align:center"|10 375M              ||1-2x Tesla P100 GPU
|-
|carl.p                || colspan="6"| Combines mpcl.p and mpcs.p, defaults are as for mpcl.p
|-
!colspan="7" style="background-color:#6B8E23;" |EDDY
|-
|cfdl.p                ||CFD-LOM         ||style="text-align:center"| 160 ||style="text-align:center"|24      || style="text-align:center" rowspan="3"|2h    || style="text-align:center"|2 600M
|-
|cfdh.p                ||CFD-HIM          ||style="text-align:center"| 81 ||style="text-align:center"|24                                  ||style="text-align:center"|5 000M
|-
|cfdg.p                ||CFD-GPU          ||style="text-align:center"| 3 ||style="text-align:center"|24                                  ||style="text-align:center"|10G                ||    1x Tesla P100 GPU
|-
|eddy.p                || colspan="6"| Combines cfdl.p and cfdh.p, defaults are as for cfdl.p
|}
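The node counts in the table can be cross-checked against the live system. A sketch, assuming the partition and nodes output types of sinfo's --Format option:

```shell
# Show the current node count per partition; compare with the Count column above.
sinfo -O partition,nodes
```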

=== Using GPU Partitions ===

When using GPU partitions, it is mandatory to use the following GPU flag in your SLURM jobs if you need to allocate the GPUs:

* --gres=gpu:1 - This will request one GPU per node.
* --gres=gpu:2 - This will request two GPUs per node.
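Combined with a partition selection, a GPU job might look like the following sketch (the partition name, GPU count, and runtime are illustrative values; the nvidia-smi call is only a placeholder for a real GPU workload):

```shell
#!/bin/bash
#SBATCH --partition=mpcg.p   # GPU partition on CARL (use cfdg.p on EDDY)
#SBATCH --gres=gpu:1         # request one GPU on the node
#SBATCH --time=0-02:00

# Prints the allocated GPU(s); replace with your actual GPU workload.
nvidia-smi
```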

To learn more about submitting jobs, you might want to take a look at this page.