Information on used Resources

Getting Information about Job Resources

It can be useful to know how many resources a job has actually used, for example to adjust the resource request of a similar follow-up job. The sacct command can be used for this purpose. Information about a specific job is obtained with the option -j <job-id>, e.g.:

$ sacct -j 21400
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode 
------------ ---------- ---------- ---------- ---------- ---------- -------- 
21400           g09test     carl.p                     8  COMPLETED      0:0 
21400.batch       batch                                8  COMPLETED      0:0

The command returns information about the job itself (first line) and about its individual job steps (following lines). A job step is created by each srun command used within a job script. In addition, the job script itself is shown as a special step in the line <job-id>.batch (in the example, no srun job steps are present).

To get more detailed information, you can use the --format option. A useful command is

$ sacct -j 21400 --format=JobID,User,Node,AllocTRES%30,Elapsed,MaxRSS
       JobID      User        NodeList                      AllocTRES    Elapsed     MaxRSS 
------------ --------- --------------- ------------------------------ ---------- ---------- 
21400         abcd1234         mpcl001         cpu=8,mem=3994M,node=1   00:02:11            
21400.batch                    mpcl001         cpu=8,mem=3994M,node=1   00:02:11   4048152K 

which lists the JobID, the user name, the node list, the resources allocated for the job, the elapsed time (wallclock time), and the maximum memory used (MaxRSS). In the example, the used memory of 4048152K (=3953.3M) is just slightly below the requested amount of 3994M.
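
The comparison between used and requested memory can also be scripted. The following is a minimal sketch, not part of the original instructions: it copies the AllocTRES and MaxRSS values from the example output above into shell variables (the variable names are chosen for illustration only) and assumes that MaxRSS is reported in K and the memory request in M, as in the example.

```shell
# Sample values copied from the sacct output above (illustrative variable names)
alloc_tres="cpu=8,mem=3994M,node=1"
maxrss="4048152K"

# Extract the requested memory in M from the AllocTRES string
# (assumes the request is given with an M suffix, as in the example)
req_m=$(echo "$alloc_tres" | sed -n 's/.*mem=\([0-9]*\)M.*/\1/p')

# Convert MaxRSS from K to M (1M = 1024K); assumes a K suffix
used_m=$(awk -v k="${maxrss%K}" 'BEGIN { printf "%.1f", k / 1024 }')

# Percentage of the requested memory that was actually used
pct=$(awk -v u="$used_m" -v r="$req_m" 'BEGIN { printf "%.0f", 100 * u / r }')

echo "used ${used_m}M of ${req_m}M (${pct}%)"
```

With the values from the example, this prints "used 3953.3M of 3994M (99%)". A job that uses nearly all of its requested memory, as here, leaves little headroom, so a similar follow-up job should not request less.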