Getting Job Efficiency¶
To optimize cluster usage, your reservation should match the memory, the number of CPUs, and the expected run time that your job actually needs.
You can check the memory and CPU usage of running or completed jobs to calculate their efficiency.
Running jobs¶
For running jobs, use squeue to get the compute node name where your job is running.
$ squeue -u $USER
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
202232023 batch numpy ceciuser R 0:36 1 NODEID
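If you only need the node name, you can also ask squeue to print just the node list for your job (a minimal sketch using standard squeue format options; the job ID is the one from the example above):
$ squeue -h -j 202232023 -o "%N"
NODEID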
In this case, it is NODEID. Then use the top command through an ssh connection to the compute node to get the CPU usage and the process ID:
$ ssh -t NODEID top -u $USER -b -n 1
Warning: Permanently added NODEID,192.168.1.107 (ECDSA) to the list of known hosts.
top - 17:23:27 up 262 days, 2:01, 0 users, load average: 29.59, 30.05, 32.05
Tasks: 478 total, 26 running, 238 sleeping, 0 stopped, 0 zombie
%Cpu(s): 57.4 us, 0.6 sy, 0.0 ni, 41.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 26379747+total, 21854316+free, 16868692 used, 28385624 buff/cache
KiB Swap: 4194300 total, 3826884 free, 367416 used. 24502859+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11154 ceciuser 20 0 1504856 1.027g 14964 R 752.9 0.4 6:40.75 python
11233 ceciuser 20 0 172696 5092 4164 R 5.9 0.0 0:00.02 top
11130 ceciuser 20 0 113856 3668 2888 S 0.0 0.0 0:00.01 slurm_scr+
11232 ceciuser 20 0 168620 4596 3280 S 0.0 0.0 0:00.00 sshd
In this example, the CPU usage is 752.9% and the process ID (PID) is 11154. Use the PID to get the memory usage with this command:
$ ssh NODEID cat /proc/11154/status | grep VmRSS
Warning: Permanently added NODEID,192.168.1.107 (ECDSA) to the list of known hosts.
VmRSS: 1076752 kB
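Alternatively, if job accounting is enabled on the cluster, you can ask Slurm itself for the memory high-water mark of a running job with sstat (a minimal sketch; .batch denotes the batch step of the job and the available fields may vary slightly between Slurm versions):
$ sstat -j 202232023.batch -o JobID,MaxRSS,AveCPU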
The CPU efficiency is 752.9÷800×100 = 94.1% and the memory efficiency is (1076752÷1024)÷1056×100 = 99.6%.
In this example, the reservation was 8 CPUs and 1056 MB:
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=132m
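For reference, these directives would typically sit in a submission script along the lines of the following sketch (the job name, time limit, and script name are illustrative, not taken from the example above):
#!/bin/bash
#SBATCH --job-name=numpy
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=132m
#SBATCH --time=00:10:00

srun python my_script.py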
Completed jobs¶
For completed jobs, you can use the seff command to get the efficiency, using the job ID as argument.
$ seff 202232023
Job ID: 202232023
Cluster: clusername
User/Group: ceciuser/ceciuser
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 8
CPU Utilized: 00:16:25
CPU Efficiency: 76.95% of 00:21:20 core-walltime
Job Wall-clock time: 00:02:40
Memory Utilized: 1.03 GB
Memory Efficiency: 99.91% of 1.03 GB
In the job example, after completion, we get 76.95% CPU efficiency and 99.91% memory efficiency.
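These percentages can be recomputed from the reported values: the CPU efficiency is the CPU time utilized divided by the core-walltime. A quick command-line check (a minimal sketch, assuming bc is available on the login node):
$ echo "(16*60+25)/(8*(2*60+40))*100" | bc -l   # CPU Utilized / core-walltime, about 76.95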
Final remarks¶
You can also get the efficiency of a running or completed job with the sacct command.
$ sacct -j 202232023 -o "User,JobID%20,ReqMem,ReqCPUS,TotalCPU,Elapsed,MaxRSS,State"
User JobID ReqMem ReqCPUS TotalCPU Elapsed MaxRSS State
--------- -------------------- ---------- -------- ---------- ---------- ---------- ----------
ceciuser 202232023 132Mc 8 16:25.207 00:02:40 COMPLETED
202232023.batch 132Mc 8 16:25.207 00:02:40 1080420K COMPLETED
MaxRSS/(ReqMem×ReqCPUS) gives the memory efficiency based on the maximum resident set size. In this example, 1080420÷(132×8×1024)×100 = 99.91%.
TotalCPU/(Elapsed×ReqCPUS) gives the CPU efficiency. In this example, (16×60+25.207)÷((2×60+40)×8)×100 ≈ 76.95%.
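If you want to feed these numbers to a script, sacct can print them in a machine-readable form (a sketch using the --noheader and --parsable2 options; the .batch suffix selects the batch step, which carries the MaxRSS value):
$ sacct -j 202232023.batch --noheader --parsable2 -o TotalCPU,Elapsed,ReqCPUS,MaxRSS,ReqMem
The fields are then separated by | characters, one job step per line.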
Note
Slurm periodically samples the memory usage to record the “Maximum resident set size” of all tasks in the job. If your code has a short peak in memory usage, Slurm may not see it and the reported value will be underestimated.
If your memory efficiency is bad, set the requested memory a little larger than the MaxRSS. Also check whether you can estimate the memory usage from a parameter such as the grid size, the matrix size, or the size of the large chunk of data read from a file.
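For instance, a dense 10000×10000 matrix of double-precision numbers needs about 10000×10000×8 bytes ≈ 763 MiB (an illustrative figure, not part of the example job above), to which you would add some margin for the rest of the program.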
Several reasons can cause bad CPU efficiency:
Some jobs have a pre-compute or post-compute part that uses less CPU. Check whether it is possible to split the calculation into several dependent jobs.
Some jobs create multiple threads, but only some of them are in “Running” status while the others are in “Sleep” status. This may be checked periodically while the job is running with the
ssh -t NODEID top -H -n 1 -p PID
command, where PID is the process ID (see above).
$ ssh -t NODEID top -H -n 1 -p 11154
Warning: Permanently added NODEID,192.168.1.107 (ECDSA) to the list of known hosts.
top - 09:36:43 up 62 days, 22:10, 1 user, load average: 23.14, 23.09, 23.08
Threads: 4 total, 3 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 64.8 us, 0.8 sy, 0.0 ni, 28.2 id, 6.2 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 26378952+total, 18337344+free, 46629468 used, 33786600 buff/cache
KiB Swap: 4194300 total, 4111604 free, 82696 used. 21539291+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
113551 ceciuser 20 0 4826492 3.1g 2364 R 99.9 1.2 653:34.06 python
11154 ceciuser 20 0 4826492 3.1g 2364 D 99.9 1.2 1106:26 python
113552 ceciuser 20 0 4826492 3.1g 2364 R 99.9 1.2 653:18.76 python
113553 ceciuser 20 0 4826492 3.1g 2364 R 99.9 1.2 652:35.98 python
In this example, 3 threads are running and 1 is sleeping, as indicated by the Threads: line of the output.
You can reduce the number of reserved CPUs to the number of running threads.
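For the example above, this would mean lowering the reservation in the submission script accordingly, for instance (an illustrative value, to be matched to what top -H actually reports for your job):
#SBATCH --cpus-per-task=3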