jobstats
Overview
The Princeton Jobstats platform provides profile and summary information for jobs on FASRC clusters, allowing for greater insight into job performance than the standard Slurm commands. We highly encourage using jobstats over the older seff command, especially since jobstats also reports GPU usage. Jobstats works for both running and completed jobs, but does not work for jobs that run for under a minute.
Command
To use jobstats run:
jobstats JOBID
You will then get a summary of your job:
[jharvard@boslogin05 ~]$ jobstats 12345678
================================================================================
Slurm Job Statistics
================================================================================
Job ID: 12345678
User/Account: jharvard/jharvard_lab
Job Name: gpu_example
State: COMPLETED
Nodes: 1
CPU Cores: 32
CPU Memory: 200GB (6.2GB per CPU-core)
GPUs: 1
QOS/Partition: normal/gpu_h200
Cluster: odyssey
Start Time: Tue Nov 25, 2025 at 10:52 AM
Run Time: 02:59:53
Time Limit: 1-00:00:00
Overall Utilization
================================================================================
CPU utilization [| 3%]
CPU memory usage [ 0%]
GPU utilization [||||||||||||||||||||||||||||||||||||||||||||||100%]
GPU memory usage [||||||||||||||| 31%]
Detailed Utilization
================================================================================
CPU utilization per node (CPU time used/run time)
holygpu8a12103: 03:00:12/3-23:56:16 (efficiency=3.1%)
CPU memory usage per node - used/allocated
holygpu8a12103: 431.3MB/200GB (13.5MB/6.2GB per core of 32)
GPU utilization per node
holygpu8a12103 (GPU 1): 100%
GPU memory usage per node - maximum used/total
holygpu8a12103 (GPU 1): 44.0GB/140.4GB (31.3%)
Notes
================================================================================
* The max Memory utilization of this job is 0%. This value is low compared
to the target range of 80% and above. Please investigate the reason for
the low efficiency. For more info:
https://docs.rc.fas.harvard.edu/kb/job-efficiency-and-optimization-best-practices/#Memory
* Have a nice day!
The summary gives you an overview of your job's performance, including a breakdown per node. In addition, the command flags underperformance in red and points you to relevant documentation you can use to improve your job efficiency. For example, the user asked for 200GB of memory but used less than 1GB, so in future runs they should request about 1GB instead. Other items not flagged but worth adjusting are dropping the core count to a single core and reducing the requested time to 4 hours instead of a day. These changes would allow the job to run more efficiently, lowering its impact on fairshare and freeing resources for other users.
Note that for CPU utilization, the run time in the CPU time used/run time ratio is the wall-clock time multiplied by the number of cores. In an ideal run your CPURuntime = NCPUS * Elapsed (wall-clock time). In this case the job ran for almost 3 hours on 32 cores, which gives about 4 days of CPURuntime, but it only actually used about 3 hours of CPU time across all its cores, meaning it effectively used only 1 core. Hence in future runs you would only want to ask for one core, or figure out why the code is not parallelizing.
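For reference, a resubmission of this job could request roughly what the job actually used. The directives below are only a sketch: the partition name is taken from the output above, and the core, memory, and time values are assumptions based on this particular run; adjust them for your own code.
#!/bin/bash
#SBATCH --job-name=gpu_example
#SBATCH --partition=gpu_h200   # same partition as the original run
#SBATCH --gres=gpu:1           # GPU utilization was 100%, so keep 1 GPU
#SBATCH --cpus-per-task=1      # only ~1 core's worth of CPU time was used
#SBATCH --mem=1G               # job used ~431MB of the 200GB requested
#SBATCH --time=04:00:00        # job ran just under 3 hours

# your application commands go here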
Jobstats Dashboard
To see a profile for a job you can use the Single Job Stats Dashboard (note: you need to be on the FASRC VPN to access it). Fill in your JobID and select which cluster you are using (note: for the Cannon cluster select “odyssey”, which is the old name for the cluster). Then select the time range when your job ran to see the profile. You can also focus on specific nodes if you want to see their individual profiles.
Jobstats Emails
Slurm will put the results of jobstats into your completion emails. To subscribe, add --mail-type=END (or any set of options that includes END) to your submission script. By default, email is sent to the address you have listed with us.
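For example, your submission script could include directives along these lines (the address shown is only a placeholder; if you omit --mail-user, the email goes to the address you have listed with us):
#SBATCH --mail-type=END                        # send the jobstats summary when the job ends
#SBATCH --mail-user=jharvard@fas.harvard.edu   # optional: placeholder address, overrides the default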
What should my job utilization be?
You can find target utilization for CPU, memory, and GPU in the jobstats output. If your job underutilized resources, the “Notes” section will show the target range for each resource (CPU, memory, and GPU). See this CPU job:
[jharvard@holylogin07 ~]$ jobstats 49081039
================================================================================
Slurm Job Statistics
================================================================================
Job ID: 49081039
User/Account: jharvard/jharvard_lab
Job Name: .fasrcood/sys/dashboard/sys/RemoteDesktop
State: TIMEOUT
Nodes: 1
CPU Cores: 4
CPU Memory: 24GB (6GB per CPU-core)
QOS/Partition: normal/test
Cluster: odyssey
Start Time: Fri Dec 5, 2025 at 8:45 AM
Run Time: 01:00:12
Time Limit: 01:00:00
Overall Utilization
================================================================================
CPU utilization [ 0%]
CPU memory usage [| 2%]
Detailed Utilization
================================================================================
CPU utilization per node (CPU time used/run time)
holy8a24102: 00:00:39/04:00:48 (efficiency=0.3%)
CPU memory usage per node - used/allocated
holy8a24102: 591.5MB/24GB (147.9MB/6GB per core of 4)
Notes
================================================================================
* The overall CPU utilization of this job is 0.3%. This value is low
compared to the target range of 90% and above. Please investigate the
reason for the low efficiency. For instance, have you conducted a scaling
analysis? For more info:
https://docs.rc.fas.harvard.edu/kb/job-efficiency-and-optimization-best-practices/#Cores
* The max Memory utilization of this job is 2%. This value is low compared
to the target range of 80% and above. Please investigate the reason for
the low efficiency. For more info:
https://docs.rc.fas.harvard.edu/kb/job-efficiency-and-optimization-best-practices/#Memory
* This job failed because it exceeded the time limit. If there are no other
problems then the solution is to increase the value of the --time Slurm
directive and resubmit the job. For more info:
https://docs.rc.fas.harvard.edu/kb/job-efficiency-and-optimization-best-practices/#Time
* Have a nice day!
The “Notes” section explains what was underutilized and the target range.
In the “Detailed Utilization” section, you can see values per core (and per node for a multi-node job). In this case, jharvard could have requested fewer cores and less memory. The job used ~600MB out of the 24GB requested. Instead, jharvard should have requested 750MB of memory (80% of 750MB = 600MB). In terms of cores, jharvard should have requested 1 or 2 cores (given that this was an interactive job on Open OnDemand, 2 cores are recommended).
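Outside of Open OnDemand, a right-sized interactive request following that advice could look something like the line below. This is only a sketch: the partition comes from the output above, and the memory and time values are assumptions based on this run (note the longer time limit, since the original job hit its one-hour limit):
salloc --partition=test --cpus-per-task=2 --mem=750M --time=02:00:00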

