NERSC: Powering Scientific Discovery Since 1974

Monitoring Jobs

Monitoring Hopper Batch Jobs

See the man pages for more options.  The Job Information page has more information on current queue status, completed jobs, ALPS logs and job summary statistics.

Job Commands
Command                 Description
qsub batch_script       Submits a batch script to the queue. The output of qsub is the jobid.
qdel jobid              Deletes a job from the queue. To delete a job from the Hopper xfer queue, users must append @hopper06 to the jobid, for example: 6004861.hopper06@hopper06
qhold jobid             Puts a job on hold in the queue.
qrls jobid              Releases a job from hold.
qalter [options] jobid  Changes attributes of a submitted job. (See the notes below.)
qmove new_queue jobid   Moves a job to a new queue. The new queue must be one of the submission queues (premium, regular, or low).
qstat -a                Lists jobs in submission order (more useful than qstat without options). Also takes the -u and -f [jobid] options.
qstat -f jobid          Produces a detailed report for the job. Note: if used on the login node from which the job was submitted, jobid need only contain the numerical portion of the job id. If used on a different login node, jobid must contain the full id, such as qstat -f 100095.sdb (Hopper) or qstat -f 700432.nid00003 (Franklin).
qs                      NERSC-provided wrapper that shows jobs in priority order. Takes the -u username and -w options.
apstat                  Shows the number of up nodes and idle nodes and a list of current pending and running jobs. apstat -r displays all node reservations.
showq                   Lists jobs in priority order in three categories: active jobs, eligible jobs, and blocked jobs. showq -i lists details of all eligible jobs.
showstart jobid         Takes a jobid as its argument and displays the earliest possible start time for jobs that request the same amount of resources (nodes, walltime, memory, etc.). Caution: jobs requesting the same amount of resources get the same start time from this command, so the estimate is only accurate for the job with the highest priority among them.
checkjob jobid          Takes a jobid as its argument and displays the current job state and whether nodes are currently available to run the job.
xtnodestat [-j] [-m]    Shows the current allocation and status of the system's nodes and gives information about each running job. The output displays the position of each node in the network. With -m, prints only the mesh display; with -j, prints only the job display.
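
The two jobid forms mentioned in the table (the numerical portion versus the full id, and the @hopper06 suffix for the xfer queue) can be handled with small shell helpers. The following is only a sketch: the function names jobid_number and xfer_delete_id are hypothetical and are not NERSC-provided tools.

```shell
# Hypothetical helpers (not NERSC tools) for the jobid forms described above.

# Strip everything after the first dot, leaving the numerical portion:
# 100095.sdb -> 100095
jobid_number() {
    printf '%s\n' "${1%%.*}"
}

# Append the @hopper06 suffix that the text requires when deleting a job
# from the Hopper xfer queue: 6004861.hopper06 -> 6004861.hopper06@hopper06
xfer_delete_id() {
    printf '%s@hopper06\n' "$1"
}
```

For instance, the result of xfer_delete_id 6004861.hopper06 could be passed to qdel, and the result of jobid_number 100095.sdb to qstat -f on the login node from which the job was submitted.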

Notes:

To alter the requested resources of a queued (but not yet running) job, use the qalter command.  You can change the wallclock limit, the account to be charged, email options, the stdout/stderr paths, the total number of cores (mppwidth), and the number of cores per node (mppnppn), among other things.  See the "qsub" man page for details. There are two important restrictions: you cannot change any attribute once your job begins running, and you cannot change mppwidth in a way that moves the job across execution queue boundaries.  Usage examples:

hopper% qalter -lwalltime=new_walltime jobid
hopper% qalter -lmppwidth=new_mppwidth jobid
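
Because attributes cannot be changed once a job is running, it can help to check the job state before calling qalter. The sketch below is an illustration under stated assumptions, not a NERSC tool: the function names are hypothetical, it assumes the qstat -f report contains a line of the form "job_state = X", and it assumes that only the states Q (queued) and H (held) allow changes. It prints the qalter command rather than running it.

```shell
# Hypothetical helpers; they read `qstat -f jobid` output on standard input.

# Print the single-letter job state.
# Assumption: the report contains a line such as "    job_state = Q".
job_state_from_qstat() {
    sed -n 's/^[[:space:]]*job_state = \([A-Z]\).*$/\1/p' | head -n 1
}

# Print the qalter command for a new walltime, but only when the job is
# queued (Q) or held (H); otherwise report the state and return non-zero.
# Usage: qstat -f jobid | walltime_change_cmd jobid HH:MM:SS
walltime_change_cmd() {
    jobid=$1
    new_walltime=$2
    state=$(job_state_from_qstat)
    case $state in
        Q|H)
            printf 'qalter -lwalltime=%s %s\n' "$new_walltime" "$jobid"
            ;;
        *)
            echo "job $jobid is in state '${state:-unknown}'; not altering" >&2
            return 1
            ;;
    esac
}
```

Printing the command instead of executing it keeps the check separate from the change, so the command can be reviewed before it is run.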