Monitoring Jobs
Monitoring Edison Batch Jobs
The batch system provides commands to monitor your jobs. The table below lists the commands commonly used to submit and monitor jobs. For more information, refer to the man pages of these commands.
Job Commands

| Command | Description |
|---|---|
| qsub batch_script | Submits a batch script to the queue. The output of qsub is a jobid. |
| qdel jobid | Deletes a job from the queue. |
| qhold jobid | Puts a job on hold in the queue. |
| qrls jobid | Releases a job from hold. |
| qalter [options] jobid | Change attributes of submitted job. (See below.) |
| qmove new_queue jobid | Moves a job to a new queue. The new queue must be one of the submission queues (premium, regular, or low). |
| qstat -a | Lists jobs in submission order (more useful than qstat without options). Also takes the -u and -f [jobid] options. |
| qstat -f jobid | Produce a detailed report for the job. |
| qs | NERSC-provided wrapper that shows jobs in priority order. Takes -u username and -w options. |
| apstat | Shows the number of up nodes and idle nodes, and a list of current pending and running jobs. apstat -r displays all node reservations. |
| showq | Lists jobs in priority order in three categories: active jobs, eligible jobs, and blocked jobs. showq -i lists details of all eligible jobs. |
| showstart jobid | Displays the earliest possible start time for a job requesting the same amount of resources (nodes, walltime, memory, etc.) as the given job. Caution: jobs requesting the same amount of resources all get the same start time from this command, so the estimate is only accurate for the job with the highest priority among them. |
| checkjob jobid | Displays the current job state and whether nodes are currently available to run the job. |
| xtnodestat [-j] [-m] | Shows the current allocation and status of the system's nodes and gives information about each running job. The output displays the position of each node in the network. With -m, prints only the mesh display; with -j, prints only the job display. |
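
The commands in the table are often used together: submit a script, keep the jobid that qsub prints, then query the job. The following is a minimal sketch of that sequence, not a NERSC-provided tool. The function name `submit_and_check` is hypothetical, and the sketch assumes that qsub writes only the jobid to standard output, as described in the table.

```shell
#!/bin/bash
# Sketch only: submit_and_check is a hypothetical helper name.
# Assumes qsub prints just the jobid on stdout when submission succeeds.
submit_and_check() {
    local script="$1"
    local jobid

    # Submit the batch script and capture the jobid that qsub prints.
    jobid=$(qsub "$script") || { echo "qsub failed for $script" >&2; return 1; }
    echo "Submitted job: $jobid"

    # List the current user's jobs in submission order.
    qstat -a -u "${USER:-$(id -un)}"

    # Show the job state and whether nodes are available to run it.
    checkjob "$jobid"
}
```

On a login node this could be called as `submit_and_check my_batch_script`, where `my_batch_script` stands in for your own batch script.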
Notes:
To alter the requested resources of a queued (but not yet running) job, use the qalter command. You can change the wallclock limit, the account to be charged, email options, the stdout/stderr paths, the total number of cores needed, and the number of cores per node (mppnppn), among other things. See the "qsub" man page for details. Two important restrictions apply: you cannot change any attributes once your job begins running, and you cannot change mppwidth such that the job would move across execution queue boundaries. Usage examples:
    edison02% qalter -lwalltime=new_walltime jobid
    edison02% qalter -lmppwidth=new_mppwidth jobid
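
Because attributes cannot be changed once a job is running, it can help to check the job state before calling qalter. The sketch below is one possible way to do that and is not part of the NERSC tooling. The function name `alter_walltime_if_queued` is hypothetical, and the sketch assumes that `qstat -f jobid` reports the state on a line of the form `job_state = R` for a running job; verify this against the output on your system.

```shell
#!/bin/bash
# Sketch only: alter_walltime_if_queued is a hypothetical helper name.
# Assumes "qstat -f jobid" prints a line such as "    job_state = Q".
alter_walltime_if_queued() {
    local jobid="$1"
    local new_walltime="$2"
    local state

    # Extract the value after " = " on the job_state line of the detailed report.
    state=$(qstat -f "$jobid" | awk -F' = ' '/job_state/ {print $2; exit}')

    # A running job (state R) can no longer be altered.
    if [ "$state" = "R" ]; then
        echo "Job $jobid is already running; its attributes cannot be changed." >&2
        return 1
    fi

    # Same form as the usage example above.
    qalter -lwalltime="$new_walltime" "$jobid"
}
```

The same pattern applies to mppwidth, subject to the restriction that the new value must not move the job across execution queue boundaries.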


