Batch jobs run non-interactively under the control of a "batch script," a text file containing job directives and Linux commands or utilities. Batch scripts are submitted to the "batch system," where they are queued awaiting free resources on Edison. The batch system on Edison is known as "Torque."
Bare-Bones Batch Script
The simplest Edison batch script will look something like this.
#PBS -q regular
#PBS -l mppwidth=48
#PBS -l walltime=00:10:00
cd $PBS_O_WORKDIR
aprun -n 48 ./my_executable
This example illustrates the basic parts of a script:
- Job directive lines begin with #PBS. These "Torque Directives" tell the batch system how many nodes to reserve for your job and how long to reserve those nodes. Directives can also specify things like what to name STDOUT files, what account to charge, whether to notify you by email when your job finishes, etc.
- $PBS_O_WORKDIR holds the path to the directory from which you submitted your job. While not required, most batch scripts have "cd $PBS_O_WORKDIR" as the first command after the directives.
- The aprun command is used to start execution of your code on Edison's compute nodes.
The following table lists recommended and useful Torque keywords. For an expanded list of Torque job options and keywords, see the qsub documentation, but keep in mind that it describes a generic Torque implementation; not all options and environment variables are relevant to or defined on Edison (for example, $PBS_NODEFILE is not defined).
|Required Torque Options/Directives|
|Option||Default||Description|
|-l mppwidth=nodes*cores_per_node||One node will be used.||Used to allocate nodes to your job. The number of nodes you'll get is the value of mppwidth divided by the number of cores per node (24 for Edison unless you use HyperThreading), plus 1 if there is a remainder from the division.|
|-l walltime=HH:MM:SS||00:30:00||Always specify the maximum wallclock time for your job.|
|-q queue||debug||Always specify your queue, which will usually be debug for testing and regular for production runs. See "Queues and Policies" in the left-hand menu.|
|Useful Torque Options/Directives|
|-l mppnppn=MPI_tasks_per_node||24 (Edison)||Use MPI_tasks_per_node tasks per node (Cray specific)|
|-l mppdepth=threads_per_MPI_task||1||Run threads_per_MPI_task threads per MPI task; use for OpenMP (Cray specific)|
|-N job_name||Job script name.||Job Name: up to 15 printable, non-whitespace characters.|
|-A mXXX||Your default repo||Charge this job to the NERSC repository mXXX (necessary only if you have more than one NERSC repo)|
|-e filename||<script_name>.e<job_id>||Write STDERR to filename|
|-o filename||<script_name>.o<job_id>||Write STDOUT to filename|
|-j [eo|oe]||Do not merge.||Merge STDOUT and STDERR. With oe, both are merged into standard output; with eo, both are merged into standard error.|
|-m [a|b|e|n]||a||E-mail notification options: a = send mail when job aborted by system; b = send mail when job begins; e = send mail when job ends; n = do not send mail. Options a, b, and e may be combined.|
|-S shell||Login shell||Specify shell as the scripting language to use.|
|-V||Do not export.||Export the current environment variables into the batch job environment. NOTE: this option is not recommended by NERSC; it can make it difficult to reproduce results (including diagnosing job failures).|
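The node-count rule given for mppwidth can be checked with a little shell arithmetic. This is a minimal sketch: the function name nodes_for_mppwidth is hypothetical, and the default of 24 cores per node assumes Edison without HyperThreading, as stated in the table.

```shell
#!/bin/sh
# Estimate how many nodes a given mppwidth value reserves.
# Usage: nodes_for_mppwidth MPPWIDTH [CORES_PER_NODE]
# (nodes_for_mppwidth is an illustrative helper, not a Torque or Cray command.)
nodes_for_mppwidth() {
    width=$1
    cores=${2:-24}   # assumed default: 24 cores per Edison node, no HyperThreading
    # Integer ceiling division: one extra node when there is a remainder.
    echo $(( (width + cores - 1) / cores ))
}

nodes_for_mppwidth 48    # prints 2
nodes_for_mppwidth 49    # prints 3
```

A request such as mppwidth=49 therefore reserves three full nodes, even though only one core is in use on the third.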
All options may be specified either as qsub command-line options (see below) or as #PBS directives in the batch script. Note: if you use both, command-line options override the corresponding directives in the batch script.
Note: The pvmem option is not implemented on the Cray version of Torque. Jobs with a pvmem option will be queued, but they may never run.
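To show how several of the optional directives from the table fit together, the sketch below writes a small batch script to a file. The file name example_job.pbs, the job name my_test_job, and the executable ./my_executable are placeholders chosen for illustration; the directive values are examples, not recommendations.

```shell
#!/bin/sh
# Write an example batch script to example_job.pbs (hypothetical file name).
# Quoting 'EOF' keeps $PBS_O_WORKDIR unexpanded until the job itself runs.
cat > example_job.pbs <<'EOF'
#PBS -q debug
#PBS -l mppwidth=48
#PBS -l walltime=00:10:00
#PBS -N my_test_job
#PBS -j oe
#PBS -m e

# Start in the directory the job was submitted from.
cd $PBS_O_WORKDIR

# 48 tasks fill two 24-core Edison nodes.
aprun -n 48 ./my_executable
EOF
```

Here -N sets a job name within the 15-character limit, -j oe merges STDERR into STDOUT, and -m e requests an e-mail when the job ends. The resulting file could then be submitted with qsub example_job.pbs.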
The aprun Command
All codes that execute on Edison's compute nodes must be started with the "aprun" command. Without the aprun command, the code will run (if it runs at all) on the shared MOM node that executes your batch job commands. See Using aprun.
Submitting a Batch Script
Once you have a batch script, submit it to the system using the "qsub" command. For a script named "myscript.pbs", type
edison01% qsub myscript.pbs
from the directory that contains the script file. You can specify Torque directives as options to qsub but we recommend putting your directives in the script instead. Then you will have a record of the directives you used, which is useful for record-keeping as well as debugging should something go wrong.
Choosing a Batch Queue
You can choose from several batch queues for your job. The main purpose of having different queues is to control scheduling priorities and set limits on the numbers of jobs of different sizes. Different queues may have different charge rates. This somewhat complex queue structure strives to achieve an optimal balance among fairness, wait times, run times, and DOE strategic goals.
When you submit your batch job you will usually choose one of these queues:
- regular: Use this for almost all your production runs.
- debug: Use this for small, short test runs.
Additional queues are available. See Queues and Scheduling Policies.
Standard output (STDOUT) and standard error (STDERR) messages from your job are written to temporary files in your submit directory ($PBS_O_WORKDIR) and you can monitor them there during your run if you wish. IMPORTANT: Do not alter, remove, or rename these files while the job is running or your job may fail!
After the batch job completes, these files are renamed to their final names: either the names you specified in your batch script or the Torque default naming convention (for example, jobscript.e164892 and jobscript.o164892).
Job Steps and Dependencies
Job dependencies are specified with the qsub option -W depend=dependency_list or the equivalent Torque directive #PBS -W depend=dependency_list. The most commonly used dependency_list is afterok:jobid[:jobid...], which means the newly submitted job will start only after the listed job(s) have terminated without an error. Another option is afterany:jobid[:jobid...], which means the newly submitted job will start only after the listed job(s) have terminated, with or without an error. The second option is useful for restart runs in which the first job is expected to run until it hits its wall clock limit.
Note that the job ID in the "-W depend=" argument can be given in full form (jobid@torque_server), such as 164894.edique02@edique02, or shortened to 164894.edique02 or just 164894.
For example, to run batch job2 only after batch job1 has completed, where 164894 is the job ID assigned to job1, submit job2 with one of the two dependency types:
edison02% qsub job1
edison02% qsub -W depend=afterok:164894 job2
or
edison02% qsub -W depend=afterany:164894 job2
Job steps and dependencies can be used in a workflow to prepare input data for simulation or to archive output data after a simulation. See the Job Steps and Dependencies example in Example Batch Scripts.
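The two-step submission shown above can also be scripted so that the job ID does not have to be copied by hand. The following is a minimal sketch, assuming that qsub prints the new job's ID on standard output (in a form such as 164894.edique02, which is one of the accepted formats noted above). The function name submit_chain is hypothetical.

```shell
#!/bin/sh
# Submit two batch scripts so that the second waits for the first.
# Usage: submit_chain FIRST_SCRIPT SECOND_SCRIPT [DEPENDENCY_TYPE]
# DEPENDENCY_TYPE defaults to afterok; afterany is the other type described above.
# (submit_chain is an illustrative helper, not part of Torque.)
submit_chain() {
    first_script=$1
    second_script=$2
    dep_type=${3:-afterok}

    # Assumption: qsub writes the job ID of the submitted job to standard output.
    first_id=$(qsub "$first_script") || return 1

    # Submit the second job with a dependency on the first job's ID.
    qsub -W "depend=${dep_type}:${first_id}" "$second_script"
}

# Example (requires a working qsub):
#   submit_chain job1 job2            # job2 runs only if job1 ends without error
#   submit_chain job1 job2 afterany   # job2 runs after job1 ends, error or not
```

Capturing the ID in a variable avoids typing mistakes and makes it straightforward to extend the chain to further steps.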