NERSC: Powering Scientific Discovery Since 1974

Using Job Arrays on Carver

Job Arrays

Job arrays let you submit many jobs with a single batch submission script.  Other documentation often calls them task arrays; this page uses the term job arrays throughout.

Each job in the array receives a different value of the PBS_ARRAYID environment variable, and the batch script can use that value to control what each job does.

Job Array Example

This is an example of a job array that will run several different jobs on the Carver serial queue.

You can control the working directory, executable name, input file, output file, and other parameters of the individual jobs based on the value of $PBS_ARRAYID.

#PBS -l walltime=00:05:00
#PBS -N jacar
#PBS -o jacar.out
#PBS -e jacar.err
#PBS -q serial
#PBS -t 1-20

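# Start in the directory the job was submitted from
cd $PBS_O_WORKDIR

# Each job uses its own subdirectory, executable, input file, and output file
cd workdir.$PBS_ARRAYID
./job.$PBS_ARRAYID <input.$PBS_ARRAYID >output.$PBS_ARRAYID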

This script will run 20 jobs on the serial queue, each with a different value of $PBS_ARRAYID ranging from 1 to 20.
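As a variation on the script above, the per-job settings can be read from one shared file rather than from separate job.n and input.n files. The following is only a sketch: the file name params.txt, its one-line-per-job layout, the helper task_params, and the program my_program are all assumptions made for illustration and are not part of the Carver setup.

```shell
#!/bin/sh
# Sketch: select per-job parameters by array ID.
# params.txt is a hypothetical file holding one line of arguments per job,
# so the job with array ID n reads line n.

# Print line number $1 of file $2.
task_params() {
    sed -n "${1}p" "$2"
}

# Inside a batch script this might be used as follows
# (my_program is a hypothetical executable):
#   cd "$PBS_O_WORKDIR"
#   ./my_program $(task_params "$PBS_ARRAYID" params.txt) > "output.$PBS_ARRAYID"
```

With this layout, adding a job means adding a line to the file and widening the range given to -t.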

When you submit the job you will see a job id of this kind:

7636985[].cvrsvc09-ib

You can monitor the state of each job of the array with the qstat -t, showq, or qs commands.  The qstat command without the -t option will show only the master job ID, jobid[].

The job IDs for the individual jobs in the job array will show up as jobid[n], where n is one of the array IDs specified by the -t argument to qsub.

> qs -u mstewart
JOBID       ST  USER      NAME     NDS  REQ       USED  SUBMIT          QUEUE   RANK
7636985[1]  R   mstewart  jacar-1  1    00:05:00  -     Dec 9 12:29:25  serial
7636985[3]  R   mstewart  jacar-3  1    00:05:00  -     Dec 9 12:29:25  serial
7636985[2]  R   mstewart  jacar-2  1    00:05:00  -     Dec 9 12:29:25  serial
....

You can use commands like qhold and qdel on all the jobs of the job array by using jobid[] as the argument to the command.  For an individual job in the job array, use jobid[n] as the argument to the command, where n is the array ID of the job.
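Square brackets are also pattern characters in most shells, so it is a safe habit to quote job array identifiers on the command line. The sketch below builds the two forms described above; the helper name array_job_id is hypothetical, and the qhold and qdel calls are left as comments so that the snippet runs on a machine without a batch system.

```shell
#!/bin/sh
# Sketch: build an identifier for a whole job array or for one job in it.
#   $1 = numeric job ID, $2 = optional array ID
array_job_id() {
    if [ -n "${2:-}" ]; then
        printf '%s[%s]\n' "$1" "$2"    # one job, e.g. 7636985[3]
    else
        printf '%s[]\n' "$1"           # the whole array, e.g. 7636985[]
    fi
}

# Usage (quoted so the shell does not treat the brackets as a pattern):
#   qhold "$(array_job_id 7636985)"      # hold every job in the array
#   qdel  "$(array_job_id 7636985 3)"    # delete only the job with array ID 3
```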

The standard output and standard error for each job in the job array will go to separate files with a -n suffix, where n is the array index.

In the above batch script, the standard output for job n of the array will be written to the file jacar.out-n and the standard error to the file jacar.err-n.
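Because each job writes its own file, a quick way to spot jobs that have not yet produced output is to loop over the expected array IDs and test for each file. This is a minimal sketch assuming the prefix-n naming described above; the function name missing_outputs is hypothetical.

```shell
#!/bin/sh
# Sketch: print the array IDs in the range first..last whose
# standard output file <prefix>-<id> does not exist.
#   $1 = file prefix (e.g. jacar.out), $2 = first ID, $3 = last ID
missing_outputs() {
    prefix=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        [ -e "${prefix}-${i}" ] || echo "$i"
        i=$((i + 1))
    done
}

# Example: missing_outputs jacar.out 1 20
```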

Job Array Options

Instead of specifying a range of task IDs as the argument to the -t option, you can list individual task IDs separated by commas.  You can also combine individual task IDs and ranges in the same argument to -t, e.g.

#PBS -t 1,7,9-11

would run 5 jobs with $PBS_ARRAYID values of 1, 7, 9, 10, and 11.
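To check which array IDs a given -t argument covers before submitting, the list can be expanded by hand in the shell. The function below is only an illustration of the comma and range syntax described above, not part of the batch system; the name expand_array_spec is hypothetical.

```shell
#!/bin/sh
# Sketch: expand a -t style list such as "1,7,9-11" into individual IDs.
expand_array_spec() {
    old_ifs=$IFS
    IFS=,
    set -- $1                 # split the argument on commas
    IFS=$old_ifs
    out=""
    for part in "$@"; do
        case $part in
            *-*)              # a range such as 9-11
                i=${part%-*}
                end=${part#*-}
                while [ "$i" -le "$end" ]; do
                    out="$out $i"
                    i=$((i + 1))
                done
                ;;
            *)                # a single ID
                out="$out $part"
                ;;
        esac
    done
    echo "${out# }"
}

# Example: expand_array_spec "1,7,9-11"
```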

Another run-time option is the slot limit.  You can add an optional slot limit to the -t flag to limit the number of jobs in the job array that can run concurrently.  The default is unlimited.  The slot limit must be the last item in the array request and is separated from the array IDs by a percent sign (%).

#PBS -t 1-20%5

This would run 20 jobs, but no more than 5 of them would be running at any one time.

You can also use qalter to modify the slot limits of an array.