Using Job Arrays

Job arrays have several advantages: they reduce load on the batch system, speed up job submission, and make jobs easier to manage. If you find yourself submitting thousands of jobs at a time, you should use job arrays. However, the UGE documentation on arrays is sparse, and arrays do make job submission somewhat more complicated. The following describes how UGE job arrays work.

Job arrays can be submitted from the command line with the -t option to qsub, for example:

qsub -t 1-20:1 myjob.csh


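This submits 20 identical jobs (tasks) with indices 1 through 20. The number after the colon is the step size: replacing ":1" with ":n" runs only every nth index, counting from the first one. For example, "-t 1-20:5" creates the indices 1, 6, 11, and 16.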
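To check which indices a given range will produce before submitting, the start-end:step notation can be imitated in a few lines of shell. This is only an illustrative sketch: the helper name list_task_ids is hypothetical and is not part of UGE.

```shell
#!/bin/sh
# Hypothetical helper (not a UGE command): print the task indices that
# "qsub -t START-END:STEP" is expected to create, i.e. START, START+STEP, ...
# up to and including END.
list_task_ids() {
    i=$1
    end=$2
    step=${3:-1}          # the step defaults to 1 when omitted
    while [ "$i" -le "$end" ]; do
        echo "$i"
        i=$((i + step))
    done
}

# Indices for "-t 1-20:5"
list_task_ids 1 20 5
```

Each printed index is the value a task would see in SGE_TASK_ID.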

If you want to use a different input file for each job, reference the SGE_TASK_ID environment variable in the code your job executes, for example like this in a Perl script:

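# Read the list of input files, one per line
open(LIST, "<", "jobFiles.list") or die "Error: $!\n";
my @labels;
while (<LIST>) {
    chomp;    # strip the trailing newline from each file name
    push @labels, $_;
}
close(LIST);
# SGE_TASK_ID starts at 1, Perl array indices start at 0
my $whichFile = $ENV{SGE_TASK_ID} - 1;
my $filename = $labels[$whichFile];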


Here "jobFiles.list" is a file containing a list of input files, one per line, and $filename is the input file for the current job. This snippet runs when the job executes, not at submission. The range given to qsub -t should match the actual number of files in your list.
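The same lookup can be done directly in a shell job script. The following is a minimal sketch, assuming jobFiles.list holds one input file name per line; pick_input is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print line number $2 of the list file $1.
pick_input() {
    sed -n "${2}p" "$1"
}

# UGE sets SGE_TASK_ID for each array task. Fall back to 1 when it is
# unset so the sketch can also be tried outside the batch system.
task_id=${SGE_TASK_ID:-1}

if [ -f jobFiles.list ]; then
    filename=$(pick_input jobFiles.list "$task_id")
    echo "Task $task_id would process: $filename"
fi
```

Because line numbers in the list and task indices both start at 1, no offset is needed here, unlike the zero-based array index in the Perl version.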

If you want a separate log file for each job in the array, include this line at the top of the submitted script:

#$ -o mylog_$TASK_ID
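Putting the pieces together, an array job script might look like the sketch below. It assumes a Bourne shell script rather than csh, and the range 1-20, the log name, and the function task_label are placeholders chosen for illustration. Lines starting with "#$" are read by qsub as embedded options. Note that $TASK_ID is only expanded by the batch system inside the -o and -e paths; within the script body the index is available as $SGE_TASK_ID.

```shell
#!/bin/sh
# Embedded qsub options (ordinary comments to the shell itself):
#$ -S /bin/sh
#$ -cwd
#$ -t 1-20
#$ -o mylog_$TASK_ID
#$ -j y

# Hypothetical helper: report which array task this is.
task_label() {
    echo "array task ${SGE_TASK_ID:-unset}"
}

task_label
```

With -j y the error stream is merged into the same per-task log file, so each task writes to a single mylog_<index> file.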