Using SGE Job Arrays
Job arrays have many advantages, including reduced load on SGE, faster job submission, and easier job management. If you find yourself submitting thousands of jobs at a time, you should use job arrays. However, the SGE documentation on them is thin, and arrays do make job submission somewhat more complicated. Below is a description of how SGE job arrays work.
Job arrays are submitted from the command line with the -t option to qsub, for example:
qsub -t 1-20:1 myjob.csh
This submits 20 identical tasks (as a single array job) with task indices 1 through 20. The ":1" is the step size: replacing it with ":n" runs only every nth index, so "1-20:2" would run tasks 1, 3, 5, ..., 19.
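To make the effect of the step size concrete, here is a minimal sketch that lists the indices a given range would produce. It is written for a POSIX shell, and task_indices is a hypothetical helper for illustration only; it is not part of SGE and does not submit anything.

```shell
# Hypothetical helper (not an SGE command): print the task indices that
# "qsub -t FIRST-LAST:STEP" would generate, one per line.
task_indices() {
    first=$1
    last=$2
    step=${3:-1}          # the step defaults to 1 when omitted
    i=$first
    while [ "$i" -le "$last" ]; do
        printf '%s\n' "$i"
        i=$((i + step))
    done
}

# Indices for a range of 1-20 with a step of 5:
task_indices 1 20 5       # prints 1, 6, 11 and 16, one per line
```

Each index printed is a value that $SGE_TASK_ID would take in one task of the array.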
If you want to use a different input file for each task, read the SGE_TASK_ID environment variable in your job script. For example, in a Perl script:
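use strict; use warnings;
# Read the list of input files, one name per line.
open(my $list, '<', 'jobFiles.list') or die "Error: $!\n";
my @labels;
while (my $name = <$list>) {
    chomp $name;    # drop the trailing newline so the name is usable as a path
    push @labels, $name;
}
close($list);
# SGE task indices start at 1; Perl array indices start at 0.
my $whichFile = $ENV{SGE_TASK_ID} - 1;
my $filename = $labels[$whichFile];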
Here "jobFiles.list" is a file containing a list of input files, one per line, and $filename is the input file for the current task. This code runs when each task executes, not at submission time, so the range given to qsub -t must match the actual number of files in your list. A task whose index is larger than the number of entries would find no file name.
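The same lookup can be sketched in shell, along with a way to derive the -t range from the list itself so the two cannot drift apart. This is a sketch under assumptions: it is written for a POSIX shell (a csh job script such as myjob.csh would need the equivalent csh syntax), and input_for_task and array_range are hypothetical helper names, not SGE commands.

```shell
# Hypothetical helper: print the input file for a task index, i.e. line N
# of the list file. Task indices start at 1, and so do sed line numbers.
input_for_task() {
    sed -n "${1}p" "$2"
}

# Hypothetical helper: build a "1-N" range from the number of lines in the
# list file. wc -l counts newline characters, so the last entry must end
# with a newline to be counted.
array_range() {
    echo "1-$(($(wc -l < "$1")))"
}

# Inside the job script, at execution time:
#   filename=$(input_for_task "$SGE_TASK_ID" jobFiles.list)
#
# On the command line, at submission time:
#   qsub -t "$(array_range jobFiles.list)" myjob.csh
```

Deriving the range from the list at submission time avoids tasks that run past the end of the list, but it assumes the list is not edited between submission and execution.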


