Submitting PDSF Jobs
Univa Grid Engine (UGE) is the batch system used at PDSF. This is a fork of the Sun Grid Engine (SGE).
PDSF batch jobs have a 1-day wallclock limit. If your job attempts to run beyond this limit, UGE will kill it.
The total number of jobs (running, pending or otherwise) for all users is limited to 30,000, and the number of jobs a single user can have at any one time is limited to 5,000. Since PDSF is a shared facility, any jobs that are detrimental to the overall performance of the batch system may be deleted at the discretion of the PDSF staff. If this happens, you will be notified and asked to adjust your workflow as necessary.
For security reasons, UGE automatically strips the LD_LIBRARY_PATH environment variable from jobs submitted with "qsub -V". This means that if you load a module on a login node and then submit a job with the "-V" option, the job will fail with "library not found" errors. Instead, load the modules directly in the script you submit to the batch system.
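As a minimal sketch of this pattern, the snippet below writes a job script that loads its own modules instead of relying on "qsub -V". The names myjob.sh, mymodule and my_program are hypothetical placeholders, and bash is assumed as the job shell (requested with the standard "-S" option). The qsub command is only echoed so the snippet can be tried away from PDSF.

```shell
# Write a job script that sets up its own environment.
# myjob.sh, mymodule and my_program are hypothetical placeholders.
cat > myjob.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -l h_vmem=2G
# Load modules inside the job so LD_LIBRARY_PATH is set on the compute node.
module load mymodule
./my_program
EOF

# Submit without the -V option; echoed here rather than executed.
echo qsub myjob.sh
```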
If you are planning to submit thousands of short jobs concurrently, consider using job arrays where possible to reduce UGE accounting overhead. See Using job arrays.
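A minimal sketch of a job-array script follows. It relies on the standard Grid Engine "-t" option and on the SGE_TASK_ID environment variable that the batch system sets for each array task. The script name array_job.sh and the input file naming scheme (input_<N>.dat) are hypothetical, and bash is assumed as the job shell.

```shell
#!/bin/bash
#$ -S /bin/bash
# Submit as one array of 1000 tasks instead of 1000 separate jobs:
#   qsub -t 1-1000 array_job.sh

# Map a task ID to its input file (hypothetical naming scheme).
input_for_task() {
    echo "input_$1.dat"
}

# SGE_TASK_ID is set by the batch system for each array task;
# default to 1 so the script can also be tried outside the batch system.
task_id="${SGE_TASK_ID:-1}"
echo "Task ${task_id} processing $(input_for_task "${task_id}")"
```

Each task runs the same script and differs only in its task ID, so the per-task work is selected from that one value.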
|Action||How to do it||Comment|
|Submit a job||qsub script||You need to submit a script, not an executable. If you need your job to inherit all the environment variables of the submitting shell, you have to request it with the -V option. Note: your job will not inherit your LD_LIBRARY_PATH (even if you specify -V).|
|Exclude a node or nodes||qsub -l hostname=!<node>||This would exclude <node> from your jobs. You need to do this prior to job submission. See also "qrmnode -help" for more detailed information.|
|Submit a job with an IO resource requirement||qsub -l eliza<#>io=1 script||Replace <#> with the number associated with the filesystem you are using. See IO Resources for more details.|
|Submit a job that accesses the NERSC Global Scratch file system||qsub -l gscratchio=1 script||Jobs that access global scratch must use the gscratch IO resource flag. This flag makes sure your job is routed to a node that has Global Scratch mounted. Without it your job may fail.|
|Show the available IO resources and their limits||qconf -se global||The total IO resources of all running jobs cannot exceed the limits shown.|
|Submit a job to the debug queue||qsub -l debug=1 script||The debug queue has only one node and has a one hour wall clock time limit.|
|Submit a job that depends on other jobs||qsub -hold_jid [job_ID|job_name] script||Your job is held until [job_ID|job_name] has finished. Multiple job IDs/job names can only be combined with "AND".|
|Submit a job to a different project||qsub -P [project] script||By default your job runs as the project corresponding to your primary unix group. If you get a message saying you do not have access to the project you specify, you'll need to file a ticket to get added to it.|
|Get e-mail from your job upon completion||No e-mail is sent by default; add the -m option of qsub to request e-mail||See man pages for details.|
|Specify default job requirements||put them in a file called .sge_request||Put the .sge_request file in your home directory to apply to all jobs you submit or in the directory you submit jobs from to apply only to jobs submitted from that directory.|
|Set the virtual memory limit||add "-l h_vmem=2G"||Default virtual memory limit is 1.1GB and your jobs will crash if you hit the limit. Note that this is a consumable resource so when the cluster is full the more memory you specify the longer it will take to schedule your jobs.|
|Combine stdout and stderr output||add "-j y"|
|Specify how much scratch space you need||add "-l scratchfree=1G"||This would ensure that there is at least 1GB of free scratch space when your job starts.|
|Run job in another chos||add "-v CHOS=[chos]"||Runs job in a different chos (by default jobs run in the chos you're in when you submit the job)|
|Use multiple cores for your job||add "-pe single NN"||Request multiple cores for a single job. For example, if you are running a multithreaded job, set NN to the number of threads.|
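As a sketch of the .sge_request mechanism from the table above, the snippet below creates a per-directory defaults file. It assumes the standard Grid Engine request file format (one or more qsub options per line, with lines starting with "#" treated as comments). The directory name demo_jobs and the chosen option values are illustrative only.

```shell
# Create a directory whose jobs share default qsub options.
# demo_jobs is a hypothetical directory name.
mkdir -p demo_jobs
cat > demo_jobs/.sge_request <<'EOF'
# Default options for jobs submitted from this directory
-l h_vmem=2G
-j y
EOF

# Show the defaults that would apply to "qsub script" run from demo_jobs.
cat demo_jobs/.sge_request
```

Options given explicitly on the qsub command line can still be added on top of these defaults.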
Accessing File Systems
Batch jobs that will access data (either for reading or writing) on the elizas, project or global scratch must declare this when jobs are submitted. Please see the IO Resources page for more details. Jobs that access these file systems without declaring their IO resources can be deleted at the discretion of the PDSF staff. Jobs that access global scratch may fail if the "-l gscratchio=1" argument is not included. Global scratch is only mounted on the newer compute nodes; including this flag makes sure your jobs are routed to the correct nodes.
Scratch Space Usage
On the compute nodes please use $TMPDIR to utilize an area set aside for scratch space work. This points to /scratch/<jobID>.<queuename>. The amount of space available varies depending on the compute node. If you need more than ~10GB of space, please add "-l scratchfree=XXG" to your job submission line.
Please do NOT use /tmp or /scratch directly. Jobs that use /tmp may be terminated and the user's access to the batch system blocked.
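A minimal sketch of a job script that works in $TMPDIR and copies its results back is shown below. The function name run_in_scratch, the file results.txt and the 20G request are hypothetical; SGE_O_WORKDIR is the standard Grid Engine variable holding the directory the job was submitted from, and bash is assumed as the job shell.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -l scratchfree=20G

# Do the work inside the job's scratch area, then copy results to $1.
# results.txt stands in for whatever the real job produces.
run_in_scratch() {
    outdir="$1"
    # TMPDIR is set by the batch system; stop with a clear error if it is missing.
    workdir="${TMPDIR:?TMPDIR is not set}"
    ( cd "$workdir" && echo "some result" > results.txt ) || return 1
    cp "$workdir/results.txt" "$outdir/"
}

# In a real job, copy back to the submission directory:
#   run_in_scratch "$SGE_O_WORKDIR"
```

Copying results out before the job ends matters because the scratch area belongs to the job and should not be relied on afterwards.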
PDSF Batch Job Example
Here's an example of how to run a simple batch job, monitor it, check its output, and look at the UGE accounting information about it. We start with a simple script named hello.csh, which just sleeps a bit and then writes some output. Lines that start with "#$" are interpreted by the batch system; in this case we're asking to be assigned to a node with at least 2 GB of memory free.
pdsf4 72% cat hello.csh
#$ -l h_vmem=2G
echo "Hello, World"
We could have also specified the 2 GB of free memory request on the command line by saying "qsub -l h_vmem=2G hello.csh". For this example, since the request is already in the hello.csh file, we just use qsub without any options:
pdsf4 74% qsub hello.csh
Your job 1787239 ("hello.csh") has been submitted
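The job ID in this message is the value that options such as -hold_jid expect. As a sketch, the helper below extracts it from the submission message so a dependent job can be chained. The function name job_id_from_qsub and the script next.csh are hypothetical, and the follow-up qsub command is only echoed.

```shell
# Extract the job ID from the message qsub prints on submission.
job_id_from_qsub() {
    # The ID is the third whitespace-separated field of "Your job <ID> (...) ...".
    echo "$1" | awk '/^Your job / {print $3}'
}

msg='Your job 1787239 ("hello.csh") has been submitted'
jid="$(job_id_from_qsub "$msg")"

# Print the command for a dependent job; next.csh is a hypothetical script.
echo "qsub -hold_jid ${jid} next.csh"
```

On the cluster the message would come from the real command, for example msg="$(qsub hello.csh)".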
We can check on its status with qstat. Use the -u option to get only your jobs:
pdsf4 75% qstat -u pdsfuser
job-ID prior name user state submit/start at queue slots ja-task-ID
1787239 0.00000 hello.csh pdsfuser qw 12/29/2010 09:56:01 1
Here we see the job is in the qw state, which means it is queued and waiting. The priority is zero but that's just because that reporting is turned off in the batch system. If we keep monitoring it eventually we see it in the r state, which means it is running:
pdsf4 76% qstat -u pdsfuser
job-ID prior name user state submit/start at queue slots ja-task-ID
1787239 0.27362 hello.csh pdsfuser r 12/29/2010 09:58:06 email@example.com 1
From the above qstat output we can see that the job was in the qw state for just over two minutes and is now running on pc1810. Eventually the job finishes and is no longer shown in qstat:
pdsf4 80% qstat -u pdsfuser
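When monitoring from a script rather than by eye, the state column can be picked out of the qstat listing. The sketch below does this for the sample lines shown earlier; the function name job_state is hypothetical, and it assumes the state is the fifth whitespace-separated field, which holds only when the job name contains no spaces.

```shell
# Print the state column (5th field) for a given job ID from qstat output.
job_state() {
    awk -v id="$1" '$1 == id {print $5}'
}

# Sample listing copied from the qstat output above.
sample='job-ID prior name user state submit/start at queue slots ja-task-ID
1787239 0.00000 hello.csh pdsfuser qw 12/29/2010 09:56:01 1'

echo "$sample" | job_state 1787239
```

On the cluster the input would be the live listing, for example "qstat -u pdsfuser | job_state 1787239". An empty result means the job is no longer listed.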
If we look in the directory where we submitted the job we see that output files were created:
pdsf4 81% ls -l
-rwxr-xr-x 1 pdsfuser rhstar 41 Dec 29 09:50 hello.csh
-rw-r--r-- 1 pdsfuser rhstar 0 Dec 29 09:58 hello.csh.e1787239
-rw-r--r-- 1 pdsfuser rhstar 13 Dec 29 10:08 hello.csh.o1787239
The file ending with e<job-ID> is the stderr and the file ending with o<job-ID> is the stdout. You can have stdout and stderr go into one file by specifying "-j y" in your qsub command. The stderr is empty in this example and the stdout is as expected:
pdsf4 83% cat hello.csh.o1787239
Hello, World
It's often useful to look at the UGE accounting information about your jobs with qacct:
pdsf4 84% qacct -o pdsfuser -j 1787239
qsub_time Wed Dec 29 09:56:01 2010
start_time Wed Dec 29 09:58:07 2010
end_time Wed Dec 29 10:08:09 2010
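The three timestamps give the queue wait and the run time of the job. As a sketch, the snippet below computes both from the values shown above. It assumes GNU date (for the -d option), and the function name seconds_between is hypothetical.

```shell
# Seconds between two timestamps in the format qacct prints (GNU date assumed).
seconds_between() {
    echo $(( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ))
}

# Values copied from the qacct output above.
qsub_time='Wed Dec 29 09:56:01 2010'
start_time='Wed Dec 29 09:58:07 2010'
end_time='Wed Dec 29 10:08:09 2010'

echo "wait: $(seconds_between "$qsub_time" "$start_time") s"
echo "run: $(seconds_between "$start_time" "$end_time") s"
```

For this job the wait is 126 seconds and the run time is 602 seconds, which matches the two minutes spent in the qw state.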