Submitting PDSF Jobs
Univa Grid Engine (UGE) is the batch system used at PDSF. This is a fork of the Sun Grid Engine (SGE).
PDSF batch jobs have a one-day wallclock limit. If your job runs beyond that limit, UGE will kill it.
The total number of jobs (running, pending or otherwise) for all users is limited to 30,000, and the number of jobs a single user can have at any one time is limited to 5,000. Since PDSF is a shared facility, any jobs that are detrimental to the overall performance of the batch system may be deleted at the discretion of the PDSF staff. If this happens, you will be notified and asked to adjust your workflow as necessary.
If you are planning to submit thousands of short jobs concurrently, consider using job arrays to reduce UGE accounting overhead. See Using job arrays.
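As a rough illustration of the job-array approach, the sketch below writes a small array-job script and submits it with a single qsub call. It assumes the standard Grid Engine `-t` option and the `SGE_TASK_ID` variable; the script name `array_job.sh`, the task range, and the `input_<N>.dat` naming are hypothetical and should be adapted to your workflow.

```shell
#!/bin/bash
# Write a small array-job script. The file name, the task range and the
# input file naming are hypothetical examples.
cat > array_job.sh <<'EOF'
#!/bin/bash
# Run tasks 1 through 1000 as one array job.
#$ -t 1-1000
# Merge stderr into the stdout file.
#$ -j y
# Each task sees its own index in SGE_TASK_ID.
INPUT="input_${SGE_TASK_ID}.dat"
echo "Task ${SGE_TASK_ID} processing ${INPUT}"
EOF

# One qsub call submits all tasks. On a machine without qsub, only print
# the command so the sketch can still be tried out.
if command -v qsub >/dev/null 2>&1; then
    qsub array_job.sh
else
    echo "qsub not found; would run: qsub array_job.sh"
fi
```

To check the script logic before submitting, you can run a single task by hand, for example `SGE_TASK_ID=7 bash array_job.sh`.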
|Action||How to do it||Comment|
|Submit a job||qsub script||You need to submit a script, not an executable. If you need your job to inherit all the environment variables of the submitting shell, you have to request it with the -V option. Note: your job will not inherit your LD_LIBRARY_PATH (even if you specify -V).|
|Exclude a node or nodes||qsub -l hostname=!<node>||This would exclude <node> from your jobs. You need to do this prior to job submission. See also "qrmnode -help" for more detailed information.|
|Submit a job with an io resource requirement||qsub -l eliza<#>io=1 script||Replace <#> with the number associated with the filesystem you are using. See IO Resources for more details.|
|Show the available io resources and their limits||qconf -se global||The total io resources of all running jobs cannot exceed the limits shown.|
|Submit a job to the debug queue||qsub -l debug=1 script||The debug queue has only one node and has a one hour wall clock time limit.|
|Submit a job that depends on other jobs||qsub -hold_jid [job_ID|job_name] script||Your job is held until [job_ID|job_name] has finished. If you list several job IDs or job names, your job waits for all of them: you can only "AND" them, not "OR" them.|
|Submit a job to a different project||qsub -P [project] script||By default your job runs as the project corresponding to your primary unix group. If you get a message saying you do not have access to the project you specify, you'll need to file a ticket to get added to it.|
|Get e-mail from your job upon completion||No e-mail is sent by default; add the -m option of qsub to request e-mail||See the man pages for details.|
|Specify default job requirements||put them in a file called .sge_request||Put the .sge_request file in your home directory to apply it to all jobs you submit, or in the directory you submit jobs from to apply it only to jobs submitted from that directory.|
|Set the virtual memory limit||add "-l h_vmem=2G"||The default virtual memory limit is 1.1GB and your jobs will crash if you hit the limit. Note that this is a consumable resource: the more memory you request, the harder it is to schedule your jobs, especially when the cluster is full. Do not set it higher than what you need.|
|Combine stdout and stderr output||add "-j y"||Standard error is written to the same file as standard output.|
|Specify how much scratch space you need||add "-l scratchfree=1G"||This ensures that at least 1GB of free scratch space is available when your job starts.|
|Run job in another chos||add "-v CHOS=[chos]"||Runs the job in a different chos (the default is the chos you are in when you submit the job).|
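Several of the options in the table can be combined, either on the qsub command line or inside the job script on lines starting with `#$`. The sketch below is one possible arrangement, not a prescribed PDSF workflow: it embeds the memory, scratch and output options from the table in a script, then submits two jobs where the second waits for the first. The file name `myjob.sh`, the job names `stage1` and `stage2`, and the `run` helper are hypothetical; `-N` is the standard qsub option for naming a job.

```shell
#!/bin/bash
# Job script with embedded qsub options (lines starting with "#$").
# The file name and the payload are hypothetical.
cat > myjob.sh <<'EOF'
#!/bin/bash
# Raise the virtual memory limit above the 1.1GB default.
#$ -l h_vmem=2G
# Require at least 1GB of free scratch space when the job starts.
#$ -l scratchfree=1G
# Merge stderr into the stdout file.
#$ -j y
echo "Running on $(hostname)"
EOF

# Hypothetical helper: run the command if qsub exists, otherwise print it.
run() {
    if command -v qsub >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Submit a named job, then a second job that is held until the first
# one has finished.
run qsub -N stage1 myjob.sh
run qsub -N stage2 -hold_jid stage1 myjob.sh
```

Options given on the command line and options embedded in the script can be mixed, so the script can carry the requirements that rarely change while per-run settings such as job names stay on the command line.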