Submitting Jobs
Submitting your job
If you are submitting your job on Genepool or Phoebe, you do NOT need to source any batch settings: the batch environment is loaded into your path by default. If qsub is not working properly, check that the uge module is loaded:
module load uge
If you are submitting a job from an external submit host, you need to source the appropriate settings.sh file for Genepool or Phoebe:
source /opt/uge/genepool/uge/genepool/common/settings.sh
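A minimal sketch of this setup check before submitting, where my_job.sh is a placeholder for your own job script:

```bash
# On a Genepool/Phoebe login node the batch environment is already available;
# reload the uge module only if qsub cannot be found.
if ! command -v qsub >/dev/null 2>&1; then
    module load uge
fi

# On an external submit host, source the settings file instead (path as given above).
# source /opt/uge/genepool/uge/genepool/common/settings.sh

qsub my_job.sh   # my_job.sh is a placeholder job script
```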
qsub commands and options
UGE (Univa Grid Engine) is the batch system used for Genepool/Phoebe.
| Action | How to do it | Comment |
|---|---|---|
| Submit a job | qsub script | In UGE you need to submit a script, not an executable (see the sample script below this table). |
| Specify number of processors for a threaded job | qsub -pe pe_slots 8 ... | Request 8 cores on a single node for your job. Please specify as many processors as will be needed during your job. |
| Specify number of nodes and processors for an MPI job | qsub -pe pe_8 16 ... | Request 2 nodes with 8 processors per node. pe_1, pe_2, pe_4, pe_8, pe_16, and pe_32 are available. |
| Specify memory required per processor | qsub -l ram.c=4G ... | Specify how much memory is required per processor for your job. At present this is implemented by implicitly setting h_vmem (a virtual memory limit), so you will need to account for all virtual memory needed by your application. Use of a program like memtime during benchmarking ahead of production runs may be informative. |
| Specify a time limit for your job | qsub -l h_rt=6:00:00 ... | Specifies that your job will run for at most 6 hours. The default is 12 hours. If you request more than 12 hours, your job will enter the long queue, which has far fewer dedicated resources. |
| Submit a job to the high priority queue | qsub -l high.c script | The high.c complex is for small, fast-turnaround jobs. |
| Submit a job that depends on other jobs | qsub -hold_jid [job_ID|job_name] script | The newly submitted job is accepted immediately but will not start until every job in the hold_jid list has completed (see the dependency example below this table). |
| Submit a job to a different project | qsub -P [project] script | By default your job runs as the project corresponding to your primary NERSC project repo. If qsub indicates you do not have access to the project you specify, please file a ticket to be added to it. |
| Get e-mail from your job upon completion | qsub -m e -M <email address> ... | No email by default. UGE can also email at the beginning of a job with "-m b", or upon errors with "-m a". |
| Execute the job in the current directory or specify a directory | qsub -cwd ... | By default UGE will write output relative to your home directory. Since generating output files in your home directory from Genepool can strain the home directory filesystem, please specify a working directory in your $BSCRATCH space. |
| Combine stdout and stderr output in one file | qsub -j y -o <filename> ... | By default a stdout and stderr file are written separately for your job. Specifying "-j y" will join the two (WARNING: this can often make debugging more difficult!). Specifying "-o <filename>" will redirect stdout to a filename of your choosing. |
| Send current environment to job | qsub -V ... | Send your current environment to the compute node, and execute your job with that environment, including any loaded modules. It is recommended to avoid -V, and instead have the batch script load needed modules; this improves reproducibility. |
| Send specific environment variables | add "-v <variable>[=value][,...]" | Defines or redefines the environment variable(s) to be exported to the execution context of the job |
| Specify validation level | add "-w {e|w|n|p|v}" | e[rror], w[arning], n[one], p[oke], v[erify]. Default is 'none'. |
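Putting several of these options together, here is a minimal sketch of a threaded job script; the job name, email address, module, and command are placeholders to adapt to your own work:

```bash
#!/bin/bash
#$ -N my_threaded_job            # job name (placeholder)
#$ -pe pe_slots 8                # 8 cores on a single node
#$ -l ram.c=4G                   # 4 GB of memory per slot
#$ -l h_rt=6:00:00               # 6-hour runtime limit
#$ -cwd                          # run in the submission directory
#$ -j y -o my_threaded_job.log   # join stdout and stderr into one file
#$ -m e -M user@example.com      # email on completion (placeholder address)

# Load the modules your application needs here rather than relying on -V.
# module load <your_application_module>   # placeholder

# Placeholder command; replace with your real application.
./my_program --threads 8
```

Submit it with `qsub my_threaded_job.sh` from a working directory in your $BSCRATCH space.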
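Dependencies with -hold_jid can also be chained from the command line. A minimal sketch, where step1.sh and step2.sh are placeholder scripts and -terse makes qsub print only the job ID:

```bash
# Submit the first step and capture its job ID.
JOB1=$(qsub -terse step1.sh)

# step2.sh is accepted immediately but held until JOB1 completes.
qsub -hold_jid "$JOB1" step2.sh

# Holding on a job name works as well.
qsub -N step1 step1.sh
qsub -hold_jid step1 step2.sh
```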
Resource Limits
All of the following resources are requested with the -l flag to qsub. Consumable resources are the mechanism UGE uses to ensure that fixed resources, such as memory, are scheduled properly: once all of the memory on the cluster has been allocated to running jobs, queued jobs are delayed until memory is freed. The queue a job is submitted to and the execution time requested are additional factors in determining when it runs. It is important to understand how much memory and execution time your job needs; the more time and memory you request, the longer the job will take to start.
| Action | How to do it | Comment |
|---|---|---|
| User Requestable Queue | add "-l <queue-type>.c" | User requestable queues are high.c for the high priority queue and workflow.c for the new workflow queue (implemented soon). |
| Memory usage | add "-l ram.c=10G" | Default is 5.25 GB per slot. |
| Set runtime to N hours, hard limit | add "-l h_rt=HH:MM:SS" | Specifies a hard limit on the wall-clock execution time for the job; the default is 12 hours. The job will be killed if this time is exceeded. The shorter the requested time, the more likely the job is to start sooner through the backfilling mechanism. |
| Set runtime to N hours, soft limit | add "-l s_rt=HH:MM:SS" | Specifies a soft limit on the wall-clock execution time for the job. A USR1 signal is sent to the job when the limit is reached; the signal can be trapped in your script to log necessary information (see the sketch after this table). |


