Queues and Policies
Jobs must be submitted to a valid submit queue. Upon submission, the job is routed to the appropriate execution queue. Users cannot directly access the execution queues.
- If a user reaches the run limit in a queue, their eligible limit becomes 0. That is, once at the run limit, no additional jobs will be considered for scheduling for that user in that queue. Jobs that are not eligible for scheduling will remain in the "blocked" state until the user drops below the run limit.
- The debug AND interactive queues are to be used for code development, testing, and debugging ONLY. Production runs are strictly prohibited from using the debug and/or interactive queues. User accounts are subject to suspension if they are determined to be using the debug and/or interactive queues for production computing. In particular, job "chaining" in the debug queue is not allowed. Chaining is defined as using a batch script to submit another batch script.
- The scavenger queue is available only to users with a zero or negative balance in one of their repositories. This applies to both total repository balances and per-user balances. The queue is not available for jobs submitted against a repository with a positive balance. If a user has multiple repositories, they should add the line "#PBS -A <repo>" to the jobscript in order to specify the repository against which a job is to be charged.
- There are 8 nodes reserved from 5am to 6pm Pacific Time, Mon-Fri, for interactive and debug jobs.
- There is a limit of 500 submitted jobs per user, per queue.
- There is a limit of 165 running jobs per user across the entire system; note that the serial queue is exempt from this limit.
- Jobs submitted to the reg_xlmem queue must specify a memory requirement. The requested memory must be at least 16 GB per task, and no more than 960 GB for the entire job. See Carver Memory Considerations for examples.
- Jobs running in the reg_long and/or reg_xlong queues can make it difficult to schedule system maintenance activities. While NERSC will make every effort to allow jobs to complete prior to maintenance events, there may be times when jobs in reg_long and reg_xlong must be terminated in order to perform critical maintenance.
- Any job that has been in the queue for 14 days or more, and is in the "user hold" state, will be removed from the system. Note that this means:
  - Jobs may not be held for more than 14 days; and
  - Jobs older than 14 days may not be held.
- See Carver Memory Considerations for additional information on requesting large-memory (48 GB) and extra-large-memory (1 TB) nodes.
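
Two of the rules above are simple arithmetic checks: the reg_xlmem memory bounds and the 14-day rule for held jobs. The following is a minimal sketch of how a user might check a planned job against them before submitting. It is not a NERSC tool. The function names, the "user hold" state string, and the assumption that a job's total memory equals tasks times per-task memory are all illustrative and are not taken from the policy text.

```python
from datetime import datetime, timedelta
from typing import List

# Limits quoted from the policy list above.
XLMEM_MIN_GB_PER_TASK = 16
XLMEM_MAX_GB_PER_JOB = 960
HOLD_REMOVAL_AGE = timedelta(days=14)


def xlmem_request_problems(tasks: int, gb_per_task: float) -> List[str]:
    """Return the policy problems with a reg_xlmem memory request (empty if none).

    Assumption: total job memory is tasks * gb_per_task.
    """
    if tasks < 1:
        return ["a job needs at least one task"]
    problems = []
    if gb_per_task < XLMEM_MIN_GB_PER_TASK:
        problems.append(
            f"{gb_per_task} GB per task is below the {XLMEM_MIN_GB_PER_TASK} GB minimum"
        )
    total_gb = tasks * gb_per_task
    if total_gb > XLMEM_MAX_GB_PER_JOB:
        problems.append(
            f"{total_gb} GB total exceeds the {XLMEM_MAX_GB_PER_JOB} GB job maximum"
        )
    return problems


def is_subject_to_removal(submitted: datetime, now: datetime, state: str) -> bool:
    """True if a job is in the (hypothetical) "user hold" state and is 14 or more days old."""
    return state == "user hold" and (now - submitted) >= HOLD_REMOVAL_AGE


if __name__ == "__main__":
    # 60 tasks at 16 GB each is exactly 960 GB, so no problems are reported.
    print(xlmem_request_problems(60, 16))
    # 61 tasks at 16 GB each is 976 GB, which exceeds the job maximum.
    print(xlmem_request_problems(61, 16))
    print(is_subject_to_removal(datetime(2013, 1, 1), datetime(2013, 1, 15), "user hold"))
```

Under these assumptions, a request passes only when both bounds hold at once, so the largest job at the 16 GB per-task minimum would be 60 tasks. How the scheduler itself counts tasks and memory may differ; Carver Memory Considerations remains the authoritative reference.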