National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory

Historical: Class Info and Policies for Seaborg
Seaborg Decommissioned January 2008

Classes and Job Scheduling

All LoadLeveler jobs must be submitted to a valid submit class. If you name a class that does not exist, no error message is issued: the job is accepted but sits in the queue indefinitely. If this happens, delete the job using the llcancel command and resubmit it to an available class.

NERSC users specify one of the following submit classes to queue jobs. Upon submission, the job is routed to the appropriate LoadLeveler class according to the following criteria. (Users cannot directly access the LoadLeveler classes.)

Submit Class Job Type    LL Class (1)  Nodes        Max Wallclock  Relative Priority (3)  Class Charge Factor
interactive  parallel    interactive   1-8          30 mins        1                      1
debug        parallel    debug         1-24         30 mins        2                      1
premium      parallel    premium       1-299        24 hrs         3                      2
regular      parallel    reg_1         1-31         12 hrs         5                      1
             parallel    reg_1l        1-31         24 hrs         5                      1
             parallel    reg_32        32-47        48 hrs         5                      1
             parallel    reg_48        48-255       48 hrs         4                      .5 (see note 4)
             parallel    reg_256       256-299      48 hrs         4                      .5 (see note 4)
             parallel    reg_300 (6)   300-ALL      48 hrs         see note 6             .5 (see note 4)
             serial (2)  reg_s         1 processor  12 hrs         see note 5             see note 7
low          parallel    low           1-299        12 hrs         6                      .5
xfer         serial      xfer          1 processor  4 hrs          see note 5             see note 7

Notes

1 - Jobs must be submitted to the "Submit Class," not the "LL Class".

2 - Serial jobs must be submitted with a job_type of "serial" and a class of "regular". See Running Serial Jobs with LoadLeveler for more information on the serial job classes.

3 - The priorities listed in the table are relative. NERSC assigns priorities in terms of "equivalent days waiting in the queue". For example, a premium job on Seaborg receives a two-day boost over the largest regular class jobs. You can view typical class and queue wait times on the job stats webpage. In addition to the relative priority given to jobs depending on their LoadLeveler class, users with large allocations who need a certain turnaround in order to use their allocation may be eligible to receive a scheduling "boost".

4 - Jobs that run in the reg_48, reg_256, and reg_300 classes are charged at 50% of the regular rate.

5 - The transfer (xfer) class runs on a dedicated node, and thus does not compete with the other classes.

6 - The reg_300 class will usually have a run limit of zero. NERSC staff will monitor this class and make special arrangements to run jobs of this size.

7 - The serial and xfer jobs are only charged for running on 1 processor, not the entire node.

See also Queue Policies for information on run limits and Using large-memory nodes for information on specifying memory requirements.
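
The routing criteria in the table can be sketched as a small lookup. This is only an illustration of the table, not NERSC's actual routing code; the function name route_to_ll_class and its parameters are hypothetical. Unlike LoadLeveler, which accepts a job in a nonexistent class without complaint, this sketch raises an error so that a mistyped class is caught before submission.

```python
def route_to_ll_class(submit_class, job_type="parallel", nodes=1, wall_hours=0.5):
    """Return the LL class a job would land in, following the table above.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, not part of LoadLeveler or any NERSC tool.
    """
    if submit_class == "regular":
        if job_type == "serial":
            return "reg_s"
        if nodes <= 31:
            # For 1-31 nodes, jobs asking for more than 12 hours go to reg_1l.
            return "reg_1" if wall_hours <= 12 else "reg_1l"
        if nodes <= 47:
            return "reg_32"
        if nodes <= 255:
            return "reg_48"
        if nodes <= 299:
            return "reg_256"
        return "reg_300"
    if submit_class in ("interactive", "debug", "premium", "low", "xfer"):
        # These submit classes map to an LL class of the same name.
        return submit_class
    raise ValueError(f"unknown submit class: {submit_class!r}")


if __name__ == "__main__":
    print(route_to_ll_class("regular", nodes=16, wall_hours=24))
    print(route_to_ll_class("regular", nodes=64))
```

Checking the class name locally in this way avoids the situation described at the top of this page, where a job submitted to a nonexistent class waits in the queue indefinitely.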


You can use the llclass command on the system to obtain information about the LoadLeveler classes. Detailed information about a single LoadLeveler class can be found using llclass -l classname.

If you request more wall clock time than allowed by the class (as indicated by the Max Wallclock column in the table above), your job will be submitted with the wall clock time adjusted to the maximum allowed. If you omit the requested time, a default of 30 minutes will be used.
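
The adjustment described above amounts to clamping the request to the class maximum and falling back to 30 minutes when no time is given. A minimal sketch, assuming the limits from the Max Wallclock column expressed in minutes; the dictionary and function names are hypothetical.

```python
# Max Wallclock per LL class, in minutes, copied from the table above.
MAX_WALLCLOCK_MINUTES = {
    "interactive": 30, "debug": 30, "premium": 24 * 60,
    "reg_1": 12 * 60, "reg_1l": 24 * 60, "reg_32": 48 * 60,
    "reg_48": 48 * 60, "reg_256": 48 * 60, "reg_300": 48 * 60,
    "reg_s": 12 * 60, "low": 12 * 60, "xfer": 4 * 60,
}
DEFAULT_MINUTES = 30  # used when the job requests no wall clock time


def effective_wallclock(ll_class, requested_minutes=None):
    """Return the wall clock limit (minutes) the job would actually run with."""
    if requested_minutes is None:
        return DEFAULT_MINUTES
    # Requests above the class maximum are adjusted down to the maximum.
    return min(requested_minutes, MAX_WALLCLOCK_MINUTES[ll_class])


if __name__ == "__main__":
    print(effective_wallclock("debug", 60))
    print(effective_wallclock("premium"))
```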

You must explicitly specify the number of nodes with the stanza "#@ node = "; otherwise your job will be run on one node.

Your job will be charged and scheduled according to the class it runs in, using the priority and charge factor listed for that class in the table above. Both interactive and debug are charged at the regular rate. See Accounting on the SP.

The classes are configured to give the best service to premium and regular jobs. Users are expected to use the regular class for at least 80 percent of their jobs. Premium jobs are charged at twice the rate of regular jobs, but are scheduled at a higher priority.
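
To compare what the same run would cost in different classes, the class charge factors from the table can be applied to a job's node-hours. This is a sketch under a stated assumption: that the charge scales with nodes times wall clock hours. The actual accounting formula is described in Accounting on the SP and is not reproduced here, so the result is a relative figure only. The serial and xfer classes are left out because they are charged per processor rather than per node. All names are hypothetical.

```python
# Class charge factors copied from the table above.
CLASS_CHARGE_FACTOR = {
    "interactive": 1.0, "debug": 1.0, "premium": 2.0,
    "reg_1": 1.0, "reg_1l": 1.0, "reg_32": 1.0,
    "reg_48": 0.5, "reg_256": 0.5, "reg_300": 0.5,
    "low": 0.5,
}


def relative_charge(ll_class, nodes, wall_hours):
    """Relative cost in regular-rate node-hours.

    ASSUMPTION: cost is proportional to nodes * wall_hours; this is an
    illustration, not the NERSC accounting formula.
    """
    return CLASS_CHARGE_FACTOR[ll_class] * nodes * wall_hours


if __name__ == "__main__":
    # The same 40-node, 2-hour run in three different classes.
    for ll_class in ("premium", "reg_32", "low"):
        print(ll_class, relative_charge(ll_class, 40, 2))
```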

LoadLeveler uses a scheduling technique called "backfilling". This method starts smaller, shorter jobs if they will not affect the start time of the job that is scheduled to begin next. This scheduling technique is advantageous from both a user and a system perspective: it allows faster turnaround for shorter jobs, and it maximizes system usage.
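
The idea behind backfilling can be illustrated with a deliberately simplified test: a waiting job may start now if it fits in the currently free nodes and its requested wall clock time ends before the next scheduled job is due to start. This is a conservative sketch and not LoadLeveler's actual algorithm, which may also consider nodes that the next job does not need. The Job type and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int
    wall_hours: float  # requested wall clock time


def pick_backfill_jobs(waiting, free_nodes, hours_until_next_start):
    """Return the waiting jobs that can start now without delaying the
    job that is scheduled to begin next (simplified rule)."""
    started = []
    for job in waiting:
        fits = job.nodes <= free_nodes
        finishes_in_time = job.wall_hours <= hours_until_next_start
        if fits and finishes_in_time:
            started.append(job)
            free_nodes -= job.nodes  # these nodes are now occupied
    return started


if __name__ == "__main__":
    queue = [Job("a", 8, 1), Job("b", 4, 6), Job("c", 4, 2), Job("d", 8, 1)]
    chosen = pick_backfill_jobs(queue, free_nodes=12, hours_until_next_start=3)
    print([job.name for job in chosen])
```

Because the decision depends on the requested wall clock time, jobs that request only the time they need are more likely to be backfilled.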

Running Serial Jobs with LoadLeveler

There are two LoadLeveler serial classes on Seaborg: serial and xfer. Jobs run in these classes are charged only for one processor's actual wall clock time. These classes are designed for:

  • preprocessing data needed by large parallel runs
  • postprocessing data produced by large parallel runs
  • transferring data between Seaborg, HPSS and NERSC servers (such as the visualization server)
  • transferring data between the user's home site and NERSC.

These two classes typically have very short queue wait times, usually less than 10 minutes, so if you submit a job to one of these queues as the last statement of a large parallel job, you can expect the serial job to start almost immediately.

These two classes are not designed for single-node multithreaded jobs (e.g., OpenMP). Jobs of this type should be submitted with these LoadLeveler keywords:

#@ job_type = parallel
#@ node = 1

Serial Job Class

This class is for serial preprocessing and postprocessing of data. Jobs in this class share a single 16 GB node, with a run limit of 16 jobs sharing the node at the same time. The class has wall clock and CPU limits of 12 hours and a memory limit of 1 GB.

You can make use of the serial class by specifying the following LoadLeveler keywords:
#@ job_type = serial
#@ class = regular

xfer Job Class

This class is intended specifically for data transfer jobs. The run limit for the class is 8 jobs, which share a node specially configured for good network access to HPSS. The class has wall clock and CPU limits of 4 hours and a memory limit of 1 GB. (It takes approximately 3 hours to transfer 1 terabyte of data from Seaborg to HPSS.)
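
The approximate rate quoted above (about 3 hours per terabyte) and the 4-hour wall clock limit together bound how much data a single xfer job can be expected to move. A rough sketch of that arithmetic follows; it assumes transfer time scales linearly with data size, and actual rates will vary. The names are hypothetical.

```python
HOURS_PER_TERABYTE = 3.0          # approximate figure quoted above
XFER_WALLCLOCK_LIMIT_HOURS = 4.0  # wall clock limit of the xfer class


def estimated_transfer_hours(terabytes):
    """Estimate the transfer time, assuming time scales linearly with size."""
    return terabytes * HOURS_PER_TERABYTE


def fits_in_one_xfer_job(terabytes):
    """True if the estimated transfer finishes within the xfer class limit."""
    return estimated_transfer_hours(terabytes) <= XFER_WALLCLOCK_LIMIT_HOURS


if __name__ == "__main__":
    for size_tb in (0.5, 1.0, 1.5):
        print(size_tb, estimated_transfer_hours(size_tb), fits_in_one_xfer_job(size_tb))
```

By this estimate, a transfer much larger than about 1.3 terabytes would need to be split across more than one xfer job.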

Note that you can submit an xfer job from within your parallel batch script. The example below shows how a parallel job may submit an xfer job to move files. The serial data transfer job is initiated when the XCPU signal is sent (soft_limit) or the parallel computational program completes.

This class is specifically intended only for data transfer and not for any other type of serial job. There is a discussion with examples of using HPSS in a batch job at Accessing HPSS - Batch Jobs.

You can make use of the xfer class by specifying the following LoadLeveler keywords:

#@ job_type = serial
#@ class = xfer
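
For scripts that generate job command files, the three keyword combinations given in this section can be kept in one place. The sketch below only reproduces the keyword lines shown on this page; a complete job command file needs further keywords that are not covered here. The function name job_keywords and the labels "serial", "xfer", and "threaded" are hypothetical.

```python
def job_keywords(kind):
    """Return the LoadLeveler keyword lines given on this page for a job kind.

    kind is a hypothetical label: "serial", "xfer", or "threaded"
    (a single-node multithreaded job such as OpenMP).
    """
    settings = {
        "serial": [("job_type", "serial"), ("class", "regular")],
        "xfer": [("job_type", "serial"), ("class", "xfer")],
        "threaded": [("job_type", "parallel"), ("node", "1")],
    }
    if kind not in settings:
        raise ValueError(f"unknown job kind: {kind!r}")
    return [f"#@ {key} = {value}" for key, value in settings[kind]]


if __name__ == "__main__":
    for kind in ("serial", "xfer", "threaded"):
        print("\n".join(job_keywords(kind)))
        print()
```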

Interactive/Debug Schedule

  • From 5 AM to 6 PM PST, 16 nodes from the compute pool are reserved for debug and interactive use only. This should allow good turnaround time for debug and interactive work. Note that due to limitations in LoadLeveler, these nodes cannot be guaranteed to be available at 5 AM; NERSC will make its best effort to meet the 5 AM availability.
  • Independent of these reserved nodes, interactive and debug jobs may run on any nodes throughout the cluster as they become available.

NERSC Queue Policies for Seaborg

  • For the production batch classes, each user may have:
    • 3 jobs running (this parameter can be adjusted depending on system load).
    • 4 jobs in Idle state (jobs queued to run; this parameter can be adjusted depending on system load).
    If you have 4 jobs queued (in Idle state) and need to run an Interactive or Debug job, place one of your jobs on User Hold: llhold jobid. To requeue the job: llhold -r jobid.
  • The combined number of debug and interactive jobs that a user may have submitted or running at a given time must be two or fewer. Note that this policy only applies to jobs run in the interactive and debug batch classes. This includes parallel jobs (anything compiled with one of the "mp" compilers, e.g. mpxlf90) that are executed from the command line, as well as those jobs (parallel or otherwise) that are explicitly submitted to these two classes with the "llsubmit" command. The policy has NO effect on sequential programs executed from the command line, including all the normal Unix commands.
  • The class run limit for reg_1l (regular jobs using 1 to 31 nodes and requesting more than 12 wall hours) is 15 jobs running. There are no other class run limits.
  • Any job that has been in the queue for 7 days or more, and is in the "user hold" (LoadLeveler status HU) state, will be removed from the system. Note that this means:
    • Jobs may not be held for more than 7 days; and
    • Jobs older than 7 days may not be held.
  • A 60 minute time limit is enforced on all user processes on the login nodes.
  • Seaborg is occasionally removed from service for scheduled maintenance. Users will be given seven days' notice before such events, usually in the "Message of the Day" (MOTD), which is displayed upon login and is also available online. The handling of the production workload depends on the nature of the downtime:
    • In most cases, NERSC will checkpoint all running jobs before the maintenance period, and restart them after the maintenance is complete. Jobs that cannot be checkpointed will be killed.
    • In rare cases where the maintenance involves updates to certain runtime components, restarting from checkpoint files is not possible. In this case, 48 hours before the scheduled downtime LoadLeveler will be configured to only start jobs that will complete before the downtime, based on jobs' requested wall clock time limits. As the downtime draws nearer, shorter jobs will be started. Thus for two days before a scheduled downtime, jobs will start in an order that does not necessarily reflect their submit order.
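
Two of the policies above reduce to simple checks: the removal of held jobs after 7 days, and the rule that, ahead of a downtime without checkpoint restart, only jobs whose requested wall clock time ends before the downtime are started. The sketch below restates those two rules and is not NERSC's actual tooling; the function names and parameters are hypothetical.

```python
def held_job_will_be_removed(days_in_queue, status):
    """A job is removed once it has been queued 7 days or more AND is in
    user hold (LoadLeveler status "HU")."""
    return days_in_queue >= 7 and status == "HU"


def may_start_before_downtime(requested_wall_hours, hours_until_downtime):
    """During the drain period, a job is started only if its requested
    wall clock limit ends before the downtime begins.

    The longest class limit is 48 hours, so this check only ever rejects
    jobs inside the 48-hour window described above.
    """
    return requested_wall_hours <= hours_until_downtime


if __name__ == "__main__":
    print(held_job_will_be_removed(7, "HU"))
    print(may_start_before_downtime(12, 10))
    print(may_start_before_downtime(6, 10))
```

The second check is why shorter jobs may start ahead of longer ones in the two days before such a downtime, regardless of submit order.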

Page last modified: Mon, 11 Jan 2010 21:43:20 GMT
Page URL: http://www.nersc.gov/nusers/systems/SP/old_stuff/running_jobs/classes.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov
