National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory
 

MPP Accounts and Charging

Introduction to MPP Hours

When a job runs on a NERSC MPP system, such as Franklin, Bassi, or Jacquard, charges accrue against one of the user's repository allocations. A parallel job is charged for exclusive use of each multi-core/CPU node allocated to the job. The unit of accounting for these systems at NERSC is "MPP Hours." The MPP charge for a given job is calculated as the product of:

  1. the job's wall-clock time in hours,
  2. the number of CPUs or cores allocated to the job (regardless of the number actually used),
  3. a machine charge factor (MCF) based on typical performance of the machine relative to the historical basis of the 375 MHz IBM Power3 architecture (MCF=1.0), and
  4. a job priority factor.

Please note that starting in Allocation Year 2009 (AY09), the Machine Charge Factor (and MPP hour) will be based on the Cray XT4 hour (based on the performance of the current quad-core node, not the previous dual-core version).

Account information is available via the NIM web interface.

How Accounts (repos) are Charged

Usage can be accrued in two different ways:

  • Wallclock (or "connect") time for jobs run on the compute nodes (or via the batch system in the case of DaVinci).
  • CPU time for serial jobs run on the login nodes (or for all interactive jobs run on DaVinci). This includes user programs as well as system commands such as ls, vi, etc.

Charges for jobs run on compute nodes (or via the batch system)

The charge against your repo for parallel (or batch) jobs is
(wallclock hours used) 
        * (machine charge factor or MCF)
        * (number of nodes allocated) 
        * (processors or CPUs or cores per node) 
        * (Class Charge Priority Factor or CPF) 

The machine charge factor is:

  • For Allocation Year 2008:
    • 3.6 for Jacquard
    • 4.0 for DaVinci
    • 6.0 for Bassi
    • 6.5 for Franklin
  • For Allocation Year 2009:
    • 0.6 for Jacquard
    • 0.6 for DaVinci
    • 1.0 for Bassi
    • 1.0 for Franklin

The class priority charge factor is:

  • 1.0 for regular, debug and interactive jobs
  • 0.5 for low priority jobs
  • 2.0 for premium priority jobs

A 32-node regular-priority job on Franklin that begins at 12:00:00 and ends 8 hours later at 20:00:00 will, at the Allocation Year 2008 MCF of 6.5 and with 2 cores per node, have a charge in MPP hours of:

	8 hours * 32 nodes * 2 Cores/node * 6.5 MCF * 1.0 priority = 3,328 MPP Hours 

Note that on Franklin a discount is applied to jobs that run on a large number of processors. See Batch Queues and Policies on Franklin.
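
The batch charge formula above can be sketched in a few lines of Python. This is only an illustration, not a NERSC tool: the function name mpp_charge and the dictionary layout are assumptions made for this example, the factors are copied from the lists above, and the Franklin large-job discount is not modeled.

```python
# Machine charge factors copied from the lists above, keyed by allocation year.
MCF = {
    2008: {"jacquard": 3.6, "davinci": 4.0, "bassi": 6.0, "franklin": 6.5},
    2009: {"jacquard": 0.6, "davinci": 0.6, "bassi": 1.0, "franklin": 1.0},
}

# Class priority charge factors copied from the list above.
CPF = {"regular": 1.0, "debug": 1.0, "interactive": 1.0, "low": 0.5, "premium": 2.0}


def mpp_charge(wall_hours, nodes, cores_per_node, machine, year, priority="regular"):
    """Return the MPP-hour charge for a batch job (hypothetical helper).

    Every core on every allocated node is charged, whether or not it is used.
    Large-job discounts are NOT modeled here.
    """
    return wall_hours * MCF[year][machine] * nodes * cores_per_node * CPF[priority]


# The Franklin example above: 8 hours, 32 nodes, 2 cores per node, AY2008.
example = mpp_charge(8, 32, 2, "franklin", 2008)
print(example)  # 3328.0
```

Changing the priority argument to "premium" or "low" doubles or halves the result, respectively.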

MPP Hours are Based on Wall-clock Time for Allocated Resources

It doesn't matter if you don't use all the CPUs or cores on a node; you are charged for all of them as long as the node is allocated to your job. It also doesn't matter if some or all of your tasks spent 1:55 hours waiting at a barrier and used only 5 minutes of CPU time. You are charged for all the wall-clock time your parallel job is resident on multiple nodes.

Charges for jobs run on the login nodes (interactive jobs on DaVinci)

The charge against your default repo is:
(CPU hours used) * (machine charge factor or MCF)
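
As a rough illustration of the login-node formula above, the sketch below converts CPU seconds to CPU hours and applies the MCF. The function name login_charge and the 90-minute session are hypothetical; the 4.0 factor is the Allocation Year 2008 MCF for DaVinci listed earlier.

```python
def login_charge(cpu_seconds, mcf):
    """Charge for serial work on a login node: CPU hours times the MCF.

    Hypothetical helper; no priority factor or core count enters this formula.
    """
    return cpu_seconds / 3600.0 * mcf


# Hypothetical session: 90 minutes of CPU time on DaVinci at the AY2008 MCF of 4.0.
davinci_charge = login_charge(90 * 60, 4.0)
print(davinci_charge)  # 6.0
```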

Class Priority Charge Factors

NERSC has implemented priority scheduling classes to give users some control over how quickly their jobs are scheduled for execution in the batch system by designating them as one of premium, regular, or low.

Interactive and Debug Jobs

Interactive and debug jobs are charged at the same rate as the Regular rate described below. They have the highest scheduling priority.

Batch Priority Scheduling Classes

Three classes of batch scheduling are available:

  • Premium
  • Regular
  • Low

A premium job is scheduled for execution before an otherwise equivalent regular job. A low job has a lower priority for scheduling. These priority classes affect how quickly a job is scheduled for execution in the "wait queues"; they do not affect the UNIX priority at which the job executes.

Charging for Priority Batch Scheduling Classes

Priority scheduling classes have different charge rates. It is intended that most users, over the year, will run most of their jobs in the regular class. Users should use the premium class with care; no additional allocation is available to cover the extra charges associated with its use.

Rates:

  • Premium scheduling at an elevated charge rate (2.0)
  • Regular scheduling at the standard charge rate (1.0)
  • Low scheduling at a reduced charge rate (0.5)
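
To make the cost of each class concrete, the following sketch compares the same job under the three rates. The job itself (4 hours on 16 quad-core nodes at an MCF of 1.0) is a made-up example, and the variable names are assumptions for illustration only.

```python
# Charge rates for the three batch priority classes, as listed above.
RATES = {"premium": 2.0, "regular": 1.0, "low": 0.5}

# Hypothetical job: 4 wall-clock hours on 16 nodes with 4 cores each, MCF 1.0.
wall_hours, nodes, cores_per_node, mcf = 4, 16, 4, 1.0
base = wall_hours * nodes * cores_per_node * mcf  # charge before the priority factor

# Charge for the same job in each priority class.
charges = {cls: base * rate for cls, rate in RATES.items()}
for cls, hours in charges.items():
    print(f"{cls:8s} {hours:6.1f} MPP hours")
```

Under these assumptions the premium job costs 512 MPP hours, the regular job 256, and the low job 128, so premium is four times as expensive as low for identical work.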

On Bassi the priority charge classes correspond to the LoadLeveler classes of the same name. Jobs in the debug and interactive LoadLeveler classes are charged at the "regular" rate. If no priority class is specified, a rate of "regular" is used.

On the Linux Opteron cluster, Jacquard, the batch, debug, and interactive classes are charged at the regular rate. Low priority charging is also available. On the SGI Altix, DaVinci, the debug and batch classes are charged at the regular rate. Premium charging is not available on these platforms.

Determining your Priority Ratio

Although our accounting system does not currently report the time spent in each priority class, the NERSC Information Management system (NIM) shows your average charge factor (Avg CF) in its Account Summary display whenever you log in (or select My Account Usage from the My Stuff pull-down menu).

Charging to a different repo

If you do not specify which account to charge, your charges will accrue to your default repo (unless that repo is out of time; see below).

On Bassi

To charge a batch job to a specific repo use the LoadLeveler keyword:

   #@ account_no = repo_name

Interactive jobs may be charged to a non-default repository by setting the environment variable LOADL_ACCOUNT_NO.

On Franklin, Jacquard and DaVinci

To charge a batch job to a specific repo use the PBS keyword:

   #PBS -A repo_name

Running Out of Time

When a user exhausts his repository allocation, NIM checks to see if the user has other repos to charge to. If the user does have other repos to charge to, he is prevented from charging to the repo where he has no time. If he has no other repos to charge to, his account is placed in a restricted state: he can no longer submit batch jobs.

Accounting records are updated once a day at about 4:00 AM Pacific Time. Repo balances are adjusted shortly afterward.

The following checks are made to determine whether the user has any valid repo to charge to and if so which repo to charge:

  1. At job submission time a filter runs to determine if the user is allowed to run batch jobs and if so what repo to charge:
    • If all the user's repos are out of time the submission fails.
    • If the user's specified repo is out of time then the submission fails.
    • If the user didn't specify a repo and the user's default repo is out of time and the user still has valid repos to charge to, then one of those valid repos will be charged. You can use NIM to see and change your default repo.
    • The repo chosen will be indicated in a message written to stdout.
  2. When the job is dispatched to run another check is made to see if the job is still allowed to run:
    • for batch jobs: does the repo determined in step 1 above still have time? If yes, the job can start running. If no, the job cannot.
    • for IBM POE jobs: does any repo have time? If yes, the job can start running; if no, the job cannot.
  3. Once a job has finished running the accounting process will make a final check on which repo to charge:
    • A batch job is charged to the repo determined in step 1 above.
    • An interactive (including IBM POE) job is charged to your default repo unless that repo is out of time. In the latter case, if the user has no other repos to charge to, the job gets charged to the (negative) default repo; if the user has other valid repos to charge to, then the repo that gets charged is the first one with available time in the list returned by the getrepo utility. You can use NIM to change your default repo.

To request more time for a repo that is low on (or out of) time, the PI or PI Proxy should send an email to the project's DOE Allocation Manager.
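
The submission-time check in step 1 above can be sketched as follows. This is a simplified model, not the actual filter: the function name, the repo names, and the balances dictionary are hypothetical, and because the text says only that "one of those valid repos" is chosen, picking the first valid repo in dictionary order is an assumption.

```python
def choose_repo_at_submission(balances, default_repo, requested_repo=None):
    """Return the repo a batch job would be charged to, or None if submission fails.

    balances maps repo name -> remaining hours. Dict order stands in for the
    order in which valid repos are considered (an assumption; the real order
    is not documented here).
    """
    valid = [name for name, hours in balances.items() if hours > 0]
    if not valid:
        return None  # all of the user's repos are out of time
    if requested_repo is not None:
        # An explicitly requested repo must itself still have time.
        return requested_repo if requested_repo in valid else None
    if default_repo in valid:
        return default_repo
    return valid[0]  # default is out of time: fall back to another valid repo


# Hypothetical user with an exhausted default repo and one repo with time left.
balances = {"m100": 0.0, "m200": 50.0}
print(choose_repo_at_submission(balances, default_repo="m100"))
print(choose_repo_at_submission(balances, default_repo="m100", requested_repo="m100"))
```

The first call falls back to m200; the second returns None because the explicitly requested repo is out of time, which corresponds to a failed submission.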

Page last modified: Mon, 22 Sep 2008 17:17:55 GMT
Page URL: http://www.nersc.gov/nusers/accounts/mpp-charging.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov
