National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory
 

Running on Bassi

Job Launch Overview: POE and LoadLeveler

Bassi uses two software packages to run parallel programs: the Parallel Operating Environment (POE) executes parallel programs, and LoadLeveler schedules jobs. Users can interact with this IBM software in several ways and at several levels, which can be confusing, so a brief discussion of POE and LoadLeveler follows.

Parallel Operating Environment (POE)

POE is used to run parallel programs. It augments the basic AIX operating system with the software needed to run them. The command poe (all lower case) executes parallel programs. However, depending on the options used to compile an executable, the poe command may not be explicitly required to run it. POE recognizes environment variables and poe command-line flags that specify how a parallel program should run. Please see "Operation and Use" in the IBM Manuals for more on POE.

LoadLeveler

LoadLeveler is used in addition to POE to run parallel jobs. LoadLeveler is a "job management system" that schedules all parallel jobs, whether they are batch or interactive. More information on LoadLeveler can be found on the IBM Batch page.

When running a job in batch mode, a user submits to LoadLeveler a script that contains commands and LoadLeveler keywords. The values of the LoadLeveler keywords determine how the code executes (e.g. the number of nodes used, the number of tasks, etc.).

You control how your parallel job executes by specifying

  1. LoadLeveler keyword values (batch mode), and/or
  2. values passed to POE on the command line, and/or
  3. environment variables

In batch mode you should completely specify how your job should run using LoadLeveler keywords exclusively, if possible. NERSC recommends that you be as explicit as possible in your specifications in order to avoid confusion.
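
As an illustration of a fully specified batch job, here is a minimal sketch of a LoadLeveler script. node and total_tasks are LoadLeveler keywords; the job name, class name, wall-clock limit, and output file names are assumptions chosen for the example and should be replaced with values that are valid for your account.

```shell
#!/bin/sh
# Sketch of a LoadLeveler batch script. Lines starting with "# @" are
# LoadLeveler keywords; everything else is an ordinary shell command.
# The class name and time limit below are assumed values, not Bassi defaults.
# @ job_name         = example
# @ job_type         = parallel
# @ class            = debug
# @ node             = 4
# @ total_tasks      = 8
# @ wall_clock_limit = 00:10:00
# @ output           = example.$(jobid).out
# @ error            = example.$(jobid).err
# @ queue

# The task layout comes from the keywords above, so no poe flags are given.
poe ./a.out
```

Because the node and task counts are stated as keywords, nothing about the job layout depends on environment variables or on flags passed to poe.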

In interactive mode, poe command-line options override environment variable settings.
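
The precedence rule can be illustrated with a short sketch. It assumes MP_NODES is the environment variable that corresponds to the -nodes flag, and that ./a.out is your parallel executable. The script runs poe only if it is available and otherwise just prints the command.

```shell
#!/bin/sh
# Environment variable setting: ask for 2 nodes.
# (MP_NODES as the counterpart of -nodes is an assumption of this sketch.)
MP_NODES=2
export MP_NODES

# Command-line option: ask for 4 nodes. In interactive mode this value
# takes precedence over MP_NODES.
cmd="poe ./a.out -nodes 4"

if command -v poe >/dev/null 2>&1; then
    $cmd
else
    echo "poe not found on this system; on Bassi this would run: $cmd"
fi
```

With these settings the job would be expected to use 4 nodes, because the value given on the command line overrides the one in the environment.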

Avoid confusion! POE vs. LoadLeveler keywords and options

It is important to distinguish between LoadLeveler keywords and poe command-line options. In general, they do not have the same names. For example, node is a LoadLeveler keyword but not a poe command-line option; the corresponding poe option is called nodes, which is not a LoadLeveler keyword. Similarly, total_tasks is a LoadLeveler keyword but not a poe command-line option, so poe will completely ignore -total_tasks on the command line without warning or comment. For example, the following will run 4 tasks, rather than 8 tasks as might be expected:

 % poe ./a.out -nodes 4 -total_tasks 8 (does not work as expected!)

Because the default value of the MP_TASKS_PER_NODE POE environment variable is 1, this command line will run 1 task on each of 4 nodes, and ignore the total_tasks specification on the command line because it is not a valid poe command line option. See Interactive jobs.
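
To get the 8 tasks that the command above was meant to request, the tasks-per-node count has to be given in a form that poe understands. The sketch below assumes the -tasks_per_node flag, the command-line counterpart of the MP_TASKS_PER_NODE environment variable; check the POE documentation for the flags supported by the installed version. It only builds and prints the command so that it can be inspected before being run.

```shell
#!/bin/sh
# Desired layout: 2 tasks on each of 4 nodes.
nodes=4
tasks_per_node=2

# Total tasks = nodes * tasks per node (4 * 2 = 8).
total_tasks=$((nodes * tasks_per_node))

# -tasks_per_node is assumed here; -total_tasks is NOT a poe option.
cmd="poe ./a.out -nodes $nodes -tasks_per_node $tasks_per_node"

echo "Requesting $total_tasks tasks with: $cmd"
```

In a batch script the same layout would instead be requested with the LoadLeveler keywords node = 4 and total_tasks = 8.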


Page last modified: Wed, 19 Oct 2005 22:02:18 GMT
Page URL: http://www.nersc.gov/nusers/systems/bassi/running_jobs/overview.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov
