Historical: Running on Seaborg
| resource | hard limit |
|---|---|
| memory | 128 Megabytes |
| cpu time | 3600 seconds |
| max processes | 512 |
You cannot run interactive serial jobs on any of the "compute" nodes.
Interactive parallel programs are executed by the Parallel Operating Environment (POE) software. POE can run the job only on the "compute" nodes. Parallel jobs do not run on the login nodes that are used for interactive terminal sessions and interactive serial jobs.
Interactive jobs are scheduled by the LoadLeveler job scheduling software in the interactive class, so they are subject to the resource limits and policies of that class.
To run a parallel job interactively:
NOTE: Use of the poe command is optional for programs compiled with one of the "parallel" compiler invocations. Those programs will execute just as if you had typed poe at the command line. In fact, there is no way to run an executable compiled with an "mp" compiler as a "serial" program.
Options can be passed to poe on the command line or by setting environment variables. Command line options override the environment variable settings.
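As a minimal sketch of the two mechanisms, assuming a Bourne-style shell (under csh or tcsh, `setenv MP_PROCS 32` would replace `export MP_PROCS=32`), the snippet below sets values through the MP_PROCS and MP_NODES environment variables and then overrides one of them on the command line. The `command -v` guard and the `./a.out` program name are illustrative only; the guard simply keeps the snippet harmless on a machine without poe.

```shell
# Sketch only: assumes a Bourne-style shell; csh/tcsh users would use `setenv`.
export MP_PROCS=32   # same meaning as -procs 32
export MP_NODES=2    # same meaning as -nodes 2

if command -v poe >/dev/null 2>&1; then
    # Uses the environment settings: 32 tasks on 2 nodes.
    poe ./a.out
    # The command-line option overrides MP_PROCS for this run only: 16 tasks.
    poe ./a.out -procs 16
fi
```

Setting the variables once is convenient for repeated runs with the same layout, while the command-line form suits one-off changes.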
Interactive jobs are charged to your default repository, unless you specify otherwise. See Seaborg Accounts & Charging.
You should specify two of the following:
| POE command line option | POE Environment variable | Default | Description |
|---|---|---|---|
| -procs | MP_PROCS | 1 | The number of program tasks. |
| -nodes | MP_NODES | 1* | Specifies the number of physical nodes on which to run the parallel tasks. |
| -tasks_per_node | MP_TASKS_PER_NODE | 1* | Specifies the number of tasks to be run on each of the physical nodes. |
If you choose not to run 16 tasks per node, LoadLeveler will allocate tasks to the nodes in a balanced fashion.
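The arithmetic behind these settings can be sketched as follows. The page does not define "balanced", so this assumes it means an as-even-as-possible split; the actual placement chosen by LoadLeveler may differ. `spread_tasks` is a hypothetical helper written for illustration and is not part of POE or LoadLeveler.

```shell
# Hypothetical helper, not part of POE or LoadLeveler.
# Prints one task count per node for a given total and node count,
# assuming "balanced" means as even a split as possible.
spread_tasks() {
    procs=$1
    nodes=$2
    base=$((procs / nodes))    # every node gets at least this many tasks
    extra=$((procs % nodes))   # the first `extra` nodes get one more
    i=0
    out=""
    while [ "$i" -lt "$nodes" ]; do
        if [ "$i" -lt "$extra" ]; then
            n=$((base + 1))
        else
            n=$base
        fi
        out="$out${out:+ }$n"
        i=$((i + 1))
    done
    echo "$out"
}

spread_tasks 32 2   # prints: 16 16
spread_tasks 20 3   # prints: 7 7 6
```

The first call corresponds to fully packed nodes (16 tasks on each of 2 nodes); the second shows an uneven total spread over 3 nodes.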
A number of other flags are available. Two of the most useful are -retry N and -retrycount M, which tell poe to attempt to launch your parallel job up to M times, waiting N seconds between attempts. This is a good way to run an interactive job when the machine is busy.
Note that these are not the LoadLeveler keywords that are used in batch scripts, even though some of the names may be similar or even identical. Command-line arguments that poe does not recognize are passed to your program as arguments without warning or comment by poe.
For example, to run a parallel job with 32 total tasks on 2 "compute" nodes, with 10 attempts to run the job, waiting 30 seconds between attempts, you would type:
seaborg% poe ./a.out -procs 32 -nodes 2 -retry 30 -retrycount 10
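The effect of -retry and -retrycount in the command above can be pictured as a simple loop. The sketch below is an illustration of that behavior, not how poe is implemented: `launch_with_retry` is a hypothetical helper, and it treats any non-zero exit status as a failed launch, which is a simplification.

```shell
# Hypothetical helper illustrating the retry semantics; not part of POE.
# usage: launch_with_retry WAIT_SECONDS MAX_ATTEMPTS command [args...]
launch_with_retry() {
    wait_s=$1
    max=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0              # launch succeeded
        fi
        if [ "$attempt" -lt "$max" ]; then
            sleep "$wait_s"       # pause before the next attempt
        fi
        attempt=$((attempt + 1))
    done
    return 1                      # every attempt failed
}

# Stand-in commands so the sketch runs anywhere:
launch_with_retry 0 3 true  && echo "launched"
launch_with_retry 0 3 false || echo "gave up after 3 attempts"
```

With the values from the example, the equivalent call would be `launch_with_retry 30 10 poe ./a.out -procs 32 -nodes 2`, although using poe's own flags is the simpler choice.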
Page last modified: Tue, 22 Apr 2008 17:18:30 GMT. Page URL: http://www.nersc.gov/nusers/systems/SP/old_stuff/running_jobs/interactive.php. Web contact: webmaster@nersc.gov. Computing questions: consult@nersc.gov.