IBM Books

MPI Programming Guide


Chapter 11. POE environment variables and command-line flags

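This chapter contains tables that summarize the POE environment variables and command-line flags that are discussed throughout this book. You can set these variables and flags to influence the execution of parallel programs and the operation of certain tools. A command-line flag temporarily overrides its associated environment variable. The tables divide the variables and flags by function:
  • Partition manager control (Table 7)
  • Job specification (Table 8)
  • I/O control (Table 9)
  • Diagnostic information (Table 10)
  • Message passing (Table 11)
  • Core file generation (Table 12)
  • Other variables and flags (Table 13)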

You can use the POE command-line flags on the pdbx and poe commands. You can also use some of the following flags on program names when individually loading nodes from STDIN or a POE commands file. The flags you can use are mainly those having to do with parallel trace collection. They are:

In the tables that follow, a check mark (X) denotes those flags you can use when individually loading nodes. For more information on individually loading nodes, see IBM Parallel Environment for AIX: Operation and Use, Volume 1.
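
The precedence rule described above (a command-line flag temporarily overrides its associated environment variable) can be sketched as follows. This is only an illustration, not POE's own argument handling: the helper name effective_setting, the simplified "flag followed by value" scan, and the program name my_program are assumptions made for the example.

```python
def effective_setting(variable, flag, env, argv, default=None):
    """Return the value that would apply for one variable/flag pair.

    Hypothetical helper: looks for '<flag> <value>' in argv, then falls
    back to the environment variable, then to the documented default.
    """
    for i, arg in enumerate(argv[:-1]):
        if arg == flag:
            return argv[i + 1]  # the flag wins for this invocation only
    if variable in env:
        return env[variable]
    return default


# Assumed example environment and command line (my_program is hypothetical).
env = {"MP_PROCS": "4", "MP_LABELIO": "yes"}
argv = ["poe", "my_program", "-procs", "8"]

print(effective_setting("MP_PROCS", "-procs", env, argv, default="1"))      # flag overrides variable
print(effective_setting("MP_LABELIO", "-labelio", env, argv, default="no")) # variable is used
print(effective_setting("MP_PULSE", "-pulse", env, argv, default="600"))    # default is used
```

Because the override is temporary, the environment dictionary is never modified; the next invocation without the flag sees the variable's value again.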

Table 7. Variables and flags for partition manager control

Environment variable Command-line flag Set: Possible values Default

MP_ADAPTER_USE

-adapter_use

How the node's adapter should be used. The US communication subsystem library does not require dedicated use of the SP Switch on the node. Adapter use will be defaulted, but shared use may be specified. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information. One of the following strings:

dedicated
Only a single program task can use the adapter.

shared
A number of tasks on the node can use the adapter.

Default: shared (for IP jobs) or dedicated (for US jobs)

MP_CPU_USE

-cpu_use

How the node's CPUs should be used. The US communication subsystem library does not require unique CPU use on the node. CPU use will be defaulted, but multiple use may be specified. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information. One of the following strings:

multiple
Your program can share the node with other users.

unique
Only your program's tasks can use the node.

Default: multiple (for IP jobs) or unique (for US jobs)

MP_EUIDEVICE

-euidevice

The adapter set to use for message passing: Ethernet, Fiber Distributed Data Interface (FDDI), SP Switch, SP Switch2, or token ring. One of the following strings:

css0
SP Switch

csss
SP Switch2

en0
Ethernet

fi0
FDDI

tr0
token ring

Default: The adapter set that is used as the external network address.

MP_EUILIB

-euilib

The communication subsystem library implementation to use for communication: either the IP communication subsystem or the User Space (US) communication subsystem. Programs that use LAPI must set MP_EUILIB (or -euilib) to us. In order to use the US communication subsystem, you must have an SP system configured with its SP Switch feature. One of the following strings:

ip
The IP communication subsystem.

us
The US communication subsystem.

Note: This specification is case-sensitive.

Default: ip

MP_EUILIBPATH

-euilibpath

The path to the message passing and communication subsystem libraries. This needs to be set only if the libraries are moved or an alternate set is being used.

Possible values: Any path specifier.

Default: /usr/lpp/ppe.poe/lib

MP_HOSTFILE

-hostfile or -hfile

The name of a host list file for node allocation.

Possible values: Any file specifier or the word NULL.

Default: host.list in the current directory.

MP_NODES

-nodes

To specify the number of processor nodes on which to run the parallel tasks. It can be used alone or in conjunction with MP_PROCS or MP_TASKS_PER_NODE or both. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information.

Possible values: Any number from 1 to the maximum supported configuration.

Default: None

MP_PMD_VERSION

-pmd_version

The version of the partition manager daemon to be used for parallel jobs.

2
The Parallel Environment version 2 partition manager daemon (pmv2).

3
The Parallel Environment version 3 partition manager daemon (pmv3).

Default: 3 (the Parallel Environment version 3 partition manager daemon, pmv3)

MP_PROCS

-procs

The number of program tasks.

Possible values: Any number from 1 to the maximum supported configuration.

Default: 1

MP_PULSE

-pulse

The interval (in seconds) at which POE checks the remote nodes to ensure that they are actively communicating with the home node.

Note: Pulse is ignored for pdbx.

Possible values: An integer greater than or equal to 0. A value of 0 turns the pulse off.

Default: 600

MP_REMOTEDIR

(no associated command-line flag)

The name of a script which echoes the name of the current directory to be used on the remote nodes.

Possible values: Any file specifier.

Default: None

MP_RESD

-resd

Whether or not the partition manager should connect to LoadLeveler to allocate nodes.

Note: When running POE from a workstation that is external to the LoadLeveler cluster, the LoadL.so fileset must be installed on the external node (see Using and Administering LoadLeveler and IBM Parallel Environment for AIX: Installation for more information).

Possible values: yes, no

Default: None

MP_RETRY

-retry

The period of time (in seconds) between processor node allocation retries if there are not enough processor nodes immediately available to run a program. This is valid only if you are using LoadLeveler.

Possible values: An integer greater than or equal to 0.

Default: 0 (no retry)

MP_RETRYCOUNT

-retrycount

The number of times (at the interval set by MP_RETRY) that the partition manager should attempt to allocate processor nodes.

Possible values: An integer greater than or equal to 0.

Default: 0

MP_RMPOOL

-rmpool

The name or number of the pool that should be used for non-specific node allocation. This environment variable and command-line flag apply only to LoadLeveler.

Possible values: An identifying pool name or number.

Default: None

MP_SAVEHOSTFILE

-savehostfile

The name of an output host list file to be generated by the partition manager.

Possible values: Any relative or full path name.

Default: None

MP_TASKS_PER_NODE

-tasks_per_node

To specify the number of tasks to be run on each of the physical nodes. It may be used in conjunction with MP_NODES or MP_PROCS or both, but may not be used alone. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information.

Possible values: Any number from 1 to the maximum supported configuration.

Default: None
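
The combination rules stated for MP_NODES, MP_PROCS, and MP_TASKS_PER_NODE can be checked before a job is started. The sketch below encodes only what the table says (MP_TASKS_PER_NODE may not be used alone, and each value is a number of at least 1); the function name task_layout_problems is hypothetical, and POE itself may apply further checks that are not modelled here.

```python
def task_layout_problems(env):
    """Return a list of problems found in the task layout variables.

    Hypothetical pre-flight check based only on the rules in the table.
    """
    problems = []
    names = ("MP_NODES", "MP_PROCS", "MP_TASKS_PER_NODE")

    # MP_TASKS_PER_NODE needs MP_NODES or MP_PROCS (or both) alongside it.
    if "MP_TASKS_PER_NODE" in env and "MP_NODES" not in env and "MP_PROCS" not in env:
        problems.append("MP_TASKS_PER_NODE may not be used alone")

    # Each value must be a number from 1 upward.
    for name in names:
        value = env.get(name)
        if value is not None and (not value.isdigit() or int(value) < 1):
            problems.append(name + " must be a number of at least 1")

    return problems


print(task_layout_problems({"MP_TASKS_PER_NODE": "4"}))
print(task_layout_problems({"MP_NODES": "2", "MP_TASKS_PER_NODE": "4"}))
```

The upper bound ("the maximum supported configuration") is installation-specific, so it is deliberately left unchecked.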

Table 8. Variables and flags for job specification

Environment variable Command-line flag Set: Possible values Default

MP_CKPTDIR

-ckptdir

The directory where the checkpoint file will reside. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information.

Possible values: Any path specifier

Default: The directory from which POE is run

MP_CKPTFILE

-ckptfile

The base name of the checkpoint file. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information.

Possible values: Any file specifier

Default: None

MP_CMDFILE

-cmdfile

The name of a POE commands file used to load the nodes of your partition. If set, POE will read the commands file rather than STDIN.

Possible values: Any file specifier

Default: None

MP_LLFILE

-llfile

The name of a LoadLeveler job command file for node allocation. If you are performing specific node allocation, you can use a LoadLeveler job command file in conjunction with a host list file. If you do, the specific nodes listed in the host list file will be requested from LoadLeveler.

Possible values: Any path specifier

Default: None

MP_NEWJOB

-newjob

Whether or not the partition manager maintains your partition for multiple job steps.

Possible values: yes, no

Default: no

MP_PGMMODEL

-pgmmodel

The programming model you are using.

Possible values: mpmd, spmd

Default: spmd

MP_SAVE_LLFILE

-save_llfile

When using LoadLeveler for node allocation, the name of the output LoadLeveler job command file to be generated by the partition manager. The output LoadLeveler job command file will show the LoadLeveler settings that result from the POE environment variables and/or command-line options for the current invocation of POE. If you use the MP_SAVE_LLFILE environment variable for a batch job, or when the MP_LLFILE environment variable is set (indicating that a LoadLeveler job command file should participate in node allocation), POE will show a warning and will not save the output job command file.

Possible values: Any relative or full path name.

Default: None


Table 9. Variables and flags for I/O control

Environment variable Command-line flag Set: Possible values Default

MP_HOLD_STDIN

(no associated command-line flag)

Whether or not sending of STDIN from the home node to the remote nodes is deferred until the message passing partition has been established.

Possible values: yes, no

Default: no

MP_LABELIO

-labelio

Whether or not output from the parallel tasks is labeled by task id.

Possible values: yes, no

Default: no (yes for pdbx)

MP_STDINMODE

-stdinmode

The input mode. This determines how input is managed for the parallel tasks.

all
All tasks receive the same input data from STDIN.

none
No tasks receive input data from STDIN; STDIN will be used by the home node only.

task-id
STDIN is only sent to the task specified by task-id.

Default: all

MP_STDOUTMODE

-stdoutmode

The output mode. This determines how STDOUT is handled by the parallel tasks. One of the following:

ordered
Output data from each parallel task is written to its own buffer. Later, all buffers are flushed, in task order, to STDOUT.

unordered
All tasks write output data to STDOUT asynchronously.

task-id
Only the task specified by task-id writes output data to STDOUT.

Default: unordered

Table 10. Variables and flags for diagnostic information

Environment variable Command-line flag Set: Possible values Default
Note:
The checkmark symbol (X) denotes flags you can use when individually loading nodes.

MP_DEBUG_INITIAL_STOP

(no associated command-line flag)

The initial breakpoint in the application where pdbx will get control.

Possible values: One of the following:
"filename":line_number
function_name

Default: The first executable source line in the main routine.

MP_INFOLEVEL

-infolevel X

-ilevel X

The level of message reporting. One of the following integers:

0
Error.

1
Warning and error.

2
Informational, warning, and error.

3
Informational, warning, and error. Also reports high-level diagnostic messages for use by the IBM Support Center.

4
Informational, warning, and error. Also reports high- and low-level diagnostic messages for use by the IBM Support Center.

5
Informational, warning, and error. Also reports high- and low-level diagnostic messages for use by the IBM Support Center.

6
Informational, warning, and error. Also reports high- and low-level diagnostic messages for use by the IBM Support Center.

Default: 1

MP_PMDLOG

-pmdlog

Whether or not diagnostic messages should be logged to a file in /tmp on each of the remote nodes. Typically, this environment variable or command-line flag is used only under the direction of the IBM Support Center in resolving a PE-related problem.

Possible values: yes, no

Default: no

MP_PMDSUFFIX

(no associated command-line flag)

Determines a string to be appended to the partition manager daemon service, or executable (when using LoadLeveler).

The PMD service in /etc/services is named pmv3. By setting MP_PMDSUFFIX, you can append a string to pmv3. If MP_PMDSUFFIX is set to abc, for example, the service requested in /etc/services is pmv3abc.

When using LoadLeveler, the string is appended to the partition manager daemon executable name, /etc/pmdv3. By setting MP_PMDSUFFIX, you can append a string to pmdv3. If MP_PMDSUFFIX is set to abc, for example, the partition manager daemon that gets run on each node is /etc/pmdv3abc.

If the MP_PMD_VERSION environment variable (or -pmd_version flag) is set to a value of 2, the string is instead appended to the service name pmv2 and to the executable name pmdv2, respectively.

This permits testing of alternate versions of the partition manager daemon. Typically, this environment variable is used only under the direction of the IBM Support Center in resolving a PE-related problem.

Possible values: Any string

Default: None
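
The naming rule for MP_PMDSUFFIX can be illustrated with a short sketch. The function pmd_names is hypothetical and simply concatenates strings as described above; the /etc/ location for the version 2 executable is an assumption carried over from the version 3 path, since the text gives only the name pmdv2.

```python
def pmd_names(suffix="", pmd_version=3):
    """Return (service name, executable path) for a given suffix.

    Hypothetical helper mirroring the description of MP_PMDSUFFIX and
    MP_PMD_VERSION; it does not query /etc/services or the file system.
    """
    if pmd_version not in (2, 3):
        raise ValueError("MP_PMD_VERSION must be 2 or 3")
    service = "pmv" + str(pmd_version) + suffix
    # Assumption: the version 2 daemon also lives in /etc, like /etc/pmdv3.
    executable = "/etc/pmdv" + str(pmd_version) + suffix
    return service, executable


print(pmd_names("abc"))      # ('pmv3abc', '/etc/pmdv3abc')
print(pmd_names())           # ('pmv3', '/etc/pmdv3')
```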

Table 11. Variables and flags for message passing

Environment variable Command-line flag Set: Possible values Default

MP_BUFFER_MEM

-buffer_mem

To change the maximum size of memory used by the communication subsystem to buffer early arrivals.

Possible values: A size in one of these formats:
nnnnn
nnnK (where K = 1024 bytes)
nnM (where M = 1024*1024 bytes)

Default: 64MB (US) or 2 800 000 bytes (IP)
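
Several variables in this table (MP_BUFFER_MEM, MP_EAGER_LIMIT, MP_IO_BUFFER_SIZE, MP_THREAD_STACKSIZE) accept sizes written as a plain byte count or with a K or M suffix. The sketch below converts such a string to bytes using the multipliers given above. The function name parse_size is hypothetical, and whether POE accepts lowercase suffixes or surrounding whitespace is not stated here, so this sketch accepts only the uppercase forms shown.

```python
def parse_size(text):
    """Convert 'nnnnn', 'nnnK', or 'nnM' to a number of bytes.

    Hypothetical helper: K = 1024 bytes, M = 1024*1024 bytes, as in the table.
    Raises ValueError for anything else.
    """
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:]
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)


print(parse_size("64M"))      # 67108864
print(parse_size("512K"))     # 524288
print(parse_size("2800000"))  # 2800000
```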

MP_CLOCK_SOURCE

-clock_source

To use the SP Switch clock as a time source.

See Using the SP Switch clock as a time source for more information.

Possible values: AIX, SWITCH

Default: None. See Table 2 for more information.

MP_CSS_INTERRUPT

-css_interrupt

To specify whether or not arriving packets generate interrupts.

Using this environment variable may provide better performance for certain applications. Setting this variable explicitly will suppress the MPI-directed switching of interrupt mode, leaving the user in control for the rest of the run. For more information, see MPI_FILE_OPEN in IBM Parallel Environment for AIX: MPI Subroutine Reference.

Possible values: yes, no

Default: no

MP_EAGER_LIMIT

-eager_limit

To change the threshold value for message size, above which rendezvous protocol is used.

To ensure that at least 32 messages can be outstanding between any two tasks, MP_EAGER_LIMIT is adjusted based on the number of tasks, according to the default values listed below, when the user has specified neither MP_BUFFER_MEM, nor MP_EAGER_LIMIT, nor MP_USE_FLOW_CONTROL.

Possible values: An integer less than or equal to 64K, in one of these formats:
nnnnn
nnnK (where K = 1024 bytes)

Default: Depends on the number of tasks:
Number        Default
of Tasks        Value
---------------------
  1 to  16       4096
 17 to  32       2048
 33 to  64       1024
 65 to 128        512
129 to 256        256
257 to the        128
    maximum
    number of tasks
    supported by the
    implementation
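
The default values in the table above can be expressed as a simple lookup. This sketch only reproduces the table; the function name default_eager_limit is hypothetical, and the lookup applies only when none of MP_BUFFER_MEM, MP_EAGER_LIMIT, and MP_USE_FLOW_CONTROL has been specified.

```python
def default_eager_limit(tasks):
    """Return the default MP_EAGER_LIMIT (in bytes) for a number of tasks.

    Hypothetical helper that mirrors the default table; it does not model
    any adjustment made when the user sets related variables explicitly.
    """
    if tasks < 1:
        raise ValueError("the number of tasks must be at least 1")
    # (largest task count in the range, default eager limit in bytes)
    table = ((16, 4096), (32, 2048), (64, 1024), (128, 512), (256, 256))
    for upper, limit in table:
        if tasks <= upper:
            return limit
    return 128  # 257 tasks up to the maximum supported


for n in (8, 17, 100, 300):
    print(n, default_eager_limit(n))
```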

MP_HINTS_FILTERED

-hints_filtered

To specify whether or not MPI info objects reject hints (key and value pairs) that are not meaningful to the MPI implementation.

Possible values: yes, no

Default: yes

MP_IO_BUFFER_SIZE

-io_buffer_size

To specify the default size of the data buffer used by MPI-IO agents.

Possible values: An integer less than or equal to 128M, in one of these formats:
nnnnn
nnnK (where K = 1024 bytes)
nnnM (where M = 1024*1024 bytes)

Default: The number of bytes that corresponds to 16 file blocks.

MP_IO_ERRLOG

-io_errlog

To specify whether or not to turn on I/O error logging.

Possible values: yes, no

Default: no

MP_IONODEFILE

-ionodefile

To specify the name of a parallel I/O node file, a text file that lists the nodes that should be handling parallel I/O. Setting this variable enables you to limit the number of nodes that participate in parallel I/O or to guarantee that all I/O operations are performed on the same node. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information.

Possible values: Any relative path name or full path name.

Default: None. All nodes will participate in parallel I/O.

MP_INTRDELAY

-intrdelay

To tune the delay parameter (in microseconds) without having to recompile existing applications.

Possible values: An integer greater than or equal to 0

Default: 1

MP_MAX_TYPEDEPTH

-max_typedepth

To change the maximum depth of message derived datatypes.

Possible values: An integer greater than or equal to 1

Default: 5

MP_MSG_API

-msg_api

To indicate to POE which message-passing API is being used by the parallel tasks. You need to set this environment variable if a parallel task is using LAPI alone or in conjunction with MPI. You do not need to set it if a parallel task is using MPI only.

Possible values:
MPI
LAPI
MPI,LAPI

Default: MPI

MP_PIPE_SIZE

-pipe_size

To use the selected pipe size (16KB, 32KB, or 64KB) as the transmission buffer to communicate between tasks in a job. This is effective only if you are using the SP Switch2 and user space jobs.

Possible values: 16, 32, or 64

Default: 64

MP_POLLING_INTERVAL

-polling_interval

To change the polling interval (in microseconds).

Possible values: An integer between 1 and 2 billion

Default: 180 000 (IP) or 400 000 (US)

MP_RETRANSMIT_INTERVAL

-retransmit_interval

To control how often the communication subsystem library checks to see if it should retransmit packets that have not been acknowledged. The value nnnn is the number of polling loops between checks.

Possible values: An integer between 1000 and 400 000

Default: 10 000 (IP) or 400 000 (US)

MP_SHARED_MEMORY

-shared_memory

To specify the use of shared memory (instead of IP or the SP Switch) for message passing between tasks running on the same node.

Possible values: yes, no

Default: no

MP_SINGLE_THREAD

-single_thread

To avoid lock overheads in a program that is known to be single-threaded. Neither MPI-IO nor MPI 1-sided communication can be used if this variable is set to yes. Results are undefined if this variable is set to yes with multiple message threads in use. See Using MPI_INIT or MPI_INIT_THREAD for more information.

Possible values: yes, no

Default: no

MP_SYNC_ON_CONNECT

(no associated command-line flag)

To disable the internal synchronization of MPI initialization, thereby reducing the amount of network traffic. Applications that involve a large number of tasks may benefit from setting this environment variable to no.

MP_SYNC_ON_CONNECT should be set to no only when it is certain that the network hardware and software are sound, and the application involves a large number of tasks that might otherwise flood the network.

Possible values: yes, no

Default: yes or, if MP_PROCS > 128, no

MP_THREAD_STACKSIZE

-thread_stacksize

To specify the additional stack size allocated for user subroutines running on an MPI service thread. If you do not allocate enough space, the program may encounter a SIGSEGV exception or more subtle failures.

Possible values: A size in one of these formats:
nnnnn
nnnK (where K = 1024 bytes)
nnM (where M = 1024*1024 bytes)

Default: 0

MP_TIMEOUT

(no associated command-line flag)

To change the length of time (in seconds) the communication subsystem will wait for a connection to be established during message-passing initialization.

Possible values: Any number greater than 0. If set to 0 or a negative number, the value is ignored.

Default: 150

MP_USE_FLOW_CONTROL

-use_flow_control

To throttle the sender before the number of outstanding eager send messages can overflow the early arrival buffer at a destination. Flow control ensures that programs with weak synchronization and aggressive use of small messages will never overflow early arrival buffers.

When flow control is on, the setting of this variable affects the default value of MP_EAGER_LIMIT for 16 or more tasks, if the user has specified neither MP_BUFFER_MEM nor MP_EAGER_LIMIT.

Possible values: yes, no

Default: yes

MP_WAIT_MODE

-wait_mode

To specify how a thread or task behaves when it discovers it is blocked, waiting for a message to arrive.

Possible values: nopoll, poll, sleep, yield

Default:
poll (for US)
poll (for IP on SMP nodes, if the number of tasks for this job on the node does not exceed the number of processors on the node)
sleep (otherwise)
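
The default selection for MP_WAIT_MODE depends on the communication subsystem and on how heavily the node is loaded. The sketch below restates that rule as code. The function name default_wait_mode and its parameters are hypothetical, and treating "SMP node" as "a node with more than one processor" is an assumption made for the example.

```python
def default_wait_mode(euilib, tasks_on_node=1, processors_on_node=1):
    """Return the default MP_WAIT_MODE described in the table.

    Hypothetical helper. euilib is the MP_EUILIB value ('us' or 'ip');
    the other arguments describe this job's tasks on one node.
    """
    if euilib == "us":
        return "poll"
    if euilib == "ip":
        is_smp = processors_on_node > 1  # assumption: SMP means more than one CPU
        if is_smp and tasks_on_node <= processors_on_node:
            return "poll"
        return "sleep"
    raise ValueError("MP_EUILIB must be 'ip' or 'us' (case-sensitive)")


print(default_wait_mode("us"))
print(default_wait_mode("ip", tasks_on_node=4, processors_on_node=8))
print(default_wait_mode("ip", tasks_on_node=16, processors_on_node=8))
```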


Table 12. Variables and flags for core file generation

Environment variable Command-line flag Set: Possible values Default

MP_COREDIR

-coredir


To create a separate directory for each task's core file.

Possible values: Any valid directory name, or none to bypass creating a new directory.

Default: coredir.taskid

MP_COREFILE_FORMAT

-corefile_format

The format of core files generated when processes terminate abnormally.

Possible values: The string STDERR (to specify that the lightweight core file information should be written to standard error) or any other string (to specify the lightweight core file name).

Default: Standard AIX core files are generated if this variable (or flag) is not set or specified.

Table 13. Other variables and flags

Environment variable Command-line flag Set: Possible values Default
MP_DBXPROMPTMOD

(no associated command-line flag)

A modified dbx prompt. The dbx prompt \n(dbx) is used by the pdbx command as an indicator denoting that a dbx subcommand has completed. This environment variable modifies that prompt. Any value assigned to it will have a "." prepended and will then be inserted in the \n(dbx) prompt between the "x" and the ")". This environment variable is useful when the string \n(dbx) is present in the output of the program being debugged.

Possible values: Any string.

Default: None
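
The prompt rewriting rule for MP_DBXPROMPTMOD can be made concrete with a small sketch. The function name modified_dbx_prompt is hypothetical; it only performs the string insertion described above and does not interact with dbx or pdbx.

```python
def modified_dbx_prompt(value=None):
    """Return the dbx prompt that pdbx would look for.

    Hypothetical helper: a '.' is prepended to the value, and the result is
    inserted between the 'x' and the ')' of the standard prompt.
    """
    prompt = "\n(dbx)"
    if not value:
        return prompt  # variable not set: the prompt is unchanged
    return prompt[:-1] + "." + value + ")"


print(repr(modified_dbx_prompt()))        # '\n(dbx)'
print(repr(modified_dbx_prompt("task")))  # '\n(dbx.task)'
```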

MP_EUIDEVELOP

-euidevelop

Whether or not the message passing interface performs more detailed checking during execution. This additional checking is intended for developing applications, and can significantly slow performance. You can also enable and disable parameter checking with deb (for "debug") and min (for "minimum").

Possible values:
  • deb (for "debug")
  • min (for "minimum")
  • no or nor (for "normal")
  • yes (for "develop")

Default: no

MP_FENCE

(no associated command-line flag)

A "fence" character string for separating arguments you want parsed by POE from those you do not.

Possible values: Any string.

Default: None

MP_NOARGLIST

(no associated command-line flag)

Whether or not POE ignores the argument list. If set to yes, POE will not attempt to remove POE command-line flags before passing the argument list to the user's program.

Possible values: yes, no

Default: no

MP_PMLIGHTS

-pmlights

The number of lights displayed (per row) on the program marker array.

Possible values: An integer greater than or equal to 0.

Default: 0

MP_PRIORITY

(no associated command-line flag)

A dispatch priority class for execution. See IBM Parallel Environment for AIX: Installation for more information on dispatch priority classes.

Possible values: Any of the dispatch priority classes set up by the system administrator in the file /etc/poe.priority.

Default: None

MP_USRPORT

-usrport

The port ID used by the partition manager to connect to the program marker array.

Possible values: Any positive integer less than 32 767. Standard TCP/IP practice suggests using ports greater than 5000 and less than 10 000.

Default: 9999

