Operation and Use, Volume 1, Using the Parallel Operating Environment
NAME
poe - Invokes the Parallel Operating Environment (POE) for
loading and executing programs on remote processor nodes.
SYNOPSIS
poe [-h] [program] [program_options]...
[-adapter_use adapter_specifier]
[-cpu_use cpu_specifier]
[-euidevice device_specifier]
[-euilib {ip | us}]
[-euilibpath path_specifier]
[{-hostfile | -hfile} host_file_name]
[-pmd_version {2 | 3}]
[-procs partition_size]
[-pulse interval]
[-resd {yes | no}]
[-retry retry_interval]
[-retrycount retry_count]
[-msg_api {MPI | LAPI | MPI,LAPI}]
[-rmpool pool_ID]
[-nodes number_of_nodes]
[-tasks_per_node number_of_tasks]
[-savehostfile output_file_name]
[-cmdfile commands_file]
[-llfile loadleveler_job_command_file_name]
[-newjob {yes | no}]
[-pgmmodel {spmd | mpmd}]
[-save_llfile output_file_name]
[-labelio {yes | no}]
[-stdinmode {all | none | task_ID}]
[-stdoutmode {unordered | ordered | task_ID}]
[{-infolevel | -ilevel} message_level]
[-pmdlog {yes | no}]
[-buffer_mem memory_size]
[-clock_source {AIX | SWITCH}]
[-css_interrupt {yes | no}]
[-eager_limit size_limit]
[-hints_filtered {yes | no}]
[-intrdelay delay_parameter]
[-max_typedepth maximum_depth]
[-ionodefile io_node_file_name]
[-shared_memory {yes | no}]
[-use_flow_control {yes | no}]
[-thread_stacksize stacksize]
[-single_thread {no | yes}]
[-wait_mode {poll | yield | sleep}]
[-polling_interval interval]
[-retransmit_interval interval]
[-io_buffer_size buffer_size]
[-io_errlog {yes | no}]
[-coredir directory_prefix_string | none]
[-corefile_format { lightweight_corefile_name | STDERR }]
[-euidevelop {yes | no | deb | min | nor}]
[-pmlights number_of_lights]
[-usrport port_ID] [fence_string additional_options]
The poe command invokes the Parallel Operating Environment for
loading and executing programs on remote processor nodes. The operation
of POE is influenced by a number of POE environment variables. The flag
options on this command are each used to temporarily override one of these
environment variables. User program_options can be freely
interspersed with the flag options. If no program is
specified, POE will either prompt you for programs to load, or, if the
MP_CMDFILE environment variable is set, will load the partition
using the specified commands file.
FLAGS
The -h flag, when used, must appear immediately after
poe, and causes the poe man page, if it exists, to be
printed to the screen.
The remaining flags you can specify on this command are used to temporarily
override POE environment variables. For more information on valid
values, and on what a particular flag sets, refer to the description of its
associated environment variable in the ENVIRONMENT VARIABLES section.
The following flags are grouped by function.
The following Partition Manager control flags override the associated
environment variables.
- -adapter_use
- MP_ADAPTER_USE
- -cpu_use
- MP_CPU_USE
- -euidevice
- MP_EUIDEVICE
- -euilib
- MP_EUILIB
- -euilibpath
- MP_EUILIBPATH
- -hostfile or -hfile
- MP_HOSTFILE
- -pmd_version
- MP_PMD_VERSION
- -procs
- MP_PROCS
- -pulse
- MP_PULSE
- -resd
- MP_RESD
- -retry
- MP_RETRY
- -retrycount
- MP_RETRYCOUNT
- -msg_api
- MP_MSG_API
- -rmpool
- MP_RMPOOL
- -nodes
- MP_NODES
- -tasks_per_node
- MP_TASKS_PER_NODE
- -savehostfile
- MP_SAVEHOSTFILE
The following Job Specification flags override the associated environment
variables.
- -cmdfile
- MP_CMDFILE
- -llfile
- MP_LLFILE
- -newjob
- MP_NEWJOB
- -pgmmodel
- MP_PGMMODEL
- -save_llfile
- MP_SAVE_LLFILE
The following I/O Control flags override the associated environment
variables.
- -labelio
- MP_LABELIO
- -stdinmode
- MP_STDINMODE
- -stdoutmode
- MP_STDOUTMODE
The following flags for the generation of diagnostic information override the
associated environment variables.
- -infolevel or -ilevel
- MP_INFOLEVEL
- -pmdlog
- MP_PMDLOG
The following Message Passing flags override the associated environment
variables.
- -buffer_mem
- MP_BUFFER_MEM
- -clock_source
- MP_CLOCK_SOURCE
- -css_interrupt
- MP_CSS_INTERRUPT
- -eager_limit
- MP_EAGER_LIMIT
- -hints_filtered
- MP_HINTS_FILTERED
- -intrdelay
- MP_INTRDELAY
- -max_typedepth
- MP_MAX_TYPEDEPTH
- -ionodefile
- MP_IONODEFILE
- -shared_memory
- MP_SHARED_MEMORY
- -use_flow_control
- MP_USE_FLOW_CONTROL
- -thread_stacksize
- MP_THREAD_STACKSIZE
- -single_thread
- MP_SINGLE_THREAD
- -wait_mode
- MP_WAIT_MODE
- -polling_interval
- MP_POLLING_INTERVAL
- -retransmit_interval
- MP_RETRANSMIT_INTERVAL
- -io_buffer_size
- MP_IO_BUFFER_SIZE
- -io_errlog
- MP_IO_ERRLOG
The following corefile generation flags override the associated
environment variables.
- -coredir
- MP_COREDIR
- -corefile_format
- MP_COREFILE_FORMAT
The following are miscellaneous flags:
- -euidevelop
- Overrides the MP_EUIDEVELOP environment variable.
- -pmlights
- Determines the number of lights displayed (per row) on the Program Marker
Array. This overrides the MP_PMLIGHTS environment
variable. For more information on the Program Marker Array, refer to
the manual page for the pmarray command.
- -usrport
- Overrides the MP_USRPORT environment variable.
DESCRIPTION
The poe command invokes the Parallel Operating Environment for
loading and executing programs on remote nodes. You can enter it at
your home node to:
- load and execute an SPMD program on all nodes of your partition.
- individually load the nodes of your partition with an MPMD job.
- load and execute a series of SPMD and MPMD programs, in individual job
steps, on the same partition.
- run non-parallel programs on remote nodes.
The operation of POE is influenced by a number of POE environment
variables. The flag options on this command are each used to
temporarily override one of these environment variables. User
program_options can be freely interspersed with the flag options, and
additional_options not to be parsed by POE can be placed after a
fence_string defined by the MP_FENCE environment
variable. If no program is specified, POE will either prompt
you for programs to load, or, if the MP_CMDFILE environment
variable is set, will load the partition using the specified commands
file.
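The precedence described above can be sketched in shell. The helper name
effective_value is hypothetical and is not part of POE. It only illustrates
that a value given with a flag wins over the matching environment variable,
which in turn wins over the documented default (1 in the case of MP_PROCS).

```shell
# Hypothetical helper, not part of POE: pick the value POE would act on.
# $1 = value given with a command-line flag (empty if the flag is absent)
# $2 = value of the matching environment variable (empty if unset)
# $3 = documented default
effective_value() {
    if [ -n "$1" ]; then
        echo "$1"
    elif [ -n "$2" ]; then
        echo "$2"
    else
        echo "$3"
    fi
}

MP_PROCS=4
export MP_PROCS

# Corresponds to: poe sample -procs 6   (the flag overrides MP_PROCS)
effective_value 6 "$MP_PROCS" 1

# Corresponds to: poe sample            (MP_PROCS is used as set)
effective_value "" "$MP_PROCS" 1
```

The override is temporary: MP_PROCS still holds 4 after the first call, which
matches the way a flag affects only the poe invocation it appears on.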
The environment variables and flags that influence the operation of this
command fall into distinct categories of function. They are:
- Partition Manager control. The environment variables and
flags in this category determine the method of node allocation, message
passing mechanism, and the PULSE monitor function.
- Job specification. The environment variables and flags
in this category determine whether or not the Partition Manager should
maintain the partition for multiple job steps, whether commands should be read
from a file or STDIN, and how the partition should be loaded.
- I/O control. The environment variables and flags in this
category determine how I/O from the parallel tasks should be handled.
These environment variables and flags set the input and output modes, and
determine whether or not output is labeled by task id.
- Generation of diagnostic information. The environment
variables and flags in this category enable you to generate diagnostic
information that may be required by the IBM Support Center in resolving
PE-related problems.
- Message Passing Interface. The environment variables and
flags in this category enable you to specify values for tuning message passing
applications.
- Corefile generation. The environment variables and
flags in this category govern aspects of corefile generation including the
directory name into which corefiles will be saved, or the corefile format
(standard AIX or lightweight).
- Miscellaneous. The additional environment variables and
flags in this category enable additional error checking, and set a dispatch
priority class for execution.
ENVIRONMENT VARIABLES
The environment variable descriptions in this section are grouped by
function.
The following environment variables are associated with Partition Manager
control.
- MP_ADAPTER_USE
- Determines how the node's adapter should be used. The US
communication subsystem library does not require dedicated use of the SP
switch on the node. Adapter use will be defaulted, as in Table 4, but shared usage may be specified. Valid values are
dedicated and shared. If not set, the default is
dedicated for US jobs, or shared for IP jobs. The value of this
environment variable can be overridden using the -adapter_use
flag.
- MP_CPU_USE
- Determines how the node's CPUs should be used. The US
communication subsystem library does not require unique CPU use on the
node. CPU use will be defaulted, as in Table 4, but multiple use may be specified. Valid values are
multiple and unique. If not set, the default is
unique for US jobs, or multiple for IP jobs. The
value of this environment variable can be overridden using the
-cpu_use flag.
- MP_EUIDEVICE
- Determines the adapter set to use for message passing. Valid values
are en0 (for Ethernet), fi0 (for FDDI), tr0 (for
token-ring), css0 (for the SP system's high performance switch
feature), and csss (for the SP switch 2 high performance
adapter).
- MP_EUILIB
- Determines the communication subsystem library implementation to use for
communication - either the IP communication subsystem or the US
communication subsystem. In order to use the US communication
subsystem, you must have an SP system configured with its high performance
switch feature. Valid values, which are case-sensitive, are ip (for the
IP communication subsystem) or us (for the US communication
subsystem). The value of this environment variable can be overridden
using the -euilib flag.
- MP_EUILIBPATH
- Determines the path to the message passing and communication subsystem
libraries. This only needs to be set if an alternate library path is desired. Valid values are any path specifier. The value of this
environment variable can be overridden using the -euilibpath
flag.
- MP_HOSTFILE
- Determines the name of a host list file for node allocation. Valid
values are any file specifier. If not set, the default is
host.list in your current directory. The value of
this environment variable can be overridden using the -hostfile or
-hfile flags.
- MP_PMD_VERSION
- Determines which version of the partition manager daemon should be
used. It is intended for situations where you need to submit a parallel
job from a POE version 3 node to a set of nodes that have not yet been
upgraded to version 3. Valid values are 2 (use the version 2
partition manager daemon -- pmv2) and 3 (use the version 3
partition manager daemon -- pmv3). If not set, the default is to
use the version 3 partition manager daemon. You can override this
environment variable with the -pmd_version flag.
- MP_PROCS
- Determines the number of program tasks. Valid values are any number
from 1 to 4096. If not set, the default is 1. The value of this
environment variable can be overridden using the -procs flag.
- MP_PULSE
- The interval (in seconds) at which POE checks the remote nodes to ensure
that they are communicating with the home node. The default interval is
600 seconds (10 minutes). To disable the pulse function, specify an
interval of 0 (zero) seconds. The pulse function is automatically
disabled when running the pdbx debugger. You can override
the value of this environment variable with the -pulse flag.
- MP_REMOTEDIR
- Specifies the name of a script which echoes the name of the current
directory to be used on the remote nodes. By default, the current
directory is the current directory at the time that POE is run. You may
need to specify this if the AutoMount Daemon is used to mount user file
systems, and the user is not using the Korn shell.
The script mpamddir is provided for mapping the C shell
directory name to an AutoMount Daemon name.
- MP_RESD
- Determines whether or not the Partition Manager should connect to
LoadLeveler to allocate nodes. Valid values are either yes or
no, and there is no default. The value of this environment
variable can be overridden using the -resd flag.
- MP_RETRY
- Determines the period of time (in seconds) between processor node
allocation retries if there are not enough processor nodes immediately
available to run a program. This is only valid if you are using
LoadLeveler. Valid values are any integer greater than or equal to
0. The default is 0 (no retry). The value of this environment
variable can be overridden using the -retry flag.
- MP_RETRYCOUNT
- The number of times (at the interval set by MP_RETRY) that the
Partition Manager should attempt to allocate processor nodes. Valid
values are any integer greater than or equal to 0. If not set, the
default is 0. The value of this environment variable can be overridden
using the -retrycount flag.
- MP_MSG_API
- Indicates to POE which message passing API is being used by the parallel
tasks. You need to set this environment variable if a parallel task is
using LAPI alone or in conjunction with MPI. You do not need to set it
if a parallel task is using MPI only. The value of this environment
variable can be overridden using the -msg_api flag.
- MP_RMPOOL
- Determines the name or number of the pool that should be used for
non-specific node allocation. This environment variable/command-line
flag only applies to LoadLeveler. Valid values are any identifying pool
name or number. There is no default. The value of this
environment variable can be overridden using the -rmpool
flag.
- MP_NODES
- Specifies the number of physical nodes on which to run the parallel
tasks. It may be used alone or in conjunction with
MP_TASKS_PER_NODE and/or MP_PROCS, as described in Table 6. The value of this environment variable can be
overridden using the -nodes flag.
- MP_TASKS_PER_NODE
- Specifies the number of tasks to be run on each of the physical
nodes. It may be used in conjunction with MP_NODES and/or
MP_PROCS, as described in Table 6, but may not be used alone. The value of this
environment variable can be overridden using the -tasks_per_node
flag.
- MP_SAVEHOSTFILE
- The name of an output host list file to be generated by the Partition
Manager. Valid values are any relative or full path name. The
value of this environment variable can be overridden using the
-savehostfile flag.
- MP_TIMEOUT
- Controls the length of time POE waits before abandoning an attempt to
connect to the remote nodes. The default is 150 seconds.
MP_TIMEOUT also changes the length of time the communication
subsystem will wait for a connection to be established during message passing
initialization.
If the SP security method is "dce and compatibility", you may need
to increase the MP_TIMEOUT value to allow POE to wait for the DCE
servers to respond (or to time out if the servers are down).
- MP_CKPTFILE
- Defines the base name of the checkpoint file when checkpointing a
program. See Checkpointing and restarting programs for more information.
- MP_CKPTDIR
- Defines the directory where the checkpoint file will reside when
checkpointing a program. See Checkpointing and restarting programs for more information.
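As a rough illustration of how MP_RETRY and MP_RETRYCOUNT combine, the sketch
below estimates how long the Partition Manager could keep retrying node
allocation. The values 30 and 10 are examples only, and treating the total as
the simple product of the two settings is an assumption of this sketch, not a
documented formula.

```shell
# Example values only: retry every 30 seconds, up to 10 times.
MP_RETRY=30
MP_RETRYCOUNT=10
export MP_RETRY MP_RETRYCOUNT

# Assumption: the longest wait is about interval * count.
max_wait=$(( MP_RETRY * MP_RETRYCOUNT ))
echo "node allocation may be retried for about $max_wait seconds"
```

With both variables left at their default of 0, the product is 0, which
matches the documented default of no retry.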
The following environment variables are associated with Job
Specification.
- MP_CMDFILE
- Determines the name of a POE commands file used to load the nodes of your
partition. If set, POE will read the commands file rather than
STDIN. Valid values are any file specifier. The value of this
environment variable can be overridden using the -cmdfile
flag.
- MP_LLFILE
- Determines the name of a LoadLeveler job command file for node
allocation. If you are performing specific node allocation, you can use
a LoadLeveler job command file in conjunction with a host list file. If
you do, the specific nodes listed in the host list file will be requested from
LoadLeveler. Valid values are any relative or full path name.
The value of this environment variable can be overridden using the
-llfile flag.
- MP_NEWJOB
- Determines whether or not the Partition Manager maintains your partition
for multiple job steps. Valid values are yes or
no. If not set, the default is no. The value
of this environment variable can be overridden using the -newjob
flag.
- MP_PGMMODEL
- Determines the programming model you are using. Valid values are
spmd or mpmd. If not set, the default is
spmd. The value of this environment variable can be overridden
using the -pgmmodel flag.
- MP_SAVE_LLFILE
- When using LoadLeveler for node allocation, the name of the output
LoadLeveler job command file to be generated by the Partition Manager.
The output LoadLeveler job command file will show the LoadLeveler settings
that result from the POE environment variables and/or command-line options for
the current invocation of POE. If you use the MP_SAVE_LLFILE
environment variable for a batch job, or when the MP_LLFILE
environment variable is set (indicating that a LoadLeveler job command file
should participate in node allocation), POE will show a warning and will not
save the output job command file. Valid values are any relative or full
path name. The value of this environment variable can be overridden
using the -save_llfile flag.
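The conditions under which MP_SAVE_LLFILE produces an output file can be
summarized in a small decision function. The function name
save_llfile_outcome, its arguments, and the file names are hypothetical. The
function only restates the rule given above and does not call POE or
LoadLeveler.

```shell
# Hypothetical helper, not part of POE.
# $1 = MP_SAVE_LLFILE value (empty if unset)
# $2 = MP_LLFILE value (empty if unset)
# $3 = "yes" if the job is a batch job, otherwise "no"
save_llfile_outcome() {
    if [ -z "$1" ]; then
        echo "no output file requested"
    elif [ -n "$2" ] || [ "$3" = "yes" ]; then
        echo "warning issued, $1 not saved"
    else
        echo "settings saved to $1"
    fi
}

save_llfile_outcome saved.cmd "" no
save_llfile_outcome saved.cmd input.cmd no
save_llfile_outcome saved.cmd "" yes
```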
The following environment variables are associated with I/O Control.
- MP_LABELIO
- Determines whether or not output from the parallel tasks is labeled by
task id. Valid values are yes or no. If not
set, the default is no. The value of this environment variable
can be overridden using the -labelio flag.
- MP_STDINMODE
- Determines the input mode - how STDIN is managed for the parallel
tasks. Valid values are:
- all
- all tasks receive the same input data from STDIN.
- none
- no tasks receive input data from STDIN; STDIN will be used by the
home node only.
- n
- STDIN is only sent to the task identified (n).
If not set, the default is all. The value of this
environment variable can be overridden using the -stdinmode
flag.
- MP_HOLD_STDIN
- Determines whether or not sending of STDIN from the home node to the
remote nodes is deferred until the message passing partition has been
established. Valid values are yes or no. If
not set, the default is no.
- MP_STDOUTMODE
- Determines the output mode - how STDOUT is handled by the parallel
tasks. Valid values are:
- unordered
- all tasks write output data to STDOUT asynchronously.
- ordered
- output data from each parallel task is written to its own buffer.
Later, all buffers are flushed, in task order, to STDOUT.
- a task id
- only the task indicated writes output data to STDOUT.
If not set, the default is unordered. The value of this
environment variable can be overridden using the -stdoutmode
flag.
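The ordered output mode can be pictured with a small simulation that does not
use POE at all. Each simulated task writes to its own buffer file, and the
buffers are then flushed to STDOUT in task order. The function name
simulate_ordered, the three-task partition, and the message text are
assumptions made for illustration.

```shell
# Simulation only: mimics the buffering that MP_STDOUTMODE=ordered describes.
simulate_ordered() {
    buffers=$(mktemp -d) || return 1

    # Tasks finish in an arbitrary order; each writes to its own buffer.
    for task in 2 0 1; do
        echo "output of task $task" > "$buffers/task.$task"
    done

    # Later, all buffers are flushed to STDOUT in task order.
    for task in 0 1 2; do
        cat "$buffers/task.$task"
    done

    rm -rf "$buffers"
}

simulate_ordered
```

In unordered mode the lines could appear in any order, because every task
writes to STDOUT asynchronously.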
The following environment variables are associated with the generation of
diagnostic information.
- MP_INFOLEVEL
- Determines the level of message reporting. Valid values are:
- 0
- error
- 1
- warning and error
- 2
- informational, warning, and error
- 3
- informational, warning, and error. Also reports diagnostic messages
for use by the IBM Support Center.
- 4, 5, 6
- Informational, warning, and error. Also reports high- and low-level
diagnostic messages for use by the IBM Support Center.
If not set, the default is 1 (warning and error). The value
of this environment variable can be overridden using the -infolevel
or -ilevel flags.
- MP_PMDLOG
- Determines whether or not diagnostic messages should be logged to a file
in /tmp on each of the remote nodes. Typically, this
environment variable/command-line flag is only used under the direction of the
IBM Support Center in resolving a PE-related problem. Valid values are
yes or no. If not set, the default is
no. The value of this environment variable can be overridden
using the -pmdlog flag.
- MP_DEBUG_INITIAL_STOP
- Determines the initial breakpoint in the application where
pdbx will get control. MP_DEBUG_INITIAL_STOP should be
specified as file_name:line_number. The
line_number is the number of the line within the source file
file_name; where file_name has been compiled with
-g. The line number has to be one that defines executable
code. In general, this is a line of code for which the compiler
generates machine level code. Another way to view this is that the line
number is one for which debuggers will accept a breakpoint. Another
valid string for MP_DEBUG_INITIAL_STOP would be the
function_name of the desired initial stopping point in the
debugger. If this variable is not specified, the default is to stop at
the first executable source line in the main routine. This environment
variable has no associated command-line flag.
- MP_PMDSUFFIX
- Determines a string to be appended to the Partition Manager daemon service, or
executable (when using LoadLeveler).
The PMD service in /etc/services is named
pmv3. By setting MP_PMDSUFFIX, you can append a
string to pmv3. If MP_PMDSUFFIX is set to
"abc", for example, the service requested in
/etc/services is pmv3abc.
When using LoadLeveler, the string is appended to the partition manager
daemon executable name, /etc/pmdv3. By setting
MP_PMDSUFFIX, you can append a string to pmdv3.
If MP_PMDSUFFIX is set to "abc", for example, the
partition manager daemon that gets run on each node is
/etc/pmdv3abc.
If the MP_PMD_VERSION environment variable (or
-pmd_version flag) is set to a value of 2, the
string is instead appended to the Partition Manager daemon service name
pmv2 and the executable name pmdv2, respectively.
This permits testing of alternate versions of the Partition Manager
daemon. Typically, this environment variable is used only under the
direction of the IBM Support Center in resolving a PE-related problem.
Valid values are any string. This environment variable has no
associated command-line flag.
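The naming rule for MP_PMDSUFFIX can be sketched as simple string
concatenation. The two function names below are hypothetical and are not POE
commands. They only build the service name and the daemon executable name
from the daemon version and the suffix, as described above.

```shell
# Hypothetical helpers, not part of POE.
# $1 = partition manager daemon version (2 or 3), $2 = MP_PMDSUFFIX value
pmd_service_name()    { echo "pmv$1$2"; }
pmd_executable_name() { echo "pmdv$1$2"; }

MP_PMDSUFFIX=abc
export MP_PMDSUFFIX

# Version 3 daemon (the default when MP_PMD_VERSION is not set)
pmd_service_name 3 "$MP_PMDSUFFIX"
pmd_executable_name 3 "$MP_PMDSUFFIX"

# Version 2 daemon
pmd_service_name 2 "$MP_PMDSUFFIX"
pmd_executable_name 2 "$MP_PMDSUFFIX"
```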
The following environment variables are associated with the Message Passing
Interface.
- MP_BUFFER_MEM
- Changes the maximum size of memory used by the communication subsystem to
buffer early arrivals. The default is 2.8 megabytes for IP and
64 megabytes for US. However, if checkpointing a program, for US the
default will be 2.8 megabytes. If you are using this environment
variable to change the maximum size of memory used by the communication
subsystem while checkpointing a program, please be aware that the amount of
space needed for the checkpointing files will be increased by the value of
MP_BUFFER_MEM.
- MP_CLOCK_SOURCE
- Determines whether or not to use the SP switch clock as a time
source. Valid values are AIX and switch. There is no default value. The value of this
environment variable can be overridden using the -clock_source
flag.
- MP_CSS_INTERRUPT
- Determines whether or not arriving message packets cause
interrupts. This may provide better performance for certain
applications. Valid values are yes and no.
If not set, the default is no.
- MP_EAGER_LIMIT
- Changes the threshold value for message size, above which rendezvous
protocol is used.
- MP_HINTS_FILTERED
- Determines whether MPI info objects reject hints (key/value pairs) which
are not meaningful to the MPI implementation. In filtered mode, an
MPI_INFO_SET call which provides a key/value pair that the
implementation does not understand will behave as a no-op. A subsequent
MPI_INFO_GET call will find that the hint does not exist in the
info object.
In unfiltered mode, any key/value pair is stored and may be
retrieved. Applications which wish to use MPI info objects to cache and
retrieve key/value pairs other than those actually understood by the MPI
implementation must use unfiltered mode. The option has no effect on
the way MPI uses the hints it does understand. In unfiltered mode,
there is no way for a program to discover which hints are valid to MPI and
which are simply being carried as uninterpreted key/value pairs.
Providing an unrecognized hint is not an error in either mode.
Valid values for this environment variable are yes and
no. If set to yes, unrecognized hints are
filtered. If set to no, they are not. If this
environment variable is not set, the default is yes. The
value of this environment variable can be overridden using the
-hints_filtered command-line flag.
- MP_INTRDELAY
- Allows user programs to tune the interrupt delay parameter without having
to recompile existing applications.
- MP_MAX_TYPEDEPTH
- Changes the maximum depth of user-defined message data types.
- MP_IONODEFILE
- The name of a parallel I/O node file -- a text file that lists the
nodes that should be handling parallel I/O. This enables you to limit
the number of nodes that participate in parallel I/O, guarantee that all I/O
operations are performed on the same node, and so on. Valid values are
any relative or full path name. If not specified, all nodes will
participate in parallel I/O operations. The value of this environment
variable can be overridden using the -ionodefile command-line
flag.
- MP_SHARED_MEMORY
- Determines whether or not tasks running on the same node should use shared
memory (instead of the SP switch) for message passing. Valid values are
yes and no. If not set, the default is
no. The value of this environment variable can be overridden
using the -shared_memory flag.
- MP_SYNC_ON_CONNECT
- Determines whether or not the internal synchronization of MPI
initialization is disabled, thereby reducing the amount of network
traffic. Valid values are yes and no. If
not set, the default is yes.
- MP_USE_FLOW_CONTROL
- Applies flow control to MPI programs so that a sender can't outpace a
receiver.
- MP_THREAD_STACKSIZE
- Determines the additional stacksize allocated for user programs executing
on an MPI service thread. If you allocate insufficient space, the
program may encounter a SIGSEGV exception.
- MP_SINGLE_THREAD
- Avoids mutex lock overheads in a single threaded program. This is
an optimization flag, with values of no and yes.
The default value is no, which means multiple user message passing
threads are assumed.
Note: MPI-IO cannot be used if this is set to yes. Results are
undefined if this is set to yes while multiple message passing threads are
in use.
- MP_WAIT_MODE
- Determines how a thread or task behaves when it discovers it is blocked,
waiting for a message to arrive. Values are poll,
yield, sleep, and nopoll. The default mode for the signal handling library is
poll for US, and sleep for IP (except on SMP
nodes). The default for IP on an SMP node is poll if the
number of tasks (for this job) on the node does not exceed the number of
processors on the node. If the number of tasks does exceed the number
of processors, the default is sleep.
- MP_POLLING_INTERVAL
- Defines the polling interval in microseconds. The maximum interval
is approximately 2 billion microseconds (2000 seconds). The default is
180,000 microseconds for IP, and 400,000 microseconds for US.
- MP_RETRANSMIT_INTERVAL
- MP_RETRANSMIT_INTERVAL=nnnnn and its command line
equivalent, -retransmit_interval=nnnnn, control how
often the communication subsystem library checks to see if it should
retransmit packets that have not been acknowledged. The value
nnnnn is the number of polling loops between checks. The
acceptable range is 1000 to 400000. The default is 10,000 for UDP and
400,000 for User Space.
- MP_IO_BUFFER_SIZE
- Indicates the default size of the data buffer used by MPI-IO
agents. For example:
export MP_IO_BUFFER_SIZE=16M
sets the default size of the MPI-IO data buffer to 16MB. The
default value of the environment variable is the number of bytes corresponding
to 16 file blocks. This value depends on the block size associated with
the file system storing the file. Valid values are any positive size up
to 128MB. The size can be expressed as a number of bytes, as a number
of KB (1024 bytes), using the letter k as a suffix, or as a number of
MB (1024 * 1024 bytes), using the letter m as a
suffix.
- MP_IO_ERRLOG
- Indicates whether to turn on error logging for I/O operations. For
example:
export MP_IO_ERRLOG=yes
turns on error logging. When an error occurs, a line of information
will be logged into file
/tmp/mpi_io_errdump.app_name.userid.taskid,
recording the time the error occurs, the POSIX file system call involved, the
file descriptor, and the returned error number.
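The size notation accepted by MP_IO_BUFFER_SIZE (plain bytes, a k suffix for
KB, or an m suffix for MB) can be checked before a job is started. The sketch
below is a hypothetical helper, not POE's own parser. Accepting both
lowercase and uppercase suffixes is an assumption based on the 16M example
above.

```shell
# Hypothetical helper: convert a size such as 4096, 512k, or 16M to bytes.
size_to_bytes() {
    case $1 in
        *[kK]) echo $(( ${1%[kK]} * 1024 )) ;;
        *[mM]) echo $(( ${1%[mM]} * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

# The documented upper limit is 128MB.
max_bytes=$(( 128 * 1024 * 1024 ))

requested=16M
bytes=$(size_to_bytes "$requested")

if [ "$bytes" -gt 0 ] && [ "$bytes" -le "$max_bytes" ]; then
    echo "$requested is $bytes bytes, within the 128MB limit"
else
    echo "$requested is outside the valid range"
fi
```

A check like this can catch a mistyped value, such as 256M, before the
setting is exported.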
The following are corefile generation environment variables:
- MP_COREDIR
- Creates a separate directory for each task's core file.
The value of this environment variable can be overridden using the
-coredir flag. A value of "none" bypasses creation of a new
directory, so core files are written to /tmp.
- MP_COREFILE_FORMAT
- Determines the format of corefiles generated when processes terminate
abnormally. If not set, POE will generate standard AIX
corefiles. If set to the string "STDERR", output will go to
standard error. If set to any other string, POE will generate a
lightweight corefile (conforming to the Parallel Tools Consortium's
Standardized Lightweight Corefile Format) for each process in your
partition. The string you specify is the name you want to assign to
each lightweight corefile. By default, these lightweight corefiles will
be saved to subdirectories prefixed by the string coredir and
suffixed by the task id (as in coredir.0,
coredir.1, and so on). You can specify a prefix other
than the default coredir by setting the MP_COREDIR
environment variable. The value of this environment variable can be
overridden using the -corefile_format flag.
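The directory naming described for corefiles can be sketched as follows. The
function core_dir_for_task is hypothetical and does not create anything. It
only prints where the text above says a task's corefile would be placed for a
given MP_COREDIR value.

```shell
# Hypothetical helper, not part of POE.
# $1 = task id, $2 = MP_COREDIR value (empty if unset)
core_dir_for_task() {
    prefix=${2:-coredir}          # default prefix is the string coredir
    if [ "$prefix" = "none" ]; then
        echo "/tmp"               # no per-task directory is created
    else
        echo "$prefix.$1"         # prefix, then the task id as suffix
    fi
}

core_dir_for_task 0 ""        # MP_COREDIR not set
core_dir_for_task 1 mycores   # MP_COREDIR=mycores
core_dir_for_task 2 none      # MP_COREDIR=none
```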
The following are miscellaneous environment variables:
- MP_EUIDEVELOP
- Determines whether or not the message passing interface performs more
detailed checking during execution. This additional checking is
intended for developing applications, and can significantly slow
performance. Valid values are yes, no, deb
(for "debug"), nor (for "normal"), and min (for
"minimum"). The deb and min values are used to
enable and disable parameter checking. If not set, the default is
no. The value of this environment variable can be overridden
using the -euidevelop flag.
- MP_FENCE
- Determines a fence_string to be used for separating options you
want parsed by POE from those you do not. Valid values are any string,
and there is no default. Once set, you can then use the
fence_string followed by additional_options on the
poe command line. The additional_options will not be
parsed by POE. This environment variable has no associated command-line
flag.
- MP_NOARGLIST
- Determines whether or not POE ignores the argument list. Valid
values are yes and no. If set to yes, POE
will not attempt to remove POE command-line flags before passing the argument
list to the user's program. This environment variable has no
associated command-line flag.
- MP_PMLIGHTS
- Indicates the number of lights displayed per row on the Program Marker
Array.
- MP_PRIORITY
- Determines a dispatch priority adjustment class for execution. See
IBM Parallel Environment for AIX: Installation for more
information on dispatch priority classes. Valid values are any of the
dispatch priority classes set up by the system administrator in the file
/etc/poe.priority. This environment variable has no
associated command-line flag.
- MP_USRPORT
- Indicates the port id used by the Partition Manager to connect to the
Program Marker Array. By default, the Partition Manager connects to the
Array using a socket assigned to port 9999. If you get an error message
indicating that the port is in use, specify a different port. Standard
TCP/IP practice suggests using ports greater than 5000 and less than
10000.
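The effect of MP_FENCE on a command line can be pictured as splitting the
argument list at the first occurrence of the fence string. The function
split_at_fence below is a hypothetical illustration, not POE's parser, and the
fence string Q is an example value. Dropping the fence string itself from the
output is an assumption of this sketch.

```shell
# Hypothetical helper, not part of POE.
# $1 = fence string (the MP_FENCE value), remaining arguments = command line
split_at_fence() {
    fence=$1
    shift
    parsed=""
    passthrough=""
    seen=no
    for arg in "$@"; do
        if [ "$seen" = no ] && [ "$arg" = "$fence" ]; then
            seen=yes                       # first fence found, switch sides
        elif [ "$seen" = no ]; then
            parsed="$parsed $arg"          # POE would parse these
        else
            passthrough="$passthrough $arg"  # POE would leave these alone
        fi
    done
    echo "parsed by POE:$parsed"
    echo "not parsed by POE:$passthrough"
}

# Corresponds to MP_FENCE=Q and: poe sample -procs 4 Q -labelio yes
split_at_fence Q sample -procs 4 Q -labelio yes
```

In this example, -labelio yes follows the fence, so it would reach the program
sample as additional_options instead of being interpreted as a POE flag.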
EXAMPLES
- Assume the MP_PGMMODEL environment variable is set to
spmd, and MP_PROCS is set to 6. To load
and execute the SPMD program sample on the six remote nodes of your
partition, enter:
poe sample
- Assume you have an MPMD application consisting of two programs -
master and workers. These programs are designed to
run together and communicate via calls to message passing subroutines.
The program master is designed to run on one processor node.
The workers program is designed to run as separate tasks on any
number of other nodes. The MP_PGMMODEL environment variable
is set to mpmd, and MP_PROCS is set to 6.
To individually load the six remote nodes with your MPMD application,
enter:
poe
Once the partition is established, the poe command responds with
the prompt:
0:host1_name>
To load the master program as task 0 on host1_name, enter:
master
The poe command responds with a prompt for the next node to
load. When you have loaded the last node of your partition, the
poe command displays the message Partition
loaded... and begins execution.
- Assume you want to run three SPMD programs - setup,
computation, and cleanup - as job steps on the same
partition of nodes. The MP_PGMMODEL environment variable is
set to spmd, and MP_NEWJOB is set to yes.
You enter:
poe
Once the partition is established, the poe command responds with
the prompt:
Enter program name (or quit):
To load the program setup, enter:
setup
The program setup executes on all nodes of your partition. When
execution completes, the poe command again prompts you for a program
name. Enter the program names in turn. To release the partition,
enter:
quit
- To check the process status (using the non-parallel command ps)
for all remote nodes in your partition, enter:
poe ps
FILES
host.list (Default host list file)
RELATED INFORMATION
Commands: mpcc(1), mpcc_r(1), mpCC(1),
mpCC_r(1), mpxlf(1), mpxlf_r(1), pdbx(1),
pmarray(1), xprofiler(1)