IBM Books

MPI Subroutine Reference

MPI_FILE_OPEN, MPI_File_open

Purpose

Opens a file.

C synopsis

#include <mpi.h>
int MPI_File_open (MPI_Comm comm,char *filename,int amode,
		   MPI_Info info, MPI_File *fh);

C++ synopsis

#include <mpi.h>
static MPI::File MPI::File::Open(const MPI::Intracomm& comm, const char* filename, 
			         int amode, const MPI::Info& info);

FORTRAN synopsis

include 'mpif.h' or use mpi
MPI_FILE_OPEN(INTEGER COMM,CHARACTER FILENAME(*),INTEGER AMODE,
    INTEGER INFO,INTEGER FH,INTEGER IERROR)

Parameters

comm
is the communicator (handle) (IN)

filename
is the name of the file to open (string) (IN)

amode
is the file access mode (integer) (IN)

info
is the info object (handle) (IN)

fh
is the new file handle (handle) (OUT)

IERROR
is the FORTRAN return code. It is always the last argument.

Description

MPI_FILE_OPEN opens the file referred to by filename, sets the default view on the file, and sets the access mode amode. MPI_FILE_OPEN returns a file handle fh used for all subsequent operations on the file. The file handle fh remains valid until the file is closed (MPI_FILE_CLOSE). The default view is similar to a linear byte stream in the native representation starting at file offset 0. You can call MPI_FILE_SET_VIEW to set a different view of the file. Though most I/O can be done with the default file view, much of the optimization MPI-IO can provide depends on the effective use of appropriate user-defined file views.

MPI_FILE_OPEN is a collective operation. comm must be a valid intracommunicator. Values specified for amode by all participating tasks must be identical. The program is erroneous when participating tasks do not refer to the same file through their own instances of filename.

The following access modes (specified in amode) are supported:

MPI_MODE_APPEND - set initial position of all file pointers to end of file
MPI_MODE_CREATE - create the file if it does not exist
MPI_MODE_DELETE_ON_CLOSE - delete file on close
MPI_MODE_EXCL - raise an error if the file already exists and MPI_MODE_CREATE is specified
MPI_MODE_RDONLY - read only
MPI_MODE_RDWR - reading and writing
MPI_MODE_SEQUENTIAL - file will only be accessed sequentially
MPI_MODE_UNIQUE_OPEN - file will not be concurrently opened elsewhere
MPI_MODE_WRONLY - write only

MPI_MODE_UNIQUE_OPEN allows PE MPI-IO to use an optimization that is not possible when a file may be shared by other jobs. The optimization is more likely to help with read than with write performance. If it is known that the file will not be shared, MPI_MODE_UNIQUE_OPEN is worth trying.

In C and C++: You can use the bitwise OR operator (|) to combine these integer constants.

In FORTRAN: You can use the bitwise IOR intrinsic to combine these integers. If addition is used instead, each constant must appear only once.

File hints can be associated with a file when it is being opened. MPI_FILE_OPEN ignores a hint whose value is not valid. Any info (key, value) pair the user provides is either accepted or ignored. A hint never causes an error to be returned or a change in semantics.

File Hints

Table 3 lists the supported file hints or info keys. There are restrictions on which file hints can be used simultaneously, and on when and under what circumstances a hint value can be set or used. In general, if a hint is specified in a circumstance where it is not supported, it will be ignored. Use the MPI_FILE_GET_INFO routine to verify the set of hints in effect for a file.

Table 3. Supported file hints

Hint name Description
filename
  • Default value: The file name specified by MPI_FILE_OPEN.
  • Valid values: Not applicable
  • Subroutines you can use to set it: This hint cannot be set with an info object. The hint value is taken from the file name specified by the filename parameter of the MPI_FILE_OPEN subroutine.
  • Value consistency requirement: Not applicable
  • Notes: This hint can only be retrieved when the MPI_FILE_GET_INFO subroutine is called.

file_perm
  • Default value: 644 if specified by MPI_FILE_OPEN with a mode of MPI_MODE_CREATE; otherwise, the value reflects the access permissions associated with the file.
  • Valid values: Octal values 000 through 777
  • Subroutines you can use to set it: MPI_FILE_OPEN
  • Value consistency requirement: Consistent values are required at all participating tasks
  • Notes:

    This hint can be specified in the info object when calling MPI_FILE_OPEN with the mode MPI_MODE_CREATE enabled in order to set the access permissions of the file to be created.

    This hint can also be retrieved when the MPI_FILE_GET_INFO subroutine is called, and its value then represents the access permissions associated with the file.

    The hint value is expressed as a three-digit octal number, similar to the format used by the numeric mode of the chmod shell command. The value is the sum of the following values:

    400  permits read by owner
    200  permits write by owner
    100  permits execute by owner
    040  permits read by group
    020  permits write by group
    010  permits execute by group
    004  permits read by others
    002  permits write by others
    001  permits execute by others
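The sum described above can be sketched in C as follows. The constant names and the helper format_file_perm are hypothetical and exist only for this illustration. The sketch builds the three-digit octal string that would be passed as the file_perm hint value.

```c
#include <stdio.h>

/* Hypothetical names for the octal permission values listed above. */
enum {
    PERM_OWNER_READ  = 0400, PERM_OWNER_WRITE = 0200, PERM_OWNER_EXEC = 0100,
    PERM_GROUP_READ  = 0040, PERM_GROUP_WRITE = 0020, PERM_GROUP_EXEC = 0010,
    PERM_OTHER_READ  = 0004, PERM_OTHER_WRITE = 0002, PERM_OTHER_EXEC = 0001
};

/* Writes mode as a three-digit octal string (for example "644") into buf.
 * Returns 0 on success, or -1 if mode is outside 000 through 777 or buf
 * cannot hold three digits plus the terminating null character. */
int format_file_perm(unsigned int mode, char *buf, size_t buflen)
{
    if (mode > 0777 || buflen < 4) {
        return -1;
    }
    snprintf(buf, buflen, "%03o", mode);
    return 0;
}
```

For example, combining PERM_OWNER_READ, PERM_OWNER_WRITE, PERM_GROUP_READ, and PERM_OTHER_READ yields the string 644, which matches the default value for a newly created file.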

IBM_io_buffer_size
  • Default value: number of bytes corresponding to 16 file blocks
  • Valid values: Any positive value up to 128 MB. The size can be expressed as a number of bytes, as a number of kilobytes (KB) using the suffix K or k, or as a number of megabytes (MB) using the suffix M or m.
  • Subroutines you can use to set it: MPI_FILE_OPEN, or, if there is no pending I/O operation: MPI_FILE_SET_INFO or MPI_FILE_SET_VIEW
  • Value consistency requirement: Consistent values are required at all participating tasks
  • Notes: This hint specifies the size that is used to stripe the file across I/O agents in round-robin style. In general, one I/O agent is associated with each MPI task. However, if the MP_IONODEFILE environment variable or the poe -ionodefile command is used, one I/O agent is associated with each task running on any of the nodes specified in the file referred to by MP_IONODEFILE or -ionodefile.

    PE MPI rounds up the number of bytes specified to an integral number of file blocks. The size of a file block is returned in the st_blksize field of the struct stat argument passed to the stat or fstat routine. For example, suppose IBM_io_buffer_size has a value of 23240 and the file belongs to a GPFS file system with a block size of 16KB. The value is rounded up to two file blocks (32KB), so data access operations to the first 32KB of the file are handled by the first I/O agent, operations to the next 32KB are handled by the second I/O agent, and so on.

    Increasing the IBM_io_buffer_size value can improve performance when using large files, where large refers to hundreds of megabytes, particularly if the program uses collective data access operations.

    This hint only applies when the IBM_largeblock_io hint has a value of false. When IBM_largeblock_io is enabled, data striping across I/O agents is not performed.
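The rounding and round-robin striping described for IBM_io_buffer_size can be sketched with plain integer arithmetic. Both function names are hypothetical, and the offset-to-agent mapping is an assumption based on the round-robin description above, not a statement of how PE MPI implements it internally.

```c
/* Rounds requested (in bytes) up to the next multiple of block_size.
 * Both arguments are assumed to be positive. */
long long round_up_to_blocks(long long requested, long long block_size)
{
    return ((requested + block_size - 1) / block_size) * block_size;
}

/* Returns the zero-based index of the I/O agent assumed to handle the
 * given byte offset when the file is striped round-robin in units of
 * stripe bytes across num_agents agents. */
int agent_for_offset(long long offset, long long stripe, int num_agents)
{
    return (int)((offset / stripe) % num_agents);
}
```

With the values from the example above, round_up_to_blocks(23240, 16384) gives 32768 (32KB). Under the assumed mapping, offsets 0 through 32767 belong to agent 0 and offsets 32768 through 65535 belong to agent 1.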


IBM_largeblock_io
  • Default value: false
  • Valid values: switchable, true, false
  • Subroutines you can use to set it: MPI_FILE_OPEN, or, if there is no pending I/O operation: MPI_FILE_SET_INFO or MPI_FILE_SET_VIEW
  • Value consistency requirement: Consistent values are required at all participating tasks
  • Notes: Examples of applications that should benefit from using this hint are those in which each task accesses a large, contiguous chunk of the file, or in which the file is divided into distinct regions that are accessed by separate tasks. The hint value switchable, which can only be specified when calling MPI_FILE_OPEN, indicates that the hint value can be toggled between true and false until the file is closed. If the hint is specified as switchable on the call to MPI_FILE_OPEN, the hint value is set to false and can be toggled on calls to MPI_FILE_SET_INFO or MPI_FILE_SET_VIEW. If the hint is specified as true or false on the call to MPI_FILE_OPEN, the hint value cannot be changed by either MPI_FILE_SET_INFO or MPI_FILE_SET_VIEW. This hint can only be used if all tasks are being used for I/O: either the MP_IONODEFILE environment variable is not set, or it specifies a file that lists all nodes on which the application is running. For JFS files, this hint can only be set if all tasks are running on the same node.

IBM_sparse_access
  Lets you specify the future file access pattern of the application for the associated file. Specifically, you can specify whether the file access requests from participating tasks are sparse (the value is set to true) or dense (the value is set to false).
  • Default value: false
  • Valid values: true, false
  • Subroutines you can use to set it: MPI_FILE_OPEN, MPI_FILE_SET_INFO, MPI_FILE_SET_VIEW
  • Value consistency requirement: Consistent values are required at all participating tasks
  • Notes: In cases where each single MPI collective read or write operation touches most of the sections in a fairly large region of a file, this hint will not help. In cases where the entire range of each collective read or write is relatively small, or where the range is large and only widely separated bits of the file are touched, it may improve performance. In this context, "section" refers to either the default or explicitly set IBM_io_buffer_size and "large" begins somewhere near (IBM_io_buffer_size*sizeof(MPI_COMM_WORLD)).

Notes

When you open a file, the atomicity is set to false.

If you call MPI_FINALIZE before all files are closed, an error will be raised on MPI_COMM_WORLD.

Parameter consistency checking is only performed if the environment variable MP_EUIDEVELOP is set to yes. If this variable is set and the amodes specified are not identical, the error Inconsistent amodes will be raised on some tasks. Similarly, if this variable is set and the file inodes associated with the file names are not identical, the error Inconsistent file inodes will be raised on some tasks. In either case, the error Consistency error occurred on another task will be raised on the other tasks.

MPI-IO in PE MPI is targeted to the IBM General Parallel File System (GPFS) for production use. File access through MPI-IO normally requires that a single GPFS file system image be available across all tasks of an MPI job. PE MPI with MPI-IO can be used for program development on any other file system that supports a POSIX interface (AFS, DFS(TM), JFS, or NFS) as long as all tasks run on a single node or workstation. This is not expected to be a useful model for production use of MPI-IO. PE MPI can be used without all nodes on a single file system image by using the MP_IONODEFILE environment variable. See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for information about MP_IONODEFILE.

When MPI-IO is used correctly, a file name will refer to the same file system at every task. In one detectable error situation, a file will appear to be on different file system types. For example, a particular file could be visible to some tasks as a GPFS file and to others as NFS-mounted.

The default for MP_CSS_INTERRUPT is no. If you do not override the default, MPI-IO enables interrupts while files are open. If you have forced interrupts to yes or no, MPI-IO does not alter your selection.

MPI-IO depends on hidden threads that use MPI message passing. MPI-IO cannot be used with MP_SINGLE_THREAD set to yes.

For AFS(R), DFS, and NFS, MPI-IO uses file locking for all accesses by default. If other tasks on the same node share the file and also use file locking, file consistency is preserved. If the MPI_FILE_OPEN is done with mode MPI_MODE_UNIQUE_OPEN, file locking is not done.

Because the actual file I/O is carried out by agent threads spread across all tasks of the job, hand-coded "optimizations" based on an assumption that I/O occurs at the task making the MPI-IO call are more likely to do harm than good. If this kind of optimization is done, set the IBM_largeblock_io hint to true. This will shut off the shipping of data to agents and cause file I/O to be done by the calling task.

Errors

Fatal errors:

MPI not initialized

MPI already finalized

Invalid communicator
comm is not a valid communicator.

Can't use an intercommunicator
comm is an intercommunicator.

Conflicting collective operations on communicator

Internal stat failed (MPI_ERR_IO)
An internal stat operation on the file failed.

Returning errors (MPI error class):

Pathname too long (MPI_ERR_BAD_FILE)
The file name must contain fewer than 1024 characters.

Invalid access mode (MPI_ERR_AMODE)
amode is not a valid access mode.

Invalid file system type (MPI_ERR_OTHER)
filename refers to a file belonging to a file system of an unsupported type.

Invalid info (MPI_ERR_INFO)
info is not a valid info object.

Invalid file handle

Locally detected error occurred on another task (MPI_ERR_ARG)
Local parameter check failed on other task(s).

Inconsistent file inodes (MPI_ERR_NOT_SAME)
Local filename corresponds to a file inode that is not consistent with that associated with the filename of other task(s).

Inconsistent file system types (MPI_ERR_NOT_SAME)
Local file system type associated with filename is not identical to that of other task(s).

Inconsistent amodes (MPI_ERR_NOT_SAME)
Local amode is not consistent with the amode of other task(s).

Consistency error occurred on another task (MPI_ERR_ARG)
Consistency check failed on other task(s).

Permission denied (MPI_ERR_ACCESS)
Access to the file was denied.

File already exists (MPI_ERR_FILE_EXISTS)
MPI_MODE_CREATE and MPI_MODE_EXCL are set and the file exists.

File or directory does not exist (MPI_ERR_NO_SUCH_FILE)
The file does not exist and MPI_MODE_CREATE is not set, or a directory in the path does not exist.

Not enough space in file system (MPI_ERR_NO_SPACE)
The directory or the file system is full.

File is a directory (MPI_ERR_BAD_FILE)
The file is a directory.

Read-only file system (MPI_ERR_READ_ONLY)
The file resides in a read-only file system and write access is required.

Internal open failed (MPI_ERR_IO)
An internal open operation on the file failed.

Internal fstat failed (MPI_ERR_IO)
An internal fstat operation on the file failed.

Internal fstatvfs failed (MPI_ERR_IO)
An internal fstatvfs operation on the file failed.

Related information

MPI_FILE_CLOSE
MPI_FILE_SET_VIEW
MPI_FINALIZE

