Using OpenMP on IBM SP

OpenMP supports multi-platform shared-memory parallel programming in C/C++ and Fortran on many architectures. This document describes how to compile and run OpenMP programs on IBM SP systems.
Basic usage

OpenMP provides an easy method for SMP-style parallelization of discrete, small sections of code, such as a do loop. This can be very helpful for code development and testing. However, OpenMP has a number of limitations which make it less desirable than MPI for large scale computations.
OpenMP is available for Fortran, C, and C++.
To compile and run a Fortran code using OpenMP use:

% xlf90_r -qsmp=omp -o exename filename.f
% ./exename

To compile and run a C code using OpenMP use:

% xlc_r -qsmp=omp -o exename filename.c
% ./exename

To compile and run a C++ code using OpenMP use:

% xlC_r -qsmp=omp -o exename filename.C
% ./exename

It should be noted that the -qsmp=omp option is required for both the compile step and the link step. A program built in this way will automatically use a number of threads equal to the number of processors on the node. Here's a small example code that prints out the number of threads created.
! Filename: threads.f
! Compile: xlf90_r -o threads -qsmp=omp threads.f
      PROGRAM HELLO
      IMPLICIT NONE
      INTEGER nthreads, tid, OMP_GET_NUM_THREADS
      INTEGER OMP_GET_THREAD_NUM
! Fork a team of threads
!$OMP PARALLEL PRIVATE(nthreads, tid)
! Obtain and print thread id
      tid = OMP_GET_THREAD_NUM()
      print *, 'Hello World from thread ', tid
! Only master thread does this
      IF (tid .EQ. 0) THEN
        nthreads = OMP_GET_NUM_THREADS()
        print *, 'Number of threads ', nthreads
      END IF
! All threads join master thread and disband
!$OMP END PARALLEL
      END
The same small example code in C is shown below:
/* Filename: threads.c
   Compile: xlc_r -o threads -qsmp=omp threads.c */
#include <stdio.h>   /* printf */
#include "omp.h"
int main ()
{
  int nthreads, tid;
  /* Fork a team of threads */
  #pragma omp parallel private(nthreads, tid)
  {
    /* Obtain and print thread id */
    tid = omp_get_thread_num();
    printf("Hello World from thread %d\n", tid);
    /* Only master thread does this */
    if (tid==0) {
      nthreads = omp_get_num_threads();
      printf("Number of threads %d\n", nthreads);
    }
  }
  return 0;
}
The same small example code in C++ is shown below. Note that printf is used for output rather than the stream cout. This is because printf produces more coherent output for multiple threads; different parts of the cout streams would be mixed in the output from the different parallel threads.
// Filename: threads.C
// Compile: xlC_r -o threads -qsmp=omp threads.C
#include <cstdio>   // printf
#include <omp.h>
int main ()
{
  int nthreads, tid;
  // Fork a team of threads
  #pragma omp parallel private(nthreads, tid)
  {
    // Obtain and print thread id
    tid = omp_get_thread_num();
    printf("Hello World from thread %d\n", tid);
    // Only master thread does this
    if (tid==0) {
      nthreads = omp_get_num_threads();
      printf("Number of threads %d\n", nthreads);
    }
  }
  return 0;
}
Compiling and running on the IBM SP is as follows:

% xlf90_r -o threads -qsmp=omp threads.f
** hello === End of Compilation 1 ===
1501-510 Compilation successful for file threads.f.
% ./threads
Hello World from thread 7
Hello World from thread 0
Number of threads 8
Hello World from thread 3
...

% xlc_r -o threads -qsmp=omp threads.c
% ./threads
Hello World from thread 0
Number of threads 8
Hello World from thread 5
...

% xlC_r -o threads -qsmp=omp threads.C
% ./threads
Hello World from thread 0
Number of threads 8
Hello World from thread 7
...

Note that you do not have to use poe for pure OpenMP codes that are intended to run on a single node.

Changing the number of threads and tasks

You can change the number of threads by setting the OMP_NUM_THREADS environment variable. The default is to use the same number of threads as CPUs available on a node. For example, to create 8 threads on a single node:

% setenv OMP_NUM_THREADS 8
% ./threads
Hello World from thread 0
Number of threads 8
Hello World from thread 1
Hello World from thread 2
Hello World from thread 3
Hello World from thread 4
Hello World from thread 5
Hello World from thread 6
Hello World from thread 7

The same thing may be accomplished by using poe to request one task on a single node. That one task will run OMP_NUM_THREADS threads.

% poe ./threads -nodes 1 -tasks_per_node 1

The environment variable XLSMPOPTS can be used to control the behavior of OpenMP threads (including the number of threads).

Running on more than one node

You can use poe to run on more than a single node. However, the nodes cannot communicate using only OpenMP; see "Mixing OpenMP and MPI" in the next section. Set -nodes to the number of nodes, -tasks_per_node to 1, and OMP_NUM_THREADS to whatever you wish, or use the default.
For example, this will run on 2 nodes with the default number of OMP threads per node:

% unsetenv OMP_NUM_THREADS
% poe ./threads -nodes 2 -tasks_per_node 1

Here is an analogous LoadLeveler script that compiles and runs the Fortran and C examples above, first with the serial compiler invocations and then with the parallel ones:

#@ class = debug
#@ shell = /usr/bin/csh
#@ node = 2
#@ tasks_per_node = 1
#@ network.MPI = csss,not_shared,us
#@ wall_clock_limit = 00:02:00
#@ notification = complete
#@ job_type = parallel
#@ output = $(jobid).$(stepid).out
#@ error = $(jobid).$(stepid).out
#@ environment = COPY_ALL
#@ queue
set echo
xlf90_r -o threads -qsmp=omp threads.f
poe ./threads
xlc_r -o threads -qsmp=omp threads.c
poe ./threads
mpxlf90_r -o threads -qsmp=omp threads.f
./threads
mpcc_r -o threads -qsmp=omp threads.c
./threads
exit

Note that poe is needed in this script when the code was compiled with a "serial" version of the compiler. Without poe the code will not run on more than a single node. However, if a "parallel" version of the compiler, such as mpxlf90_r or mpcc_r, is used to create the executable, then poe does not need to be used on the command line. The use of poe in batch scripts can be confusing, because LoadLeveler keywords will override poe command line options.

Mixing OpenMP and MPI

OpenMP and MPI can be freely mixed in Fortran source code. You must use a "multiprocessor" and "thread-safe" compiler invocation with the -qsmp=omp option, e.g., mpxlf90_r.
Some users have reported cases where this mixed-mode programming strategy improves a code's runtime performance. Here's the same code as above, but with some MPI calls mixed in:
! Filename: hello.f
! Compile: mpxlf90_r -o hello -qsmp=omp hello.f
! Run: poe ./hello -nodes 2 -tasks_per_node 1
      PROGRAM HELLO
      IMPLICIT NONE
      INCLUDE 'mpif.h'
      INTEGER nthreads, tid, OMP_GET_NUM_THREADS
      INTEGER OMP_GET_THREAD_NUM, myid, ierr, nprocs
      CHARACTER*32 buf
      call MPI_INIT( ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, nprocs, ierr )
      print *, "MPI Process number ", myid, " of ", nprocs, " is alive"
! Fork a team of threads on each MPI task
!$OMP PARALLEL PRIVATE(nthreads, tid)
! Obtain and print thread id
      tid = OMP_GET_THREAD_NUM()
! print *, 'Hello World from OMP thread ', tid, 'on process ',myid
! Only master thread does this
      IF (tid==0) THEN
        nthreads = OMP_GET_NUM_THREADS()
        print *, 'Number of OMP threads ', nthreads, 'on process ',myid
      END IF
! All threads join master thread and disband
!$OMP END PARALLEL
      if (myid==0) buf='an MPI message from process 0'
      call MPI_BCAST(buf,32,MPI_CHARACTER,0,MPI_COMM_WORLD,ierr)
      if(myid/=0) print *, 'Process ', myid, "got ", buf
      call MPI_FINALIZE(ierr)
      END
Here is the same OMP/MPI example in C:
/* Filename: hello.c
   Compile: mpcc_r -o hello -qsmp=omp hello.c
   Run: poe ./hello -nodes 2 -tasks_per_node 1 */
#include <stdio.h>    /* printf */
#include <string.h>   /* strcpy */
#include "mpi.h"
#include "omp.h"
int main(int argc, char* argv[])
{
  int nthreads, tid;
  int myid, nprocs;
  char buf[32];
  MPI_Init(&argc, &argv);                  /* start MPI */
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);    /* get my proc id */
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* get number of procs */
  printf("MPI Process number %d of %d is alive\n", myid, nprocs);
  /* Fork a team of threads */
  #pragma omp parallel private(nthreads, tid)
  {
    /* Obtain thread id */
    tid = omp_get_thread_num();
    /* Only master thread does this */
    if (tid==0) {
      nthreads = omp_get_num_threads();
      printf("Number of threads %d on process %d\n",
             nthreads, myid);
    }
  }
  if (myid==0) { strcpy(buf,"an MPI message from process 0"); }
  /* MPI_CHAR is the C datatype for char buffers */
  MPI_Bcast(buf,32,MPI_CHAR,0,MPI_COMM_WORLD);
  if (myid!=0) { printf("Process %d got %s\n", myid, buf); }
  MPI_Finalize();                          /* finish MPI */
  return 0;
}
Finally, here is the same OMP/MPI example in C++:
// Filename: hello.C
// Compile: mpCC_r -o hello -qsmp=omp hello.C
// Run: poe ./hello -nodes 2 -tasks_per_node 1
#include <cstdio>    // printf
#include <cstring>   // strcpy
#include <mpi.h>
#include <omp.h>
int main(int argc, char* argv[])
{
  int nthreads, tid;
  int myid, nprocs;
  char buf[32];
  MPI_Init(&argc, &argv);                  // start MPI
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);    // get my processor id
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  // get number of procs
  printf("MPI Process number %d of %d is alive\n", myid, nprocs);
  // Fork a team of threads
  #pragma omp parallel private(nthreads, tid)
  {
    // Obtain thread id
    tid = omp_get_thread_num();
    // Only master thread does this
    if (tid==0) {
      nthreads = omp_get_num_threads();
      printf("Number of threads %d on process %d\n",
             nthreads, myid);
    }
  }
  if (myid==0) { strcpy(buf,"an MPI message from process 0"); }
  // MPI_CHAR is the C/C++ datatype for char buffers
  MPI_Bcast(buf,32,MPI_CHAR,0,MPI_COMM_WORLD);
  if (myid!=0) { printf("Process %d got %s\n", myid, buf); }
  MPI_Finalize();                          // finish MPI
  return 0;
}
To compile:

% mpxlf90_r -o hello -qsmp=omp hello.f
** hello === End of Compilation 1 ===
1501-510 Compilation successful for file hello.f.

or

% mpcc_r -o hello -qsmp=omp hello.c

or

% mpCC_r -o hello -qsmp=omp hello.C

and to run on two nodes with 1 MPI process per node and the default of 16 OpenMP threads per node:

% poe ./hello -nodes 2 -tasks_per_node 1

Here's a LoadLeveler script to run the code on two nodes with 2 total MPI tasks and 16 OMP threads per node.

#@ class = debug
#@ shell = /usr/bin/csh
#@ node = 2
#@ tasks_per_node = 1
#@ network.MPI = csss,not_shared,us
#@ wall_clock_limit = 00:02:00
#@ notification = complete
#@ job_type = parallel
#@ output = $(jobid).$(stepid).out
#@ error = $(jobid).$(stepid).err
#@ environment = COPY_ALL
#@ queue
./hello
exit

After the job completes this is the standard output file:

MPI process number 1 of 2 is alive
MPI process number 0 of 2 is alive
Number of OMP threads 16 on process 0
Number of OMP threads 16 on process 1
Process 1 got an MPI message from process 0
Page last modified: Mon, 11 Jan 2010 21:25:19 GMT
Page URL: http://www.nersc.gov/nusers/resources/software/ibm/openmp.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov