What is OpenMP?
OpenMP is an industry-standard API for shared-memory parallel programming in C/C++ and Fortran. The OpenMP Architecture Review Board (ARB), which maintains the standard, consists of the major compiler vendors and many research institutions. Common parallel architectures include:
- Shared memory architecture: multiple CPUs share a global memory with uniform memory access (UMA); typical shared memory programming models are OpenMP and Pthreads.
- Distributed memory architecture: each CPU has its own memory, with non-uniform memory access (NUMA); the typical programming model is message passing with MPI.
- Hybrid architecture: UMA within one node or socket, NUMA across nodes or sockets; the typical programming model is hybrid MPI/OpenMP.
The current architecture trend calls for a hybrid programming model with three levels of parallelism: MPI between nodes or sockets, shared memory programming (such as OpenMP) within a node or socket, and increased vectorization for lower-level loop structures.
OpenMP has three components: compiler directives and clauses, runtime library routines, and environment variables. The compiler directives are interpreted only when the OpenMP compiler option is turned on. OpenMP uses the "fork and join" execution model: the master thread forks new threads at the beginning of a parallel region; the threads share work in parallel; and the threads join at the end of the parallel region.
In OpenMP, all threads have access to the same shared global memory, and each thread also has access to its own private local memory. Threads synchronize implicitly by reading and writing shared variables; no explicit communication between threads is needed.
Major features in OpenMP 3.1 include:
- Thread creation with shared and private memory
- Loop parallelism and work sharing constructs
- Dynamic work scheduling
- Explicit and implicit synchronizations
- Simple reductions
- Nested parallelism
- OpenMP tasking
New features in OpenMP 4.0 (released in July 2013) include:
- Device constructs for accelerators
- SIMD constructs for vectorization
- Task groups and dependencies
- Thread affinity control
- User defined reductions
- Cancellation construct
- Initial support for Fortran 2003
- OMP_DISPLAY_ENV for all internal variables
New features in OpenMP 4.5 (released in November 2015) include:
- Significantly improved support for devices
- Support for doacross loops
- New taskloop construct
- Reductions for C/C++ arrays
- New hint mechanisms
- Thread affinity support
- Improved support for Fortran 2003
- SIMD extensions
OpenMP 4.0/4.5 Support in Compilers
- GNU compiler
- From gcc/4.9.0, OpenMP 4.0 for C/C++
- From gcc/4.9.1, OpenMP 4.0 for Fortran
- From gcc/6.0, most OpenMP 4.5 features
- From gcc/6.1, full OpenMP 4.5 for C/C++ (not Fortran)
- Intel compiler
- From intel/15.0, most OpenMP 4.0 features
- From intel/16.0, full OpenMP 4.0
- From intel/16.0 Update 2, some OpenMP 4.5 SIMD features
- Cray compiler
- From cce/8.4.0, full OpenMP 4.0
For more information, see the compiler support page on the official OpenMP web site.
Relevant NERSC Trainings on OpenMP:
- Advanced OpenMP and CESM Case Study by Helen He, March 2016.
- Advanced OpenMP Training. Dr. Michael Klemm, Intel; Dr. Bronis R. de Supinski, LLNL, February 2016.
- Nested OpenMP by Helen He, October 2015.
- Tutorial: Getting Up to Speed on OpenMP 4.0. Ruud van der Pas, August 2015.
- OpenMP Basics and MPI/OpenMP Scaling. Helen He. LBNL Computational Sciences Postdocs Training, Mar 2015.
- Intel OpenMP Training at NERSC (part 1, part 2, part 3, part 4). Jeongnim Kim, Intel. March 2015.
- Explore MPI/OpenMP Scaling on NERSC Systems. Helen He, NERSC Training, October 2014.
- OpenMP and Vectorization Training. Jack Deslippe, Helen He, Harvey Wasserman, Woo-Sun Yang, October 2014.
- Hybrid MPI/OpenMP Programming. Helen He, NERSC User Group Training, Feb 2013.
- Introduction to OpenMP. Matt Cordery, NERSC User Group Training, Feb 2013.
Below is a collection of useful OpenMP resources and tutorials:
- Official OpenMP Web Site: OpenMP standards, API specifications, tutorials, forums, and much more.
- OpenMP Affinity on KNL presented by Kent Milfield at IXPUG-ISC16 Workshop, June 2016.
- ANL Training Program on Exascale Computing, August 2015
- A "Hands On" Introduction to OpenMP: Part 1 | Bronis de Supinski, LLNL; Tim Mattson, Intel
- A "Hands On" Introduction to OpenMP: Part 2 | Bronis de Supinski, LLNL; Tim Mattson, Intel
- A "Hands On" Introduction to OpenMP: Part 3 | Bronis de Supinski, LLNL; Tim Mattson, Intel
- UC Berkeley ParLab Boot Camp, 2014
- Tim Mattson's (Intel) "Introduction to OpenMP" (2013) on YouTube: 27 video segments, 4 hrs total. slides, exercises.
- SC13 Tutorial: Hybrid MPI and OpenMP Parallel Programming.
- LLNL OpenMP Tutorial. Blaise Barney, LLNL
- ANL Training Program on Exascale Computing, 2013
- Using OpenMP for Intranode Parallelism: Tutorial Overview. Bronis de Supinski, LLNL; Paul Petersen, Intel.
- Using OpenMP for Intranode Parallelism: Useful Information. Bronis de Supinski, LLNL; Paul Petersen, Intel.
- Using OpenMP for Intranode Parallelism. OpenMP 4.0 and the Future of OpenMP. Bronis de Supinski, LLNL.
- UC Berkeley ParLab Boot Camp, 2013
- Shared Memory Programming with OpenMP - Basics, Tim Mattson, Intel. Video.
- More about OpenMP - New Feature. Tim Mattson, Intel. Video (start from 1:25:20).
- Some tools for tuning OpenMP codes.
- Other tools useful for OpenMP codes are listed on the Performance and Debugging Tools page.