Programming on Franklin
1. Overview

Code development on Franklin is typically performed on a Franklin "login node" under a standard SUSE Linux shell environment. See Franklin User Environment for more information about the interactive shell environment. The login nodes run a fully functional instance of SUSE Linux. The compute nodes, where all parallel jobs are executed, run an operating system known as CLE (Cray Linux Environment). CLE is designed to support high performance computing applications without the overhead of a full Linux distribution. CLE supports only a limited number of system calls, and Cray does not support run-time dynamic libraries under CLE.

1.1 Parallel Codes

Most parallel codes running on Franklin are written in Fortran, C, or C++ to run in SPMD (Single Program, Multiple Data) mode, with explicit calls to the MPI libraries to communicate among tasks. Codes are typically compiled and linked on the login nodes. Cray provides a convenient set of commands that should be used in almost all cases for compiling and linking parallel programs: ftn (Fortran), cc (C), and CC (C++).
If you invoke the compilers and linker with these names, you do not need to explicitly link with the MPI libraries or other Cray system software libraries. All the MPI and Cray system include directories are also transparently imported. There are man pages for ftn, cc, and CC on Franklin.

The Cray compiler commands ultimately invoke a third-party compiler suite to build your code. The Portland Group compilers are used by default. Pathscale, GNU, and Cray CCE compilers (including UPC compilers) are also available. To change the underlying compiler, use the module command:
franklin% module swap PrgEnv-pgi PrgEnv-pathscale !For Pathscale
franklin% module swap PrgEnv-pgi PrgEnv-gnu !For GNU
franklin% module swap PrgEnv-pgi PrgEnv-cray !For Cray and UPC
You must use these module commands to change the default base compilers at compile time and at run time in your batch script.

WARNING: Do not set the environment variables MPICH_CC, MPICH_F90, or MPICH_F77. Doing so will put the compilers into an infinite loop.

1.2 Serial Codes

You may need to build small serial codes meant to execute on the login nodes, or serially in your batch script. These executables should also be compiled with the ftn, cc, or CC wrappers. Serial codes that run on the login nodes or serially in your batch script should be short (less than 5 minutes), with a small memory footprint (less than 1 GB). The nodes that run these codes are shared among many users and cannot support production computing runs. Serial codes that require longer times and/or larger memories should be run on the compute nodes; see Running Jobs on Franklin for more information.

2. Simple Examples

Here is a basic example of how to compile a Fortran 90 MPI code into a parallel executable on Franklin.
franklin% ftn -fast -o simple.x simple.f90
This command compiles the source code (which includes MPI calls) and links with system libraries to produce a parallel executable named simple.x, which is ready to run on the Franklin compute nodes. The PGI compiler option -fast enables the basic optimization level recommended by NERSC. (Please see PGI Fortran Compiler on Franklin for more information on compiler options for PGI as well as other compilers.) The analogous examples for C and C++ are:
franklin% cc -fast -o simple.x simple.c
franklin% CC -fast -o simple.x simple.C
3. Compilers

As described in the overview, you should use these commands to build parallel code: ftn, cc, and/or CC. In the default environment, ftn, cc, and CC invoke the Portland Group (PGI) compilers. You can change the underlying compiler as described in the overview above. For more information, explore the following links:
4. MPI

Cray's MPI library is based on MPICH-2 and implements the MPI 2.0 Standard, except for dynamic process spawn functions. MPI is implemented on top of Cray's Portals low-level communications interface. C++ codes must include "mpi.h" before any other include files. For more information:

4.1 MPI Rank Assignments

The distribution of MPI ranks on the nodes can be written to the standard output file by setting the environment variable PMI_DEBUG to 1. Users can control the distribution of MPI tasks on the nodes using the environment variable MPICH_RANK_REORDER_METHOD. See MPI Task Distribution on Nodes and the "intro_mpi" man page for more information.

4.2 Some XT specific tuning for MPI program
5. SHMEM Programming

The Cray SHared, distributed MEMory access (SHMEM) library is a set of logically shared, distributed memory access routines. Cray SHMEM library routines are similar to MPI library routines in that both pass data among a set of parallel processors. SHMEM routines use one-sided put and get communications to remote address spaces. Cray SHMEM is implemented on top of the Portals low-level message passing scheme.

As with MPI, a header file is required:

! For Fortran
include 'mp/shmem.fh'
# For C/C++
#include <

Compiler wrappers will automatically link the SHMEM libraries:

% ftn shmem_program.f
% cc shmem_program.c
% CC shmem_program.C

Please refer to the intro_shmem man page for more information about SHMEM.

5.1 Some XT specific tuning for SHMEM program
6. Executable File Sizes and Compile Times

Consider the following 33 byte Fortran source program:
/scratchdir => cat hello.f
      print *,"Hello!"
      end
When this code is compiled for serial execution on the login nodes
under a standard Linux environment that supports dynamic loading, the
executable size is 2.2 MB using the PGI compilers, and 26.4 KB
using the GNU compiler.
However, when the same source code
is compiled with the cross-compiling wrapper ftn for the microkernel
environment on the compute nodes, where static loading is required,
the executable size is 13.0 MB using the PGI compilers and
11.1 MB using the GNU compilers. Executables for the parallel, compute
node environment are larger because of static linking.
If an attempt is made to statically link together an executable in excess of 2 GB, the linker will produce a truncation error message such as the following:

... : relocation truncated to fit: ...

It is then generally necessary for the user to reduce large static arrays in the code, replacing them with dynamically allocated arrays. This problem is more common with older codes with large static arrays (or Fortran common arrays) which are used in various ways by subroutines as a user-managed dynamic memory area.

Compile times may be significantly longer when cross-compiling for the static linking environment on the compute nodes because of the added I/O time required to make static copies of library routines.

The object mode on Franklin is 64-bit, which means that all executables will run in 64-bit address mode.

6.1 Memory Considerations

Each quad-core node has about 7.38 GB of user-accessible memory. When running in the default quad-core mode with 4 MPI tasks per node, each MPI task will have access to about 1.85 GB of memory. Running with one or two MPI tasks per node will allow each MPI task to use 7.38 or 3.69 GB of user memory, respectively. Memory use by the MPI or SHMEM layer may grow as you move to higher processor counts. See Memory Usage Consideration on Franklin for more details.

6.2 Debugging and Optimization

The basic debugging tool on Franklin is the Distributed Debugging Tool (DDT) from Allinea Software.

The Multi Core Report, jointly produced by Cray, NERSC, and AMD, presents dual-core and quad-core processor architectures, analyzes the impact of multi-core processors on the performance of selected micro and application benchmarks, and discusses compiler options and software optimization techniques. Please also refer to Important Portland Group Compiler Options for basic tuning with compiler option choices.

Here is a collection of papers written by Stephen Whalen from Cray on Optimizing the NPB benchmarks for multi-core AMD Opteron Microprocessors.
Many of the techniques described in these papers could be used in optimizing general applications.
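
As an illustration of the advice in section 6 about replacing large static arrays with dynamically allocated ones, the following is a minimal C sketch, not code taken from any Franklin application. The array size shown in the comment and the function names alloc_work and sum_work are hypothetical, chosen only for this example. The same idea applies to Fortran codes, where a large common-block array would typically become an allocatable array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Before (hypothetical legacy pattern): a fixed-size work array at file
 * scope. Its storage is laid out at link time, so a large enough array
 * can push the executable past the 2 GB limit and trigger the
 * "relocation truncated to fit" error.
 *
 *     static double work[400000000];    // about 3.2 GB of static data
 */

/*
 * After: request the work array at run time instead.
 * Returns NULL if n is zero, if the byte count would overflow,
 * or if the allocation itself fails. The caller must free() the result.
 */
double *alloc_work(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(double)) {
        return NULL;
    }
    /* calloc zero-fills the memory, matching static storage semantics. */
    return calloc(n, sizeof(double));
}

/* Example consumer: sums the first n elements of a work array. */
double sum_work(const double *work, size_t n)
{
    assert(work != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += work[i];
    }
    return total;
}
```

A caller would obtain the array with alloc_work(n), check the result for NULL, pass the pointer to the routines that previously referenced the static array, and release it with free() when done. Because the storage is requested at run time, it no longer contributes to the size of the statically linked image, although it still counts against the per-task memory described in section 6.1.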
7. Further Information
Page last modified: Mon, 11 Jan 2010 21:46:09 GMT
Page URL: http://www.nersc.gov/nusers/resources/franklin/programming/
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov