
Cluster Compatibility Mode

Edison compute nodes run a stripped-down Linux operating system called Compute Node Linux (CNL). Some standard Linux services, such as ssh, rsh, nscd, and ldap, are not supported on the compute nodes. As a result, some user applications cannot run in Edison's native environment. Cluster Compatibility Mode (CCM) is the Cray software solution to this problem: it provides the standard Linux services needed to run most cluster-based independent software vendor (ISV) applications on Cray machines. Under CCM, everything an application "sees" is a standard Linux cluster. CCM is made available on Edison to accommodate workflows that need standard Linux services. Codes such as G09, WIEN2k, MATLAB, and other ISV applications can run under CCM.

Programming

Cray CCM is an execution environment in which you can run ISV applications "out of the box". Applications compiled on any other x86_64 platform can run directly under Edison CCM, provided you supply the needed runtime environment. Users can also compile codes for CCM on Edison. All compilers available in Edison's native programming environment (Extreme Scalability Mode, ESM hereafter) are also available under CCM; the PGI, GNU, Intel, Pathscale, and Cray compilers can all be used for CCM.

Please note that parallel tools and libraries optimized for Edison's native programming environment do not work under CCM. For example, the Cray custom MPICH2 libraries (xt-mpich2 modules) do not work under CCM, and therefore neither do any parallel libraries built on top of them. In addition, the compiler wrappers ftn, cc, and CC should not be used under CCM. As an alternative, we provide the OpenMPI library for CCM. MPI runs over TCP/IP, using the OFED interconnect protocol over the Gemini High Speed Network (HSN). MPI codes must be compiled for CCM by linking to the OpenMPI libraries. You can compile codes either with the native compiler calls, e.g., pgf90, pgcc, pgCC, or with the parallel compiler wrappers provided with OpenMPI. In contrast to normal Edison compilation, executables built for CCM are linked dynamically by default, as you would expect on a generic Linux cluster. Also, since the compiler wrappers from OpenMPI handle only the MPI and compiler runtime libraries, you need to provide the include paths, library paths, and libraries for anything else on the compile/link lines yourself.

To compile codes to run under CCM, you need to load the openmpi-ccm module. Please note that we append "-ccm" to the module name to remind users to use it under CCM only.

module load openmpi-ccm
mpif90 test1.f90
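If you use the native compiler calls instead of the OpenMPI wrappers, you have to point the compiler at the OpenMPI headers and libraries yourself. A minimal sketch for a C code, assuming the OpenMPI installation prefix quoted later on this page (check "module show openmpi-ccm" for the actual paths on your system):

module load openmpi-ccm
pgcc test1.c -I/usr/common/usg/openmpi/default/pgi/include -L/usr/common/usg/openmpi/default/pgi/lib -lmpi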

We have provided the most commonly used libraries for CCM. The ScaLAPACK libraries have been compiled for CCM (module name scalapack_ccm). In addition, all serial and threaded libraries built for Edison ESM should work under CCM (please use them with caution, as we have not confirmed all of them). We have verified that the ACML and FFTW (serial routines) builds for ESM work under CCM. Users can also use the libraries available on Carver and link their codes directly to the Carver libraries.
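For example, here is a hedged sketch of linking a code against the CCM ScaLAPACK build. The scalapack_ccm module name comes from this page, but the variable and library name below are illustrative only; check "module show scalapack_ccm" for the paths and libraries the module actually defines.

module load openmpi-ccm scalapack_ccm
module show scalapack_ccm                             # check which paths/variables the module sets
mpif90 mycode.f90 -L$SCALAPACK_LIB_DIR -lscalapack    # hypothetical variable and library name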

Running Jobs

To access CCM on Edison, you need to submit jobs to the ccm_queue queue. CCM jobs use "MOM" nodes for job launch. Instead of using the ALPS job launcher aprun, CCM jobs are launched from a MOM node onto the compute nodes with the ccmrun command. The ccmrun command places a single instance of the execution command on the head node of the allocated compute nodes (hereafter we call the nodes allocated and configured for a CCM job the CCM nodes); the head node is then responsible for launching the executables on the rest of the CCM nodes (the remote CCM nodes hereafter) through whatever mechanism the execution command uses, e.g., mpirun (via ssh). Another command, ccmlogin, provides interactive access to the head node of the CCM nodes. Both ccmrun and ccmlogin are wrappers around the aprun command. Please refer to the man pages for ccmrun and ccmlogin (you first need to load the ccm module to access these man pages, and you can do so only on the MOM nodes, because the ccm module is available there only).
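For interactive work you can request CCM nodes through the batch system and then use ccmlogin from the MOM node. A sketch, with example resource values:

qsub -I -q ccm_queue -l mppwidth=24,walltime=30:00
# ... once the interactive job starts on a MOM node:
module load ccm
ccmlogin          # opens a shell on the head node of the CCM nodes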

Please note that Cray does not support the native Torque launching mechanism, so we had to build OpenMPI without batch-system awareness (configured with --without-tm). Therefore the OpenMPI job launcher mpirun does not pass the environment on to the remote nodes. There are a few ways to pass the environment to the remote CCM nodes.

  1. Add the environment variables to your shell startup file (.bashrc.ext or .cshrc.ext), and load the modules that define the needed runtime environment variables in the same file as well.
  2. Use the --prefix and -x options on the mpirun command line.
  3. Define the environment variables in the batch job script and save them in the file ~/.ssh/environment, i.e., after defining all the environment variables and loading the modules, do
env > ~/.ssh/environment

We recommend the first method, which appears to be the most reliable.
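A sketch of what the first method can look like in ~/.bashrc.ext, assuming the NERSC_HOST variable commonly used in NERSC dotfiles; adapt the module list and variables to your own application:

if [ "$NERSC_HOST" == "edison" ]; then
    module load openmpi-ccm           # runtime environment for CCM MPI jobs
    export OMP_NUM_THREADS=6          # example of a variable the remote CCM nodes must see
fi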

Sample job scripts to run CCM jobs on Edison

In the following job scripts, we assume you have loaded the openmpi-ccm module in your shell startup file, i.e., that you have the following line in the Edison block of your ~/.bashrc.ext or ~/.cshrc.ext file.

module load openmpi-ccm

Otherwise you need to invoke the mpirun command with the --prefix command line option or with its full path. E.g., you can use "mpirun --prefix /usr/common/usg/openmpi/default/pgi" or "/usr/common/usg/openmpi/default/pgi/bin/mpirun" in place of mpirun in the job scripts below.

To run G09, NAMD replica simulations, or WIEN2k under CCM, please refer to the web page for each application.

A sample job script for running an MPI job:

#!/bin/bash -l
#PBS -N test_ccm
#PBS -q ccm_queue
#PBS -l mppwidth=48,walltime=30:00
#PBS -j oe

cd $PBS_O_WORKDIR
module load ccm
# export CRAY_ROOTFS=DSL (not needed any more since it is now default)
mpicc xthi.c
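# launch 48 MPI tasks across the CCM nodes listed in $PBS_NODEFILE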
ccmrun mpirun -np 48 -hostfile $PBS_NODEFILE ./a.out
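The script can then be submitted to the batch system as usual; e.g., if it is saved as test_ccm.pbs (a file name chosen here just for illustration):

qsub test_ccm.pbs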

A sample job script to run an MPI+OpenMP job under CCM:

#!/bin/bash -l
#PBS -N test_ccm
#PBS -q ccm_queue
#PBS -l mppwidth=48,walltime=30:00
#PBS -j oe

cd $PBS_O_WORKDIR
module load ccm
# export CRAY_ROOTFS=DSL (not needed any more since it is now default)
mpicc -mp xthi.c
export OMP_NUM_THREADS=6

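# 8 MPI tasks x 6 OpenMP threads each = 48 cores; -x exports OMP_NUM_THREADS to the remote CCM nodes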
ccmrun mpirun -np 8 -cpus-per-proc 6 -bind-to-core -hostfile $PBS_NODEFILE -x OMP_NUM_THREADS ./a.out

A sample job script to run multiple serial jobs under CCM:

#!/bin/bash -l
#PBS -q ccm_queue
#PBS -l mppwidth=24
#PBS -l walltime=1:00:00

cd $PBS_O_WORKDIR
module load ccm
# export CRAY_ROOTFS=DSL

ccmrun multiple_serial_jobs.sh

where the script multiple_serial_jobs.sh looks like this:

% cat multiple_serial_jobs.sh
#!/bin/bash
./a1.out &
./a2.out &
# ... a3.out through a23.out launched the same way ...
./a24.out &
wait
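Since ccmrun starts the script directly on the head node, the script should have a shebang line (as above) and be executable; if needed:

chmod +x multiple_serial_jobs.sh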

Known issues with CCM

Please refer to the Hopper CCM website for more information about CCM and how to use it.