NERSC: Powering Scientific Discovery Since 1974


MAP from Allinea Software is a parallel profiler with a simple graphical user interface. It is installed on Edison, Cori and Babbage.

Note that the performance of the X Window-based MAP graphical user interface can be greatly improved by using it with the free NX software.


MAP can be run with up to 512 processors to profile serial, OpenMP and MPI codes.

The Allinea MAP web page and the 'Allinea Forge User Guide' (available as $ALLINEA_TOOLS_DOCDIR/userguide-forge.pdf after loading an allineatools module, or on the Allinea Forge User Guide web page) are good resources for learning more about MAP's advanced features.

Loading the Allinea Tools Module

To use MAP, first load the 'allineatools' module to set the correct environment settings:

% module load allineatools

Compiling Code to Run with MAP

To collect performance data, MAP uses two small libraries: the MAP sampler (map-sampler) and the MPI wrapper (map-sampler-pmpi). Your program must be linked with them, which is relatively easy with dynamic linking on Babbage. The rules on link order among object files and these libraries are fairly strict (please read the User Guide for details), but if you follow the instructions printed by the MAP utility scripts, your code will very likely run with MAP.

Your program must be compiled with the -g option to keep debugging symbols, together with optimization flags that you would normally use. If you use the Cray compiler on the Cray machines, we recommend the -G2 option.

Below we show build instructions using a Fortran case, but the C or C++ usage is the same.

On Babbage

The libraries are built and preloaded at runtime when the default dynamic linking is used. So building an executable can be done easily:

% mpiifort -mmic -g -c testMAP.f
% mpiifort -mmic -o testMAP_ex testMAP.o

On Cray Machines

Building an executable for MAP is more complicated on Cray machines. First, you need to explicitly build the MAP sampler and MPI wrapper libraries using 'make-profiler-libraries', and then link your executable against them.

To build a statically linked executable, follow the procedure below. It creates a plain text file, 'allinea-profiler.ld', which contains the suggested options for linking the MAP libraries. To use those options, you only need to add the '-Wl,@/your/directory/allinea-profiler.ld' flag to your link command.

% make-profiler-libraries --lib-type=static
Created the libraries in /your/directory:

To instrument a program, add these compiler options:
   compilation for use with MAP - not required for Performance Reports:
      -g (or '-G2' for native Cray Fortran) (and -O3 etc.)
   linking (both MAP and Performance Reports):
      -Wl,@/your/directory/allinea-profiler.ld ... EXISTING_MPI_LIBRARIES
   If your link line specifies EXISTING_MPI_LIBRARIES (e.g. -lmpi), then
   these must appear *after* the Allinea sampler and MPI wrapper libraries in
   the link line.  There's a comprehensive description of the link ordering
   requirements in the 'Preparing a Program for Profiling' section of either
   userguide-forge.pdf or userguide-reports.pdf, located in

% ftn -g -c testMAP.f        # Use -G2 instead of -g for the Cray compiler
% ftn -o testMAP_ex testMAP.o -Wl,@/your/directory/allinea-profiler.ld

To build a dynamically-linked executable, follow this procedure:

% make-profiler-libraries
Created the libraries in /your/directory:       (and .so.1, .so.1.0, .so.1.0.0)  (and .so.1, .so.1.0, .so.1.0.0)

To instrument a program, add these compiler options:
   compilation for use with MAP - not required for Performance Reports:
      -g (or '-G2' for native Cray Fortran) (and -O3 etc.)
   linking (both MAP and Performance Reports):
      -dynamic -L/your/directory -lmap-sampler-pmpi -lmap-sampler -Wl,--eh-frame-hdr

Note: These libraries must be on the same NFS/Lustre/GPFS filesystem as your program.

Before running your program (interactively or from a queue), set
   export LD_LIBRARY_PATH=/your/directory:$LD_LIBRARY_PATH
   mpirun  ...
or add -Wl,-rpath=/your/directory when linking your program.

% ftn -c -g testMAP.f          # Use -G2 for the Cray compiler
% ftn -dynamic -o testMAP_ex testMAP.o -L/your/directory -lmap-sampler-pmpi -lmap-sampler -Wl,--eh-frame-hdr

Make a note of the LD_LIBRARY_PATH setting printed above, because you will need to apply it before you run MAP.

Remember that you can provide an optional argument to 'make-profiler-libraries' to build the libraries in a directory other than the current working directory.
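
The static and dynamic recipes above differ only in the flags added at link time. As a minimal sketch, the helper below prints the link flags quoted above for either library type. The function name map_link_flags is hypothetical (it is not part of the Allinea tools), and it assumes the libraries were built in the directory given as the second argument. It is written for sh-compatible shells such as bash.

```shell
# Hypothetical helper, not part of the Allinea tools: print the link flags
# quoted above for a given library type and library directory.
map_link_flags() {
    lib_type=$1   # "static" or "dynamic"
    lib_dir=$2    # directory where make-profiler-libraries wrote its output
    case "$lib_type" in
        static)
            # Options are read from the generated allinea-profiler.ld file.
            printf '%s\n' "-Wl,@${lib_dir}/allinea-profiler.ld"
            ;;
        dynamic)
            # MPI wrapper library first, then the sampler library.
            printf '%s\n' "-dynamic -L${lib_dir} -lmap-sampler-pmpi -lmap-sampler -Wl,--eh-frame-hdr"
            ;;
        *)
            echo "usage: map_link_flags static|dynamic DIR" >&2
            return 1
            ;;
    esac
}

map_link_flags static /your/directory
map_link_flags dynamic /your/directory
```

In a build script this could be used as 'ftn -o testMAP_ex testMAP.o $(map_link_flags static /your/directory)', keeping any existing MPI libraries after these flags on the link line.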

Starting a Job with MAP

Running an X Window GUI application can be painfully slow when it is launched from a remote system over the internet. NERSC recommends the free NX software because it can greatly improve the performance of the X Window-based MAP GUI. Another way to cope with the problem is to use Allinea's remote client, which is discussed in the next section.

You must log in with X11 forwarding enabled. One way to ensure this is to use the -XY flags with the ssh command.

% ssh -XY

After loading the allineatools module and compiling with the -g option, request an interactive batch session on Edison, Cori, or Babbage.

% salloc -p debug -N numNodes                            # SLURM scheduler on Edison, Cori or Babbage

Load the 'allineatools' module if you haven't loaded it yet:

% module load allineatools

If you are profiling a dynamically linked executable and you explicitly built the libraries that MAP needs with 'make-profiler-libraries', set LD_LIBRARY_PATH as that command instructed when you ran it:

% setenv LD_LIBRARY_PATH /your/directory:$LD_LIBRARY_PATH     # for csh/tcsh

$ export LD_LIBRARY_PATH=/your/directory:$LD_LIBRARY_PATH     # for bash/sh/ksh
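
If the setting is applied from a login script or repeated across sessions, the same directory can end up listed several times. A minimal sketch for bash/sh/ksh follows; the function name prepend_ld_library_path is hypothetical, and /your/directory stands for wherever the libraries were built. It adds the directory only when it is not already present.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH only if it is
# not already listed, so repeated calls do not grow the variable.
prepend_ld_library_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*)
            ;;  # already present, nothing to do
        *)
            # Append the old value after a colon only if it was non-empty.
            LD_LIBRARY_PATH="${dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
            ;;
    esac
    export LD_LIBRARY_PATH
}

prepend_ld_library_path /your/directory
echo "$LD_LIBRARY_PATH"
```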

Then, run the map command followed by the name of the executable to profile:

% map ./testMAP_ex     # or 'map -n ... ./testMAP_ex', 'map -np ... ./testMAP_ex'

or, starting from version 5.1,

% forge ./testMAP_ex

or 'allinea-forge' with versions 5.0.x.

The Allinea Forge GUI will pop up with a start up menu. For profiling choose the option PROFILE with the 'allinea MAP' tool.  You can also choose to LOAD PROFILE DATA FILE to view profiling results saved in a file created in a previous MAP run.

Next, a submission window will appear with a prefilled path to the executable to run. Select the number of processors on which to run and press Run. To pass command-line arguments to a program, enter them in the aprun arguments box.

Running MAP on Babbage requires some additional steps. Let's assume that two MIC cards, bc1012-mic0 and bc1012-mic1, are assigned to your batch job:

% get_micfile                            # to get hostfile for MIC cards
% cat micfile.$SLURM_JOB_ID # hostfile generated by get_micfile

Load the allineatools module and start map as before. To run on more than one MIC card, one way that MAP works is to run the executable in MPMD (Multiple Program Multiple Data) mode, as though you were running separate executables, one on each MIC card. The following shows how to specify this in the Run window. Please note that the total number of processes (8 in this example) must equal the sum of the MPI tasks over all cards.

[Screenshot: Run window settings for an MPMD run]

On Babbage you may see the following error message:

Other: ERROR: object '' from LD_PRELOAD cannot be preloaded: ignored.
Other: ERROR: object '/global/homes/w/wyang/.allinea/wrapper/' from LD_PRELOAD cannot be preloaded: ignored.

Allinea suggests ignoring this message for now.

MAP will start your program and collect performance data from all processes.

By default, MAP lets your program run to completion and will display data for the entire run.  You can also use the 'Stop and Analyze' button and the menu beneath it to control how long to profile your program.

Remote Client

Allinea provides remote clients for Windows, OS X and Linux that run on your local desktop and connect via SSH to NERSC systems, so that you can debug, profile, edit and compile files directly on the remote NERSC machine. You can download the clients from Allinea and install them on your laptop or desktop. Please note that the client version must match the Allinea version that you are going to use on the NERSC machines.

To configure the client for NERSC systems, follow steps similar to those shown on the DDT web page. If you have already configured the client for using DDT on a NERSC machine, the same configuration will be used for running MAP.

You can start MAP similarly. Select a NERSC machine for the 'allinea MAP' tool and log in to the machine.

Click the 'PROFILE' button for the 'allinea MAP' tool. Set the run parameters and click 'Submit'. 

Profiling Results

After completing the run, MAP displays the collected performance data in the GUI.

The window is made up of several sections, each presenting the collected performance data from a different viewpoint.

Metrics View

The top section shows the "Metrics view," displaying a timeline of a few selected performance metrics. By default it shows 'Main thread activity', 'CPU floating-point (%)' for the percentage of time each rank spends in floating-point CPU instructions, and 'Memory usage (MB)' for each task's memory usage.

Each vertical slice shows the distribution of values across (MPI) tasks at that moment. The minimum, maximum and mean are displayed, and shading gives you an idea of how the data is clustered. A region of large load imbalance can be visually identified as a fat shaded region.

You can add more metrics (such as 'CPU floating-point' (instructions), 'CPU fp vector' (instructions), 'CPU time', 'Kernel-mode CPU time', 'MPI call duration', 'MPI point-to-point', etc.) to the view area by clicking the 'Metrics' button at the bottom and then adding the ones from the list that interest you. The metrics are available under metric menu groups: 'Activity Timelines', 'CPU Instructions', 'CPU Time', 'IO', 'Memory', and 'MPI'.

Source Code View

The center pane shows the source code, annotated with performance information to the left of each line. It shows how much of the total time was spent on computation (dark green), communication (blue) and I/O (orange) on that line. In an OpenMP parallel region, light green is used for multi-threaded computation time and dark grey is used for thread idle time. This coloring scheme applies to the other areas, too. Only lines that account for at least 0.1% of the total time get charts.

Stacks View

The "Parallel Stacks View" area (shown when selecting the 'Main Thread Stacks' tab in the bottom pane) lists the lines where the most time was spent, sorted by wallclock time. Clicking on any line jumps to that position in the source code pane.

Functions View

The "Functions View", which is displayed when selecting the 'Functions' tab, shows a flat profile of the functions in your program. This is what you would see with a typical profiler tool. The value in the 'Self' column is the time spent in the function itself (the so-called "exclusive" time), the value in the 'Total' column is the time spent in the function itself and all its callees (the so-called "inclusive" time), and the value in the 'Child' column is the time spent in the callees only.
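
The three columns are related by Total = Self + Child. As a small worked example, the sketch below applies that relation to made-up numbers; the function names and percentages are hypothetical and are not taken from a real MAP run. Here main calls jacobi, which in turn calls exchange.

```shell
# Hypothetical flat-profile rows: function name, Self (%), Child (%).
profile_rows='main 8.0 92.0
jacobi 61.5 30.5
exchange 30.5 0.0'

# Total (inclusive) = Self (exclusive) + Child (callees only).
totals=$(printf '%s\n' "$profile_rows" | awk '{ printf "%s %.1f\n", $1, $2 + $3 }')
printf '%s\n' "$totals"
# prints:
# main 100.0
# jacobi 92.0
# exchange 30.5
```

Note that the Child value of each caller equals the Total of its callee in this example, and the Self values add up to 100%.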

Project Files View

The "Project Files View" area (shown when selecting the 'Project Files' tab) offers a way to browse and navigate through the code. You can view functions arranged under source files. 'External Code' typically refers to system libraries.

When you hover your mouse over the metrics view area, a thin hairline will appear and distribution information for the selected performance metric (i.e. the metric window where your mouse cursor is located) will be displayed at the bottom of the metrics view area. Similar hairlines will appear in the source code pane and the bottom pane, and they move in sync with the top hairline.

You can also select a region of interest along the horizontal axis (wallclock time) by pressing the left mouse button, dragging the mouse and then releasing the button. The selected region will appear highlighted. The contents of the center and bottom panes will be adjusted to the selection.

MAP saves profiling results in a file, '', where '#' is the process count and 'yyyy-mm-dd_HH-MM' is the time stamp.

% ls -l
-rw-------  1 wyang wyang   273822 Apr  4 17:16

You can keep this file and open it with MAP later to examine the profiling results:

% map

Running in Command Line Mode

MAP can be run from the command line without the GUI by using the '--profile' option. You can submit a batch job as follows:

% cat runit
#!/bin/bash
#SBATCH -p debug
#SBATCH -t 10:00

module load allineatools
map --profile --np=24 ./jacobi_mpi

% sbatch runit
Submitted batch job 1054621

% cat slurm-1054621.out
Allinea Forge 6.0.1-46365 - Allinea MAP
Profiling             : /global/cscratch1/sd/wyang/debugging/jacobi_mpi 
Allinea sampler       : statically linked
MPI implementation    : Auto-Detect (Cray X-Series (MPI/shmem/CAF))
* number of processes : 24
* Allinea MPI wrapper : statically linked

MPI enabled           : Yes
* MPI implementation  : SLURM (MPMD)
* number of processes : 24
* number of nodes     : 1
* Allinea MPI wrapper : statically linked

MAP analysing program...
MAP gathering samples...
MAP generated /global/cscratch1/sd/wyang/debugging/
 1 38.97168
...
 20 4.573649
...

% ls -l
...
-rw------- 1 wyang wyang 146101 Feb 1 12:21
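
When the same profiling job is submitted for several process counts, the batch script can be generated rather than edited by hand. The sketch below is one way to do that. The function name make_map_job is hypothetical, the partition and time limit are simply copied from the 'runit' example above, and the '#!/bin/bash' first line is an assumption about the local setup. For larger process counts you would also need to request enough nodes, which this sketch does not do.

```shell
# Hypothetical helper: print a batch script like 'runit' above for a given
# process count and executable.
make_map_job() {
    nprocs=$1   # number of MPI processes passed to --np
    exe=$2      # path to the executable to profile
    cat <<EOF
#!/bin/bash
#SBATCH -p debug
#SBATCH -t 10:00

module load allineatools
map --profile --np=${nprocs} ${exe}
EOF
}

make_map_job 24 ./jacobi_mpi
```

For example, 'make_map_job 24 ./jacobi_mpi > runit' followed by 'sbatch runit' would submit a job like the one shown above.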

Troubleshooting

If you are having trouble launching MAP, try these steps.

Make sure you have the most recent version of the system.config configuration file. The first time you run MAP, you pick up a master template, which is then stored locally in your home directory as ~/.allinea/${NERSC_HOST}/system.config, where ${NERSC_HOST} is the machine name: edison, cori, or babbage. If you are having problems launching MAP, you could be using an older version of the system.config file, and you may want to remove the entire directory:

% rm -rf ~/.allinea/${NERSC_HOST}  

Remove any stale temporary files that may have been left by MAP.

% rm -rf $TMPDIR/allinea-$USER 

In case of a font problem where every character is displayed as a square, please delete the .fontconfig directory in your home directory and restart MAP.

% rm -rf ~/.fontconfig

Make sure you are requesting an interactive batch session on Edison, Cori or Babbage. NERSC has configured MAP to run from interactive batch jobs.

% salloc -p debug -N numNodes                           # on Edison, Cori or Babbage

Finally, make sure you have compiled your code with -g. If none of these tips helps, please contact the consultants at

Installed Versions

Package        Platform  Category                Version      Module                    Install Date  Date Made Default
Allinea tools  babbage   applications/debugging  5.1-43967    allineatools/5.1-43967    2015-10-19    2015-10-19
Allinea tools  babbage   applications/debugging  6.0.1        allineatools/6.0.1        2016-02-01    2016-02-01
Allinea tools  cori      applications/debugging  5.1-43967    allineatools/5.1-43967    2015-10-15    2015-10-15
Allinea tools  cori      applications/debugging  6.0          allineatools/6.0          2015-12-21    2016-02-01
Allinea tools  cori      applications/debugging  6.0.1-46365  allineatools/6.0.1-46365  2016-02-01    2016-02-01
Allinea tools  edison    applications/debugging  5.1-43967    allineatools/5.1-43967    2015-10-19    2015-10-19
Allinea tools  edison    applications/debugging  6.0.1-46365  allineatools/6.0.1-46365  2016-02-05    2016-02-05