NERSC: Powering Scientific Discovery Since 1974

DDT

Distributed Debugging Tool (DDT) from Allinea Software is a parallel debugger installed on Hopper, Carver and Euclid.

The performance of the X Windows-based DDT Graphical User Interface can be greatly improved if used in conjunction with the free NX software.

Introduction

DDT is a parallel debugger that can be run on up to 8192 processors. It can be used to debug serial, OpenMP, MPI, Coarray Fortran (CAF), and UPC (Unified Parallel C) codes. It also supports GPU debugging, but NERSC does not currently have a license for this on Dirac.

TotalView users will find that DDT has very similar functionality and an intuitive user interface. All of the primary parallel debugging features of TotalView are available in DDT.

The Allinea DDT web page and the 'DDT User Guide' (available as $DDT_DOCDIR/userguide.pdf after loading a ddt module) are good resources for learning more about the advanced DDT features.

Loading the DDT Module

To use DDT at NERSC, first load the DDT module to set up the correct environment:

% module load ddt
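
As a quick sanity check after loading the module, the sketch below reports whether a ddt executable is visible. It assumes a Bourne-compatible shell such as bash (the % prompts on this page suggest csh/tcsh for interactive use), and it assumes that the module places ddt on PATH. The name check_ddt_env is a hypothetical helper, not part of DDT.

```shell
#!/bin/bash
# Sketch: report whether the DDT environment looks loaded.
# Assumptions: the ddt module puts a `ddt` executable on PATH (assumed) and
# sets DDT_DOCDIR (mentioned in the Introduction of this page).
check_ddt_env() {
    if ! command -v ddt >/dev/null 2>&1; then
        echo "ddt not found on PATH; try: module load ddt"
        return 1
    fi
    if [ -z "${DDT_DOCDIR:-}" ]; then
        echo "ddt found, but DDT_DOCDIR is not set"
        return 0
    fi
    echo "ddt found; user guide: $DDT_DOCDIR/userguide.pdf"
}
```

Save the function in a file, source it from a bash session, and call check_ddt_env after running module load ddt.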

 

Compiling Code to Run with DDT

In order to use DDT, code must be compiled with the -g option. We also recommend compiling without optimization; avoid flags such as -fast.

A Fortran example:

% ftn -g -o testDDT_ex testDDT.f        # on Hopper
% mpif90 -g -o testDDT_ex testDDT.f # on Carver or Euclid

A C example:

% cc -g -o testDDT_ex testDDT.c         # on Hopper
% mpicc -g -o testDDT_ex testDDT.c      # on Carver or Euclid
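
The host-specific compiler choice above can be captured in a small helper. This is only a sketch: it assumes a bash-compatible shell, assumes NERSC_HOST is set to hopper, carver or euclid (the variable is described in the troubleshooting section), and assumes Fortran sources end in .f or .f90. The function name ddt_compile_cmd is hypothetical. It prints the command rather than running it, so the result can be inspected first.

```shell
#!/bin/bash
# Sketch: print the debug-build command for a source file on the current host.
# Assumptions: NERSC_HOST is hopper, carver or euclid; the file extension
# identifies the language (.f/.f90 for Fortran, .c for C).
ddt_compile_cmd() {
    ddt_src=$1
    ddt_exe=${2:-testDDT_ex}          # default executable name from the examples above
    case "$ddt_src" in
        *.f|*.f90) ddt_lang=fortran ;;
        *.c)       ddt_lang=c ;;
        *) echo "unsupported source file: $ddt_src" >&2; return 1 ;;
    esac
    case "${NERSC_HOST:-}:$ddt_lang" in
        hopper:fortran)                ddt_compiler=ftn ;;
        hopper:c)                      ddt_compiler=cc ;;
        carver:fortran|euclid:fortran) ddt_compiler=mpif90 ;;
        carver:c|euclid:c)             ddt_compiler=mpicc ;;
        *) echo "unknown NERSC_HOST: ${NERSC_HOST:-unset}" >&2; return 1 ;;
    esac
    # -g is the only flag required by DDT; no optimization flags are added.
    echo "$ddt_compiler -g -o $ddt_exe $ddt_src"
}
```

For example, on Hopper, ddt_compile_cmd testDDT.c prints the same cc command shown in the C example above.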

Starting a Job with DDT

Be sure to log in with X11 forwarding enabled by passing the -X or -Y option to ssh. The -Y option often works better on Mac OS X.

% ssh -X username@hopper.nersc.gov

After loading the DDT module and compiling with the -g option, request an interactive session on Hopper or Carver. On Euclid, you run DDT interactively, without going through a batch session.

% qsub -I -V -q debug -lmppwidth=numCores                      # Hopper
% qsub -I -V -q debug -lnodes=numNodes:ppn=numTasksPerNode     # on Carver
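
The two request forms above differ only in how the job size is expressed. A minimal sketch of a helper that prints the right form for the current host follows. It assumes a bash-compatible shell and a NERSC_HOST value of hopper, carver or euclid. The 8192 ceiling is the DDT processor limit quoted in the Introduction, applied here to the total task count as an assumption; queue limits may well be lower. The names is_pos_int and ddt_qsub_cmd are hypothetical.

```shell
#!/bin/bash
# Sketch: print the interactive-session request for the current host.
# Nothing is submitted; the command is only echoed.

# True if $1 is a positive integer without leading zeros.
is_pos_int() {
    case "$1" in
        ''|*[!0-9]*|0*) return 1 ;;
        *) return 0 ;;
    esac
}

ddt_qsub_cmd() {
    case "${NERSC_HOST:-}" in
        hopper)   # $1 = number of cores
            is_pos_int "${1:-}" || { echo "usage: ddt_qsub_cmd numCores" >&2; return 1; }
            ddt_total=$1
            ddt_cmd="qsub -I -V -q debug -lmppwidth=$1" ;;
        carver)   # $1 = number of nodes, $2 = tasks per node
            if ! is_pos_int "${1:-}" || ! is_pos_int "${2:-}"; then
                echo "usage: ddt_qsub_cmd numNodes numTasksPerNode" >&2
                return 1
            fi
            ddt_total=$(( $1 * $2 ))
            ddt_cmd="qsub -I -V -q debug -lnodes=$1:ppn=$2" ;;
        euclid)   # no batch session is needed on Euclid
            echo "euclid: run ddt directly, no qsub needed" >&2
            return 0 ;;
        *)
            echo "unknown NERSC_HOST: ${NERSC_HOST:-unset}" >&2
            return 1 ;;
    esac
    # Assumed check: refuse sizes above the DDT limit quoted in the Introduction.
    if [ "$ddt_total" -gt 8192 ]; then
        echo "requested $ddt_total tasks; DDT supports up to 8192" >&2
        return 1
    fi
    echo "$ddt_cmd"
}
```

On Carver, for instance, ddt_qsub_cmd 2 8 prints the request for two nodes with eight tasks per node.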

Then launch the debugger with the ddt command followed by the name of the executable to debug:

% ddt testDDT_ex

The DDT GUI will pop up and ask "What would you like to do?" For basic debugging, choose the option "Run and Debug a Program". You can also choose "Open a Core File". "Attach to a Running Program" is not yet available.

DDT Submit Window

A submission window will then appear with the path to the executable to debug. Select the number of processors on which to run and press Run. To pass command line arguments to the program, enter them in the Arguments box.

DDT Submit Window

Troubleshooting

If you are having trouble launching DDT, try these steps.

Make sure you have the most recent version of the config.ddt configuration file. The first time you run DDT, you pick up a master template, which is then stored locally in your home directory as ~/.ddt/config_${NERSC_HOST}.ddt, where ${NERSC_HOST} is the machine name: hopper, carver or euclid. If you are having problems launching DDT, you could be using an older version of the config.ddt file. Remove the directory so that a fresh copy of the template is picked up the next time DDT starts:

% rm -rf ~/.ddt  

Remove any stale temporary files that may have been left behind by earlier DDT processes.

% rm -rf $TMPDIR/.ddt-$USER 

If you see a font problem where every character is displayed as a square, delete the .fontconfig directory in your home directory and restart DDT.

% rm -rf ~/.fontconfig

Make sure you are requesting an interactive batch session on Hopper and Carver. NERSC has configured DDT to run from interactive batch jobs.

% qsub -I -V -q debug -lmppwidth=numCores                      # Hopper
% qsub -I -V -q debug -lnodes=numNodes:ppn=numTasksPerNode     # on Carver

Finally, make sure you have compiled your code with -g. Many users who have trouble running parallel debuggers have forgotten to compile their codes with debugging flags turned on. If none of these tips help, please contact the consultants at consult@nersc.gov.
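
Because the cleanup commands above use rm -rf, it can be worth listing what exists before deleting anything. The sketch below prints whichever of the three directories named in the steps above are currently present and removes nothing. It assumes a bash-compatible shell; the fallback to /tmp when TMPDIR is unset is an assumption, and ddt_stale_paths is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: list the DDT-related directories from the troubleshooting steps
# that currently exist. This is a dry run; nothing is deleted.
# Assumption: /tmp is used when TMPDIR is unset.
ddt_stale_paths() {
    ddt_user=${USER:-$(id -un)}
    for ddt_path in \
        "$HOME/.ddt" \
        "${TMPDIR:-/tmp}/.ddt-$ddt_user" \
        "$HOME/.fontconfig"
    do
        if [ -e "$ddt_path" ]; then
            echo "$ddt_path"
        fi
    done
    return 0
}
```

Review the printed paths, then remove only the ones that match the problem you are seeing.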

 

Basic Debugging Functionality

The DDT GUI should be intuitive to anyone who has used a parallel debugger such as TotalView before. Users can set breakpoints, step through code, set watches, examine and change variables, dive into arrays, dereference pointers, view variables across processors, step through processors, etc. Please see the DDT User Guide if you have trouble with any of these basic features.

Useful DDT Features

Process Groups

With DDT, the user can easily change the debugger to focus on a single process or a group of processes. If Focus on current Processor is chosen, then stepping through the code, setting a breakpoint, etc. applies only to the current processor. If Focus on current Group is chosen, then the entire group of processors advances when stepping forward in a program, and a breakpoint is set for all processors in the group.

Similarly, when Focus on current Thread is chosen, all actions apply to a single OpenMP thread. DDT does not allow you to create a thread group. However, you can check the Step Threads Together box to make all threads move together inside a parallel region. In the image shown above, this box is grayed out simply because the code is not an OpenMP code.

A user can create new sub-groups of processors in several ways. One way is to click on the Create Group button at the bottom of the Process Group Window. Another way is to right-click in the Process Group Window to create a group and then drag the desired processors to the group. Groups can also be created more efficiently using sub-groups from the Parallel Stack View described below. The image below shows 3 different groups of processors: the default All group, a Master group with only a single master processor, and a Workers group with the remaining processors.

Parallel Stack View

A feature that should help users debug at high concurrencies is DDT's Parallel Stack View window, which shows the position of all processors in a code at the same time from the main window. A program is displayed as a branching tree with the number and location of each processor at each point. Instead of clicking through windows to determine where each processor has stopped, users get a quick overview that makes stray processes easy to identify. Users can also create sub-groups of processors from a branch of the tree by right-clicking on the branch. A new group will appear in the Process Group Window at the top of the GUI.

Parallel Stack View

Memory Debugging

DDT has a memory debugging tool that can show heap memory usage across processors.

There are two things you must do to access the memory debugging feature. First, on Hopper, you must link with the -Bddt option.

% cc -g testDDT.c -Bddt                     # Hopper

Second, when DDT starts, you must check the "Memory Debugging" checkbox in the DDT run window that first comes up.

DDT Groups

To set detailed memory debugging options, click the 'Details...' button on the far right side, which will open the 'Memory Debugging Options' window.

DDT Groups

Several features are enabled with memory debugging. To see them, select Current Memory Usage and Memory Statistics under the View menu.  You might see something that looks like the following (for an 8-process MPI run):

By itself, this doesn't appear terribly useful, but by clicking on one of the pointers in the "Allocation Details" list on the left, you can get information mapped to source code:

Memory debugging is known to fail with the error message "A tree node closed prematurely. One or more processes may be unusable.", especially with MPI_Bcast. A workaround is to disable the 'store stack backtraces for memory allocations' option in the 'Enable Memory Debugging' setting. This problem will be fixed in the next release.

Installed Versions

Package  Platform  Category                Version    Module             Install Date  Date Made Default
DDT      carver    applications/debugging  3.1-20638  ddt/3.1-20638      2012-01-19    2012-01-24
DDT      carver    applications/debugging  3.1-22847  ddt/3.1-22847      2012-05-11
DDT      euclid    applications/debugging  3.1-22847  ddt/3.1-22847      2012-05-11    2012-05-11
DDT      hopper    applications/debugging  3.0        ddt_ccm/3.0        2012-04-05
DDT      hopper    applications/debugging  3.1-20638  ddt_ccm/3.1-20638  2012-04-05    2012-04-05
DDT      hopper    applications/debugging  3.1-20638  ddt/3.1-20638      2012-01-19
DDT      hopper    applications/debugging  3.1-21354  ddt/3.1-21354      2012-02-10
DDT      hopper    applications/debugging  3.1-22847  ddt/3.1-22847      2012-05-11

Introductory Video Tutorial