Performance and Debugging Tools
NERSC provides many popular debugging and profiling tools. Some of them are general-purpose tools and others are geared toward more specific tasks.
A quick guideline on when to use which debugging tool is as follows:
- DDT and TotalView: general-purpose parallel debuggers that let users interactively control the pace of a program's execution through a graphical user interface
- gdb: serial command-line debugger; useful for quickly examining core files to see where the code crashed (DDT and TotalView can be used for this purpose, too)
- STAT: used for obtaining call backtraces for all parallel tasks from a live parallel application and displaying a call tree graphically, showing where each task is executing; useful in debugging a hung application
- ATP: used for generating call backtraces for all parallel tasks when a code crashes; can be a good starting point if a code crashes with little hint left behind
- CCDB and lgdb: their distinguishing feature is running two versions of a code (e.g., one working version and an incorrect version, or a code run with two different numbers of tasks) side by side to find where the two runs start to generate diverging results
- Valgrind: a suite of debugging and profiling tools; the best known tool is memcheck, which can detect memory errors and memory leaks; other tools include cache profiling, heap memory profiling and more
- Intel Inspector: a memory and threading error checking tool for users developing serial and multithreaded applications on Windows and Linux operating systems.
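To make the STAT entry above more concrete, here is a minimal sketch of the general idea of merging per-task call backtraces into a single call tree. It does not use STAT itself or any of its interfaces; the function names `merge_backtraces` and `render`, and the example backtraces, are hypothetical and chosen only for illustration.

```python
def merge_backtraces(traces):
    """Merge per-task call stacks into a prefix tree.

    traces: dict mapping task rank -> list of frame names, outermost first.
    Returns a nested dict: frame name -> {"ranks": [...], "children": {...}}.
    """
    root = {}
    for rank in sorted(traces):
        level = root
        for frame in traces[rank]:
            node = level.setdefault(frame, {"ranks": [], "children": {}})
            node["ranks"].append(rank)
            level = node["children"]
    return root


def render(tree, indent=0):
    """Return the tree as indented text lines, one line per call-tree node."""
    lines = []
    for frame, node in tree.items():
        ranks = node["ranks"]
        lines.append(f"{'  ' * indent}{frame} [{len(ranks)} task(s): {ranks}]")
        lines.extend(render(node["children"], indent + 1))
    return lines


# Hypothetical backtraces from a 4-task run that appears to be hung.
traces = {
    0: ["main", "solve", "MPI_Barrier"],
    1: ["main", "solve", "MPI_Barrier"],
    2: ["main", "solve", "exchange_halo", "MPI_Recv"],
    3: ["main", "solve", "MPI_Barrier"],
}

tree = merge_backtraces(traces)
print("\n".join(render(tree)))
```

In this made-up example, three tasks share the same path into a barrier while task 2 is the only one waiting in a receive, which is the kind of outlier a merged call tree makes easy to spot.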
"Getting Started" tutorials on some debugging tools:
A quick guideline on performance analysis tools is as follows:
- IPM: a low-overhead, easy-to-use tool for getting hardware counter data, MPI function timings, and memory usage
- CrayPat: a suite of Cray tools for more detailed performance analysis, which can show routine-based hardware counter data, MPI message statistics, I/O statistics, etc.; in addition to performance data obtained by sampling, selected routines (or library routines) can be traced for a better understanding of the performance statistics associated with them
- MAP: a sampling tool for performance metrics; time series of the collected data for the entire run of the code is displayed graphically, and the source code lines are annotated with performance metrics
- Intel VTune Amplifier XE: a GUI-based tool that can find performance bottlenecks
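The CrayPat entry above distinguishes sampling from tracing selected routines. As a rough illustration of what tracing a selected routine means, the sketch below wraps one function so that every call to it is counted and timed, while other functions are left alone. It does not use or imitate CrayPat's actual interface; the names `traced`, `stats`, `smooth` and `untraced_setup` are hypothetical.

```python
import time
from functools import wraps

# Collected statistics: routine name -> {"calls": int, "seconds": float}
stats = {}


def traced(func):
    """Record the call count and cumulative wall time of the wrapped routine."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = stats.setdefault(func.__name__, {"calls": 0, "seconds": 0.0})
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start
    return wrapper


@traced
def smooth(values):
    """Selected for tracing: average each pair of neighboring values."""
    return [(a + b) / 2 for a, b in zip(values, values[1:])]


def untraced_setup(n):
    """Not selected for tracing, so it never appears in stats."""
    return [float(i) for i in range(n)]


data = untraced_setup(1000)
for _ in range(5):
    data = smooth(data)

for name, entry in stats.items():
    print(f"{name}: {entry['calls']} calls, {entry['seconds']:.6f} s")
```

Tracing gives exact call counts and timings for the chosen routines at the cost of extra work on every call, which is why it is usually applied selectively, whereas sampling periodically observes the whole program and yields statistical estimates.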
A "Getting Started" tutorial on some performance tools:
For more information about how to use a tool, click on the relevant item below.
Parallel Debugging with lgdb
NOTE (Feb., 2016): lgdb is not currently working on Edison and Cori. The problem will be resolved in the next release.
lgdb (Cray Line Mode Parallel Debugger) is a GDB-based parallel debugger developed by Cray. It allows programmers to either launch an application or attach to an already-running application that was launched with aprun, to debug the parallel code in command-line mode. These features can be useful, but you will probably want to use a more powerful…
Allinea MAP is a parallel profiler with a simple GUI. It can be run with up to 512 processors. perf-report is a new tool from Allinea, which may be available for a limited time, that characterizes code performance based on the percentage of wall time spent in different performance metric categories.