NERSC: Powering Scientific Discovery Since 1974

CUDA Debugging, September 14, 2021

Introduction

NVIDIA will present “CUDA Debugging” on Tuesday, September 14, 2021. This event is a continuation of the CUDA Training Series and will be presented by Robert Crovella from NVIDIA.

When your CUDA codes are not working at all, or are not giving you the correct answer, there is a set of techniques you can use to tackle any debugging issue. First, we’ll review runtime error-checking best practices. We’ll cover “sticky” vs. “non-sticky” errors, and when and how it is possible to recover from CUDA errors. Next, we’ll take a look at a powerful tool called compute-sanitizer, which is the recommended first debugging tool to dust off. We’ll cover basic usage of the tool as well as how to use its various sub-tools. Finally, we’ll cover use of the cuda-gdb debugger: how to build debug codes, start the debugger, set breakpoints, single-step, watch variables, inspect memory, and switch thread focus.
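As a rough illustration of the runtime error-checking and “non-sticky” error ideas mentioned above, here is a minimal sketch. It is not taken from the training materials: the names CUDA_CHECK, fill, and run_demo are hypothetical, and the example assumes a GPU whose maximum block size is 1024 threads, so that a 2048-thread launch is rejected.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (an assumption, not from the training series):
// checks the result of a CUDA runtime call and aborts with context on failure.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error %s at %s:%d: %s\n",           \
                    cudaGetErrorName(err_), __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Simple kernel: each thread writes `value` into one element.
__global__ void fill(int *data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Runs the demo and returns the first element after a valid launch.
int run_demo() {
    const int n = 1024;
    int *d = nullptr;
    CUDA_CHECK(cudaMalloc(&d, n * sizeof(int)));

    // Deliberately invalid launch: 2048 threads per block is assumed to
    // exceed the device limit, so the kernel never runs.
    fill<<<1, 2048>>>(d, n, 1);
    cudaError_t launch_err = cudaGetLastError();  // returns and clears the error
    if (launch_err != cudaSuccess) {
        printf("Recovered from non-sticky error: %s\n",
               cudaGetErrorString(launch_err));
    }

    // Because that error was non-sticky, a valid launch still works.
    fill<<<(n + 255) / 256, 256>>>(d, n, 7);
    CUDA_CHECK(cudaGetLastError());       // catches launch-time errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors during execution

    int first = 0;
    CUDA_CHECK(cudaMemcpy(&first, d, sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d));
    return first;
}

int main() {
    printf("d[0] = %d\n", run_democ());
    return 0;
}
```

A sticky error, by contrast (for example, an illegal memory access inside a kernel), would cause every subsequent runtime call in the process to keep returning an error. Assuming the sketch is saved as demo.cu (a hypothetical filename), it could be built for debugging with `nvcc -g -G demo.cu -o demo`, checked with `compute-sanitizer ./demo`, and stepped through with `cuda-gdb ./demo`.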

Homework will be provided to reinforce the concepts. Cori-GPU access will be provided for current NERSC users for the hands-on exercises. Temporary OLCF Summit access will not be available for remote participants.

Date and Time: 10 am - 12 pm (Pacific time), Tuesday, September 14, 2021

The format of this event will be online only.

Registration

Registration is required for remote participation. Please click the "Registration" drop-down on this page to register.

Remote Connection Information 

Registration is required for remote participation. Please click the "Remote Connection Details" drop-down on this page for connection information.

Presentation Materials 

  • Slides
  • Recording: TBA (to be posted shortly after the event)
  • Exercises: The example exercises for this module can be found in the "exercises/hw12" folder of this GitHub repo.