GPU Profiling (Performance Profile: Omniperf): Part 5 of HIP Training Series
AMD presents a multi-part HIP training series intended to help new and existing GPU programmers understand the main concepts of the HIP programming model. HIP® is a parallel computing platform and programming model that extends C++ to allow developers to program GPUs with a familiar programming language and simple APIs. Each part of this training includes a one-hour presentation and example exercises. The exercises are meant to reinforce the material from the presentation and can be completed during a one-hour hands-on session following each lecture.
This training series is open to OLCF and NERSC users via Zoom. OLCF users will be using HIP for AMD GPUs on Frontier. NERSC users will be using HIP for NVIDIA GPUs on Perlmutter. Participants must register for each part of the series individually. The exercises for the hands-on portion can be found in this GitHub repository. The Q&A for all the sessions will be in this Google doc.
Part 5: GPU Profiling (Performance Profile: Omniperf)
10 a.m. - 12 p.m. (Pacific Daylight Time/UTC -7), Monday, October 16, 2023
Collecting and presenting data on kernel performance can help identify key optimizations. AMD’s rocprof profiler collects the basic hardware counter data that makes this profiling possible. Omniperf adds many derived metrics to this hardware counter data and presents the results in a form that application developers can use to tune their kernels and applications for top performance. The kernel profiling data shows which kernels take the most time, which helps determine which kernels to address first. Performance can be visualized on a roofline plot and compared against the peak possible performance for the hardware. The hands-on exercises will generate performance profiles, implement optimizations, and then generate a performance report that shows the differences between the optimized and unoptimized performance profiles.
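To illustrate the idea behind a roofline plot, here is a minimal sketch of the standard roofline model. It is not Omniperf's implementation. The function names and the hardware numbers are hypothetical, chosen only to keep the arithmetic easy to follow. The model caps a kernel's attainable performance at whichever is lower: the peak compute rate, or the kernel's arithmetic intensity (FLOPs per byte moved) multiplied by the peak memory bandwidth.

```python
# Minimal roofline-model sketch. All names and numbers here are
# illustrative assumptions, not measured values for any real GPU.

def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Return the roofline ceiling in GFLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    peak_gflops:          peak compute rate of the device (GFLOP/s)
    peak_bandwidth_gbs:   peak memory bandwidth of the device (GB/s)
    """
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def limiting_resource(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Classify a kernel as memory-bound or compute-bound.

    The ridge point is the arithmetic intensity at which the bandwidth
    ceiling meets the compute ceiling.
    """
    ridge_point = peak_gflops / peak_bandwidth_gbs
    return "memory-bound" if arithmetic_intensity < ridge_point else "compute-bound"


if __name__ == "__main__":
    # Hypothetical device: 10,000 GFLOP/s peak compute, 1,000 GB/s bandwidth.
    PEAK_GFLOPS = 10_000.0
    PEAK_BW_GBS = 1_000.0

    # Hypothetical kernels with their arithmetic intensity in FLOP/byte.
    for name, ai in [("stream_like", 2.0), ("gemm_like", 50.0)]:
        ceiling = attainable_gflops(ai, PEAK_GFLOPS, PEAK_BW_GBS)
        kind = limiting_resource(ai, PEAK_GFLOPS, PEAK_BW_GBS)
        print(f"{name}: ceiling {ceiling:.0f} GFLOP/s ({kind})")
```

A kernel whose measured performance sits well below its ceiling has headroom for optimization, while a memory-bound kernel close to its ceiling usually needs less data movement before it can go faster.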
Register online for this remote-only event.