
Introduction to HDF5 for HPC Data Models, Analysis, and Performance: July 27, 2022

July 27, 2022

Introduction

This webinar, presented by M. Scot Breitenfeld of The HDF Group, is part of the ALCF Developer Sessions and is also open to NERSC users.

Date and Time: July 27, 2022, 9-10 am PT
To register for this webinar or to learn more about the speaker, please see the ALCF Developer Sessions event page.

Abstract

HDF5 is a data model, file format, and I/O library that has become a de facto standard for HPC applications that need scalable I/O and must store and manage big data from computer modeling, large physics experiments, and observations. This talk offers a comprehensive overview of HDF5 for anyone who works with big data in an HPC environment. The talk consists of two parts. Part I introduces the HDF5 data model and the APIs for organizing data and performing I/O. Part II focuses on advanced HDF5 features such as parallel I/O and gives an overview of parallel HDF5 tuning techniques, including collective metadata I/O, data aggregation, asynchronous I/O, parallel compression, and other new HDF5 features that help utilize HPC storage to its fullest potential.

Presentation Materials

Slides
Video