NERSC: Powering Scientific Discovery Since 1974

HDF5

Description and Overview

Hierarchical Data Format version 5 (HDF5) is a set of file formats, libraries, and tools for storing and managing large scientific datasets. Originally developed at the National Center for Supercomputing Applications, it is currently supported by the non-profit HDF Group.

HDF5 is a different product from the earlier software named HDF: it is a complete redesign of the format and library, and it includes improved support for parallel I/O. The HDF5 file format is not compatible with HDF 4.x. To translate between the two formats, use the 'h5toh4' and 'h4toh5' converters, which are available on all NERSC machines.

If you are not familiar with parallel I/O, please refer to the tutorial on scientific I/O.

Using HDF5 On Cray Systems

Cray provides HDF5 libraries. Run "module avail cray-hdf5" to list the available Cray versions.

Cray serial HDF5 on Edison and Cori

% module load cray-hdf5
% ftn ... (for Fortran code)
% cc ... (for C code)
% CC ... (for C++ code)

Cray parallel HDF5 on Edison and Cori

% module load cray-hdf5-parallel 
% cc ... (for C code)
% ftn ... (for Fortran code)

 

For questions about HDF5 on any NERSC system, please send email to consult@nersc.gov. Additional information is available from The HDF Group.

New Features in HDF5 1.10

  • Concurrent Access to an HDF5 File: Single-Writer / Multiple-Reader (SWMR)

  • Virtual Dataset (VDS)

  • Scalable Chunk Indexing

  • Persistent Free File Space Tracking

  • Collective Metadata I/O Feature for Improving Parallel HDF5 Performance

    More information is available at the HDF5 website.

Availability at NERSC

  • Cray built versions:
    • Edison: 1.10.0 (default), 1.8.9, 1.8.11, 1.8.12, 1.8.13, 1.8.14, 1.10.0.1
    • Cori: 1.10.0 (default), 1.8.14, 1.8.16, 1.10.0.1
  • NERSC built versions:
    • Edison: N/A
    • Cori: 1.10.1-pre1, 1.10.1 (default)

Known issues

  • Can existing HDF5 1.8.16 code read the file generated by HDF5 1.10?

No, unless the file was created by HDF5 1.10 with a compatible file format version and without the features that are new in 1.10, e.g., SWMR. The file format version can be set as follows.

With h5py:

import h5py
f = h5py.File('name.hdf5', 'w', libver='earliest')  # create the file in the most compatible format; the performance benefits of HDF5 1.10 are lost

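With the HDF5 C API:

hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
/* Lower bound EARLIEST: objects are written in the oldest format that can represent them.
   H5F_LIBVER_EARLIEST is not accepted as the upper bound, so LATEST is used there. */
herr_t status = H5Pset_libver_bounds(fapl, H5F_LIBVER_EARLIEST, H5F_LIBVER_LATEST);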
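
To check which format a file was actually written with, one option is to read the superblock version directly from the file. The following is a minimal sketch using only the Python standard library, and assuming the layout described in the HDF5 file format specification: an 8-byte signature at offset 0 (or at 512, 1024, 2048, ...) followed by a one-byte superblock version. The function name superblock_version is our own. To the best of our knowledge a version of 3 is written only by HDF5 1.10 and later, while a lower version is necessary but not sufficient for 1.8 compatibility, because individual objects in the file can still use newer encodings.

```python
import os
import tempfile

# Signature that starts every HDF5 superblock, per the HDF5 file format specification.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def superblock_version(path):
    """Return the superblock version of an HDF5 file, or None if no signature is found."""
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as f:
        # The superblock may sit at byte 0 or at 512, 1024, 2048, ...
        while offset + 9 <= size:
            f.seek(offset)
            header = f.read(9)
            if header[:8] == HDF5_SIGNATURE:
                return header[8]  # the version byte follows the signature
            offset = 512 if offset == 0 else offset * 2
    return None


# Demonstration on a synthetic 9-byte stand-in, not a real HDF5 file.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "fake.h5")
    with open(fake, "wb") as f:
        f.write(HDF5_SIGNATURE + bytes([3]))
    demo_version = superblock_version(fake)

print(demo_version)
```

On a real file, call superblock_version('name.hdf5'). The h5dump and h5stat tools shipped with HDF5 give a more complete picture of what a file contains.
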
  • Can HDF5 1.10 code read the existing files generated by HDF5 1.8.16?

Yes.

  • HDF5 1.10.0 file locking issue (fixed in 1.10.1, which is available as a NERSC-built version, but an environment variable must be set)

HDF5 1.10.x, which introduces the SWMR feature, uses flock so that multiple applications can open the same file. However, flock is not enabled on some file systems at NERSC, e.g., /project and the burst buffer. The workaround is to turn off file locking by setting the environment variable HDF5_USE_FILE_LOCKING to FALSE. flock is enabled on the Lustre file system, i.e., SCRATCH, so you will not see flock errors there. Note that Cray's default version does not support this environment variable, so you have to compile your code with the NERSC-built 1.10.1.

export HDF5_USE_FILE_LOCKING=FALSE
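
For Python workflows, the same setting can be applied from inside the script. The sketch below uses only the standard library; the helper name hdf5_env_without_locking and the executable ./my_hdf5_app are hypothetical. It assumes the conservative approach of setting the variable before any HDF5-based module (such as h5py) is imported, since we have not verified at which point the library reads it.

```python
import os


def hdf5_env_without_locking(env=None):
    """Return a copy of an environment mapping with HDF5 file locking disabled."""
    new_env = dict(os.environ if env is None else env)
    new_env["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return new_env


# Apply the setting to the current process before importing any HDF5-based module.
os.environ.update(hdf5_env_without_locking())
# import h5py  # import only after the variable has been set

# To launch a separate program with the same setting (hypothetical executable):
# import subprocess
# subprocess.run(["./my_hdf5_app"], env=hdf5_env_without_locking())

print(os.environ["HDF5_USE_FILE_LOCKING"])
```

As noted above, the variable only has an effect with an HDF5 build that supports it, such as the NERSC-built 1.10.1.
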
  • HDF5 1.10 conflict with Darshan (resolved in Darshan 3.1.4; users do not need to do anything now)

Before Darshan 3.1.4, Darshan had to be unloaded when using HDF5 1.10.x on Cori/Edison. HDF5 1.10.x changed the type of hid_t from a 32-bit integer to a 64-bit integer, but Darshan's HDF5 wrapper still used the 32-bit type.

Darshan 3.1.4 is now installed with the HDF5 wrappers disabled. It therefore no longer crashes applications and does not profile HDF5 1.10 function calls, but it still profiles MPI and POSIX calls.

Package  Platform       Category                      Version        Module              Install Date  Date Made Default
HDF      datatran2      libraries/ I/O                1.8.13         hdf5/1.8.13         2015-09-30
HDF      pdsf_sl6       libraries/ I/O                1.8.13         hdf5/1.8.13         2014-08-14    2014-08-14
hdf5     genepool       pe_libraries/ general         1.8.10-patch1  hdf5/1.8.10-patch1  2013-03-28    2013-04-08
hdf5     genepool       pe_libraries/ general         1.8.11         hdf5/1.8.11         2013-06-11
hdf5     genepool       pe_libraries/ general         1.8.12         hdf5/1.8.12         2014-01-14
hdf5     genepool       pe_libraries/ general         1.8.13         hdf5/1.8.13         2014-09-24
hdf5     genepool       pe_libraries/ general         1.8.15-patch1  hdf5/1.8.15-patch1  2015-11-09
hdf5     genepool       pe_libraries/ general         1.8.4-patch1   hdf5/1.8.4-patch1   2013-04-17
HDF5     genepool       applications/ bioinformatics  1.8.7          hdf5/1.8.7          2012-04-04    2012-04-04
hdf5     genepool       pe_libraries/ general         1.8.9          hdf5/1.8.9          2012-07-19    2012-07-19
hdf5     genepool       libraries/ general            4.10.2         rpm/4.10.2          2013-01-24    2013-01-24
hdf5     genepool_sl6   pe_libraries/ general         1.8.10-patch1  hdf5/1.8.10-patch1  2014-12-12    2014-12-12
hdf5     genepool_sl6   pe_libraries/ general         1.8.12         hdf5/1.8.12         2014-12-12
hdf5     genepool_sl6   pe_libraries/ general         1.8.13         hdf5/1.8.13         2014-12-12
hdf5     genepool_sl72  pe_libraries/ general         1.8.15-patch1  hdf5/1.8.15-patch1  2017-01-27
hdf5     phoebe         pe_libraries/ general         1.8.10-patch1  hdf5/1.8.10-patch1  2013-03-29    2013-06-26
hdf5     phoebe         pe_libraries/ general         1.8.11         hdf5/1.8.11         2013-06-18
hdf5     phoebe         pe_libraries/ general         1.8.4-patch1   hdf5/1.8.4-patch1   2013-05-15
hdf5     phoebe         pe_libraries/ general         1.8.9          hdf5/1.8.9          2012-07-26    2012-07-26
hdf5     phoebe         libraries/ general            4.10.2         rpm/4.10.2          2013-05-15    2013-06-26