
Storage and I/O Technologies

Frontiers for Advanced Storage Technologies (FAST)

NERSC works with vendors to develop new functionality in storage technologies that is generally not yet available to industry. The project involves establishing non-disclosure agreements with selected vendors, assessing their hardware, and providing feedback or co-development to improve the products for use in HPC environments. For more information, see [FAST project specifics].


NVRAM and Burst Buffer Use Cases 

In collaboration with ACES (the Alliance for Computing at Extreme Scale), and as part of the NERSC-8 procurement, NERSC is evaluating use cases for NVRAM on HPC systems, often deployed as a "burst buffer." See the Burst Buffer Use Cases document for more information.


Boosting HDF5 Performance Through Tuning

The HDF5 library is the third most commonly used software package at NERSC and within the DOE Scientific Discovery through Advanced Computing (SciDAC) program, and it is the most commonly used I/O library across DOE computing platforms. HDF5 is also a critical part of the NetCDF4 I/O library used by the CCSM4 climate modeling code, a major source of input to the Intergovernmental Panel on Climate Change's assessment reports.

Because parallel HDF5 performance had been lagging on newer HPC platforms, especially those using the Lustre file system, NERSC funded and worked with the HDF Group to identify and fix performance bottlenecks affecting key codes in the DOE workload, and to incorporate those optimizations into the mainstream HDF5 release so that the broader scientific and academic community can benefit from the work.

NERSC sponsored a workshop with DOE Office of Science application scientists, Cray developers, and MPI-IO developers to assess HDF5 performance issues and identify strategies for improvement, and then initiated a collaborative effort to implement those strategies. The resulting improvements were (an illustrative tuning sketch follows the list):

  • Increased parallel I/O performance by up to 33 times.
  • Raised performance close to the achievable peak of the underlying file system.
  • Achieved write bandwidth of 10,000 GB/s for both applications in certain configurations.
  • Extended scaling to 40,960 processors.
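
To give a sense of the kind of tuning involved, the sketch below shows a minimal parallel HDF5 write that applies settings commonly used on Lustre-backed systems: striping hints passed through MPI-IO, HDF5 object alignment matched to the stripe size, and collective data transfers. This is an illustrative example, not the actual NERSC/HDF Group code changes; the hint names ("striping_factor", "striping_unit") assume a Lustre-aware MPI-IO implementation, and the file name, dataset name, sizes, and hint values are arbitrary.

    /* Illustrative parallel HDF5 write with Lustre-oriented tuning.
     * Requires an MPI compiler wrapper and a parallel HDF5 build. */
    #include <stdlib.h>
    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* MPI-IO hints: stripe the file across 64 OSTs with a 1 MiB
         * stripe size (example values only). */
        MPI_Info info;
        MPI_Info_create(&info);
        MPI_Info_set(info, "striping_factor", "64");
        MPI_Info_set(info, "striping_unit", "1048576");

        /* File access property list: MPI-IO driver plus alignment so
         * HDF5 objects start on stripe boundaries. */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);
        H5Pset_alignment(fapl, 0, 1048576);

        hid_t file = H5Fcreate("tuned_io.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        /* One contiguous slab of doubles per rank. */
        const hsize_t per_rank = 1 << 20;
        hsize_t dims[1]  = { per_rank * (hsize_t)nprocs };
        hsize_t count[1] = { per_rank };
        hsize_t start[1] = { per_rank * (hsize_t)rank };

        hid_t filespace = H5Screate_simple(1, dims, NULL);
        hid_t dset = H5Dcreate(file, "data", H5T_NATIVE_DOUBLE, filespace,
                               H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

        hid_t memspace = H5Screate_simple(1, count, NULL);
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);

        /* Collective transfer: lets MPI-IO aggregate per-rank writes
         * into large, stripe-aligned requests. */
        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

        double *buf = malloc(per_rank * sizeof(double));
        for (hsize_t i = 0; i < per_rank; i++) buf[i] = (double)rank;
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

        free(buf);
        H5Pclose(dxpl);
        H5Sclose(memspace);
        H5Sclose(filespace);
        H5Dclose(dset);
        H5Pclose(fapl);
        H5Fclose(file);
        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }

Built against a parallel HDF5 installation and launched across many ranks, a sketch like this exercises the same code paths (MPI-IO hints, alignment, collective buffering) that the optimization work targeted, though the specific values that perform best depend on the file system configuration and the application's I/O pattern.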