Glenn K. Lockwood, Kirill Lozinskiy, Lisa Gerhardt, Ravi Cheema, Damian Hazen, Nicholas J. Wright, "Designing an All-Flash Lustre File System for the 2020 NERSC Perlmutter System", Proceedings of the 2019 Cray User Group, Montreal, January 1, 2019,
New experimental and AI-driven workloads are moving into the realm of extreme-scale HPC systems at the same time that high-performance flash is becoming cost-effective to deploy at scale. This confluence poses a number of new technical and economic challenges and opportunities in designing the next generation of HPC storage and I/O subsystems to achieve the right balance of bandwidth, latency, endurance, and cost. In this paper, we present the quantitative approach to requirements definition that resulted in the 30 PB all-flash Lustre file system that will be deployed with NERSC's upcoming Perlmutter system in 2020. By integrating analysis of current workloads and projections of future performance and throughput, we were able to constrain many critical design space parameters and quantitatively demonstrate that Perlmutter will not only deliver optimal performance, but effectively balance cost with capacity, endurance, and many modern features of Lustre.
Glenn K. Lockwood, Kirill Lozinskiy, Lisa Gerhardt, Ravi Cheema, Damian Hazen, Nicholas J. Wright, "A Quantitative Approach to Architecting All-Flash Lustre File Systems", ISC High Performance 2019: High Performance Computing, edited by Michele Weiland, Guido Juckeland, Sadaf Alam, Heike Jagode, (Springer International Publishing: 2019) Pages: 183--197 doi: 10.1007/978-3-030-34356-9_16
New experimental and AI-driven workloads are moving into the realm of extreme-scale HPC systems at the same time that high-performance flash is becoming cost-effective to deploy at scale. This confluence poses a number of new technical and economic challenges and opportunities in designing the next generation of HPC storage and I/O subsystems to achieve the right balance of bandwidth, latency, endurance, and cost. In this work, we present quantitative models that use workload data from existing, disk-based file systems to project the architectural requirements of all-flash Lustre file systems. Using data from NERSC’s Cori I/O subsystem, we then demonstrate the minimum required capacity for data, capacity for metadata and data-on-MDT, and SSD endurance for a future all-flash Lustre file system.
Nicholas Balthaser, Damian Hazen, Wayne Hurlbert, Owen James, Kristy Kallback-Rose, Kirill Lozinskiy, Moving the NERSC Archive to a Green Data Center, Storage Technology Showcase 2020, March 3, 2020,
- Download File: archive-move-NERSC-2020-02-05.pptx (pptx: 22 MB)
Description of methods used and challenges involved in moving the NERSC tape archive to a new data center with environmental cooling.
Kirill Lozinskiy, Glenn K. Lockwood, Lisa Gerhardt, Ravi Cheema, Damian Hazen, Nicholas J. Wright, A Quantitative Approach to Architecting All‐Flash Lustre File Systems, Lustre User Group (LUG) 2019, May 15, 2019,
Kirill Lozinskiy, Lisa Gerhardt, Annette Greiner, Ravi Cheema, Damian Hazen, Kristy Kallback-Rose, Rei Lee, User-Friendly Data Management for Scientific Computing Users, Cray User Group (CUG) 2019, May 9, 2019,
Wrangling data at a scientific computing center can be a major challenge for users, particularly when quotas may impact their ability to utilize resources. In such an environment, a task as simple as listing space usage for one's files can take hours. The National Energy Research Scientific Computing Center (NERSC) has roughly 50 PBs of shared storage utilizing more than 4.6B inodes, and a 146 PB high-performance tape archive, all accessible from two supercomputers. As data volumes increase exponentially, managing data is becoming a larger burden on scientists. To ease the pain, we have designed and built a “Data Dashboard”. Here, in a web-enabled visual application, our 7,000 users can easily review their usage against quotas, discover patterns, and identify candidate files for archiving or deletion. We describe this system, the framework supporting it, and the challenges for such a framework moving into the exascale age.
D. Hazen, J. Hick, W. Hurlbert, M. Welcome, Media Information Record (MIR) Analysis, LTUG 2011, April 19, 2011,
- Download File: NERSCMIRAnalysis2011.pdf (pdf: 5.6 MB)
Presentation of Storage Systems Group findings from a year-long effort to collect and analyze Media Information Record (MIR) statistics from our in-production Oracle enterprise tape drives at NERSC. We provide information on the data collected, and some highlights from our analysis. The presentation is primarily intended to show that the information in the MIR is important to users and customers for better operating and managing their tape environments.
D. Hazen, J. Hick, HPSS v8 Metadata Conversion, HPSS 8.1 Pre-Design Meeting, April 7, 2010,
Provided information about the HPSS metadata conversion software to other developers of HPSS. Input was important to establishing a design for the version 8 HPSS metadata conversions.
Glenn K. Lockwood, Damian Hazen, Quincey Koziol, Shane Canon, Katie Antypas, Jan Balewski, Nicholas Balthaser, Wahid Bhimji, James Botts, Jeff Broughton, Tina L. Butler, Gregory F. Butler, Ravi Cheema, Christopher Daley, Tina Declerck, Lisa Gerhardt, Wayne E. Hurlbert, Kristy A. Kallback-Rose, Stephen Leak, Jason Lee, Rei Lee, Jialin Liu, Kirill Lozinskiy, David Paul, Prabhat, Cory Snavely, Jay Srinivasan, Tavia Stone Gibbins, Nicholas J. Wright, "Storage 2020: A Vision for the Future of HPC Storage", October 20, 2017,
- Download File: Storage-2020-A-Vision-for-the-Future-of-HPC-Storage.pdf (pdf: 3.6 MB)
Damian Hazen, Jason Hick, "MIR Performance Analysis", June 12, 2012, LBNL LBNL-5896E,
We provide analysis of Oracle StorageTek T10000 Generation B (T10KB) Media Information Record (MIR) Performance Data gathered over the course of a year from our production High Performance Storage System (HPSS). The analysis shows information in the MIR may be used to improve tape subsystem operations. Most notably, we found the MIR information to be helpful in determining whether the drive or tape was most suspect given a read or write error, and for helping identify which tapes should not be reused given their history of read or write errors. We also explored using the MIR Assisted Search to order file retrieval requests. We found that MIR Assisted Search may be used to reduce the time needed to retrieve collections of files from a tape volume.
N. Balthaser, D. Hazen, "HSI Best Practices for NERSC Users", May 2, 2011, LBNL 4745E,
- Download File: HSIBestPractices-Balthaser-Hazen-2011-06-09.pdf (pdf: 245 KB)
In this paper we explain how to obtain and install HSI, create a NERSC authentication token, and transfer data to and from the system. Additionally we describe methods to optimize data transfers and avoid common pitfalls that can degrade data transfers and storage system performance.