NERSC: Powering Scientific Discovery Since 1974

Jason Hick
Group Leader
Storage Systems Group
Phone: (510) 486-4851
Fax: (510) 486-4316
1 Cyclotron Road
Mail Stop 943-256
Berkeley, CA 94720 US

Biographical Sketch

Jason Hick holds a B.S. in Computer Science from the United States Military Academy. From 2001 to 2006, he worked as a developer in the High Performance Storage System (HPSS) collaboration and led Los Alamos National Lab’s Data Storage Team, managing several multi-PB archives and a site-wide Tivoli Storage Manager (TSM) backup solution. Jason joined the National Energy Research Scientific Computing (NERSC) Center at the Lawrence Berkeley National Lab (LBNL) in 2006 and is currently the Storage Systems Group Lead. The group is responsible for providing center-wide file systems, enterprise databases, and archival storage to NERSC users. He serves as HPSS Technical Committee chairperson and as voting representative of SPXXL for LBNL.

Conference Papers

S. Parete-Koon, B. Caldwell, S. Canon, E. Dart, J. Hick, J. Hill, C. Layton, D. Pelfrey, G. Shipman, D. Skinner, J. Wells, J. Zurawski, "HPC's Pivot to Data", Conference, May 5, 2014,


Computer centers such as NERSC and OLCF have traditionally focused on delivering computational capability that enables breakthrough innovation in a wide range of science domains. Accessing that computational power has required services and tools to move the data from input and output to computation and storage. A pivot to data is occurring in HPC. Data transfer tools and services that were previously peripheral are becoming integral to scientific workflows. Emerging requirements from high-bandwidth detectors, high-throughput screening techniques, highly concurrent simulations, increased focus on uncertainty quantification, and an emerging open-data policy posture toward published research are among the data-drivers shaping the networks, file systems, databases, and overall HPC environment. In this paper we explain the pivot to data in HPC through user requirements and the changing resources provided by HPC with particular focus on data movement. For WAN data transfers we present the results of a study of network performance between centers


Z. Liu, M. Veeraraghavan, Z. Yan, C. Tracy, J. Tie, I. Foster, J. Dennis, J. Hick, Y. Li and W. Yang, "On using virtual circuits for GridFTP transfers", Conference, November 12, 2012,

The goal of this work is to characterize scientific data transfers and to determine the suitability of dynamic virtual circuit service for these transfers instead of the currently used IP-routed service. Specifically, logs collected by servers executing a commonly used scientific data transfer application, GridFTP, are obtained from three US supercomputing/scientific research centers, NERSC, SLAC, and NCAR, and analyzed. Dynamic virtual circuit (VC) service, a relatively new offering from providers such as ESnet and Internet2, allows for the selection of a path on which a rate-guaranteed connection is established prior to data transfer. Given VC setup overhead, the first analysis of the GridFTP transfer logs characterizes the duration of sessions, where a session consists of multiple back-to-back transfers executed in batch mode between the same two GridFTP servers. Of the NCAR-NICS sessions analyzed, 56% of all sessions (90% of all transfers) would have been long enough to be served with dynamic VC service. An analysis of transfer logs across four paths, NCAR-NICS, SLAC-BNL, NERSC-ORNL and NERSC-ANL (where NICS, BNL, ORNL, and ANL are other US national laboratories), shows significant throughput variance. For example, on the NERSC-ORNL path, the inter-quartile range was 695 Mbps, with a maximum value of 3.64 Gbps and a minimum value of 758 Mbps. An analysis of the impact of various factors that are potential causes of this variance is also presented.

Neal Master, Matthew Andrews, Jason Hick, Shane Canon, Nicholas J. Wright, "Performance Analysis of Commodity and Enterprise Class Flash Devices", Petascale Data Storage Workshop (PDSW), November 2010,

Sim A., Gunter D., Natarajan V., Shoshani A., Williams D., Long J., Hick J., Lee J., Dart E., "Efficient Bulk Data Replication for the Earth System Grid", Data Driven E-science: Use Cases and Successful Applications of Distributed Computing Infrastructures (ISGC 2010), Springer-Verlag New York Inc, 2010, 435,

Kettimuthu Raj, Sim Alex, Gunter Dan, Allcock Bill, Bremer Peer T., Bresnahan John, Cherry Andrew, Childers Lisa, Dart Eli, Foster Ian, Harms Kevin, Hick Jason, Lee Jason, Link Michael, Long Jeff, Miller Keith, Natarajan Vijaya, Pascucci Valerio, Raffenetti Ken, Ressman David, Williams Dean, Wilson Loren, Winkler Linda, "Lessons Learned from Moving Earth System Grid Data Sets over a 20 Gbps Wide-Area Network", Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (HPDC '10), New York NY USA, 2010, 316-319,

A. Sim, D. Gunter, V. Natarajan, A. Shoshani, D. Williams, J. Long, J. Hick, J. Lee, E. Dart, "Efficient Bulk Data Replication for the Earth System Grid", International Symposium on Grid Computing, 2010,


Books

Prabhat, Q. Koziol, High Performance Parallel I/O, Book, (November 30, 2014)

A. Shoshani, D. Rotem, Scientific Data Management: Challenges, Technology, and Deployment, Book, (December 16, 2009)

This book provides a comprehensive understanding of the latest techniques for managing data during scientific exploration processes, from data generation to data analysis.

Book Chapters

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)


Presentations

J. Hick, Future Directions and How SPXXL Can Help, SPXXL Summer 2015, May 21, 2015,

Discussion of how NERSC may look in 2020, some challenges to getting there, and a proposal for how the SPXXL user group can help.

J. Hick, R. Lee, R. Cheema, K. Fagnan, GPFS for Life Sciences at NERSC, GPFS User Group Meeting, May 20, 2015,

A report showing both high-level and low-level changes made to our life sciences workloads to support them on GPFS file systems.

J. Hick, Scalability Challenges in Large-Scale Tape Environments, IEEE Mass Storage Systems & Technologies 2014, June 4, 2014,

Provides an overview of NERSC storage systems and focuses on challenges we experience with HPSS at NERSC and with the tape industry.

Jason Hick, NERSC, Storage Systems: 2014 and beyond, February 6, 2014,

J. Hick, A Storage Outlook for Energy Sciences: Data Intensive, Throughput and Exascale Computing, FujiFilm Executive IT Summit 2013, October 24, 2013,

Provides an overview of the computational and storage systems at NERSC.  Discusses the major types of computation scientists conduct at the facility, the challenges and opportunities the storage systems will face in the near future, and the role of tape technology at the Center.

J. Hick, Storage at a Distance, Open Fabrics Alliance User Day 2013, April 19, 2013,

Presentation to generate discussion on current state-of-the-practice for the topic of storage at a distance and synergy with Open Fabrics Alliance users.

J. Hick, GPFS at NERSC/LBNL, SPXXL Winter 2013, January 7, 2013,

A report to SPXXL conference participants on the state of the NERSC Global File System architecture, achievements and directions.

N. Balthaser, J. Hick, W. Hurlbert, StorageTek Tape Analytics: Pre-Release Evaluation at LBNL, LTUG 2012, April 25, 2012,

A report to the Large Tape Users Group (LTUG) annual conference on a pre-release evaluation of the new software product, StorageTek Tape Analytics (STA).  We provide a user's perspective on what we found useful, some suggestions for improvement, and some key new features that would enhance the product.

J. Hick, NERSC Site Update (NGF), SPXXL Winter 2012, January 10, 2012,

Update on the NERSC Global File System (NGF), based on IBM's GPFS, for the SPXXL User Group community. Includes an overview of NERSC, the file systems that comprise NGF, some of our experiences with GPFS, and recommendations for improving scalability.

M. Cary, J. Hick, A. Powers, HPC Archive Solutions Made Simple, Half-day Tutorial at Supercomputing (SC11), November 13, 2011,

Half-day tutorial at SC11 where attendees were provided detailed information about HPC archival storage systems for general education.  The tutorial was the first SC tutorial to cover the topic of archival storage and helped sites to understand the characteristics of these systems, the terminology for archives, and how to plan, size and manage these systems.

J. Hick, Digital Archiving and Preservation in Government Departments and Agencies, Oracle Open World 2011, October 6, 2011,

Attendees of this invited talk at Oracle Open World 2011 heard about the NERSC Storage Systems Group and the HPSS Archive and Backup systems we manage.  Includes information on why we use disk and tape to store data, and an introduction to the Large Tape Users Group (LTUG).

J. Hick, The NERSC Global Filesystem (NGF), Computing in Atmospheric Sciences 2011 (CAS2K11), September 13, 2011,

Provides the Computing in Atmospheric Sciences 2011 conference attendees with an overview and configuration details of the NERSC Global Filesystem (NGF). Includes a few lessons learned and future directions for NGF.

J. Hick, M. Andrews, Leveraging the Business Value of Tape, FujiFilm Executive IT Summit 2011, June 9, 2011,

Describes how tape is used in the HPSS Archive and HPSS Backup systems at NERSC. Includes some examples of our organization's tape policies, our roadmap to Exascale and an example of tape in the Exascale Era, our observed tape reliability, and an overview of our locally developed Parallel Incremental Backup System (PIBS), which performs backups of our NGF file system.

J. Hick, Storage Supporting DOE Science, Preservation and Archiving Special Interest Group (PASIG) 2011, May 12, 2011,

Provided attendees of the Preservation and Archiving Special Interest Group conference with an overview of NERSC, the Storage Systems Group, and the HPSS Archives and NGF File Systems we support. Includes some information on a large tape data migration and our observations on the reliability of tape at NERSC.

D. Hazen, J. Hick, W. Hurlbert, M. Welcome, Media Information Record (MIR) Analysis, LTUG 2011, April 19, 2011,

Presentation of Storage Systems Group findings from a year-long effort to collect and analyze Media Information Record (MIR) statistics from our in-production Oracle enterprise tape drives at NERSC. We provide information on the data collected, and some highlights from our analysis. The presentation is primarily intended to declare that the information in the MIR is important to users or customers for better operating and managing their tape environments.

J. Hick, I/O Requirements for Exascale, Open Fabrics Alliance 2011, April 4, 2011,

This talk provides an overview of the DOE Exascale effort, high-level I/O requirements, and an example of exascale-era tape storage.

D. Hazen, J. Hick, HPSS v8 Metadata Conversion, HPSS 8.1 Pre-Design Meeting, April 7, 2010,

Provided information about the HPSS metadata conversion software to other developers of HPSS.  Input was important to establishing a design for the version 8 HPSS metadata conversions.

J. Hick, Sun StorageTek Tape Hardware Migration Experiences, LTUG 2009, April 24, 2009,

Talk addresses specific experiences and lessons learned in migrating our entire HPSS archive from StorageTek 9310 Powderhorns using 9840A, 9940B, and T10KA tape drives to StorageTek SL8500 Libraries using 9840D and T10KB tape drives.


Reports

Richard A. Gerber et al., "High Performance Computing Operational Review: Enabling Data-Driven Scientific Discovery at DOE HPC Facilities", November 7, 2014,

Damian Hazen, Jason Hick, "MIR Performance Analysis", June 12, 2012, LBNL-5896E,


We provide analysis of Oracle StorageTek T10000 Generation B (T10KB) Media Information Record (MIR) Performance Data gathered over the course of a year from our production High Performance Storage System (HPSS). The analysis shows information in the MIR may be used to improve tape subsystem operations. Most notably, we found the MIR information to be helpful in determining whether the drive or tape was most suspect given a read or write error, and for helping identify which tapes should not be reused given their history of read or write errors. We also explored using the MIR Assisted Search to order file retrieval requests. We found that MIR Assisted Search may be used to reduce the time needed to retrieve collections of files from a tape volume.


J. Hick, J. Hules, A. Uselton, "DOE HPC Best Practices Workshop: File Systems and Archives", Workshop, September 27, 2011,

The Department of Energy has identified the design, implementation, and usability of file systems and archives as key issues for current and future HPC systems. This workshop addresses current best practices for the procurement, operation, and usability of file systems and archives. Furthermore, the workshop addresses whether system challenges can be met by evolving current practices.

D. Cook, J. Hick, J. Minton, H. Newman, T. Preston, G. Rich, C. Scott, J. Shoopman, J. Noe, J. O'Connell, G. Shipman, D. Watson, V. White, "HPSS in the Extreme Scale Era: Report to DOE Office of Science on HPSS in 2018–2022", Lawrence Berkeley National Laboratory technical report LBNL-3877E, 2010,

W. Allcock, R. Carlson, S. Cotter, E. Dart, V. Dattoria, B. Draney, R. Gerber, M. Helm, J. Hick, S. Hicks, S. Klasky, M. Livny, B. Maccabe, C. Morgan, S. Morss, L. Nowell, D. Petravick, J. Rogers, Y. Sekine, A. Sim, B. Tierney, S. Turnbull, D. Williams, L. Winkler, F. Wuerthwein, "ASCR Science Network Requirements", Workshop, April 15, 2009,

ESnet publishes reports from Network and Science Requirement Workshops on a regular basis.  This report was the product of a two-day workshop in Washington DC that addresses science requirements impacting operations of networking for 2009.

A. Mokhtarani, W. Kramer, J. Hick, "Reliability Results of NERSC Systems", Web site, August 28, 2008,

In order to address the needs of future scientific applications for storing and accessing large amounts of data in an efficient way, one needs to understand the limitations of current technologies and how they may cause system instability or unavailability. A number of factors can impact system availability, ranging from facility-wide power outages to a single point of failure such as network switches or global file systems. In addition, individual component failure in a system can degrade the performance of that system. This paper focuses on analyzing both of these factors and their impacts on the computational and storage systems at NERSC. Component failure data presented in this report primarily focuses on disk drive failures in one of the computational systems and tape drive failures in HPSS. NERSC collected available component failure data and system-wide outages for its computational and storage systems over a six-year period and made them available to the HPC community through the Petascale Data Storage Institute.

Web Articles

"NERSC Exceeds Reliability Standards With Tape-Based Active Archive", Active Archive Alliance Case Study, February 10, 2012,

"Re-thinking data strategies is critical to keeping up", J. Hick, HPC Source Magazine, June 1, 2010,