High performance computing (HPC) user facilities are critical to supporting the basic and applied research programs that accomplish the mission of the DOE SC. ASCR operates the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL) to support the entire spectrum of SC research. NERSC’s mission is to accelerate the pace of scientific discovery by providing advanced HPC, networking, data and support services for SC-sponsored research. SC program managers allocate the vast majority of computing time at NERSC.
Maintaining U.S. leadership in computational science requires the best tools – including a succession of computer resources with world-class capabilities. NERSC’s next system, Cori, to be installed in the new LBNL Computational Research and Theory Facility in 2016, has an expected lifetime of 5 years. Advanced technology systems can take up to a year to transition into production, and in order to avoid a gap in computational resources for SC research, NERSC needs to have a new system delivered in 2020.
The SC supports a broad range of basic research and engineering in energy-related fields and areas of fundamental science. Currently, NERSC supports nearly 6,000 users and over 800 projects using about 600 different application codes. In addition to compute-intensive research, NERSC also supports a large community whose primary focus is scientific discovery through analysis of experimental and observational data. NERSC’s thousands of users also need a machine on which they can develop the code and algorithm modifications that will make exascale computing usable across the broad SC workload.
The Cori system at NERSC pioneered the use of Non-Recurring Engineering (NRE) funds to develop and deploy system functionality that would enhance the user experience and allow us to take full advantage of the new technologies being deployed. The “Burst Buffer” and system power management enhancements that are currently being developed with NRE funds are expected to greatly improve I/O performance as well as the overall manageability of the system. NRE has been critical in tailoring computing technology to the needs of the broader HPC market as well as the SC workload in particular, and building on the results of earlier NRE investments will be a major factor in successfully using the increased computational capabilities projected to be available in the FY2020 timeframe. The NERSC-9 system will therefore include NRE components.
The NERSC-9 procurement will be a joint project with the Crossroads Project of the ACES collaboration of Sandia and Los Alamos National Laboratories, and is a project of the APEX collaboration among the three labs. The rough timeline for NERSC-9/Crossroads is as follows:
| Date | Milestone |
|---|---|
| November 2015 | Draft technical specs released. Benchmarks released. |
| December 2015 | Formal vendor feedback on the draft technical specs due. Feedback will be requested on the clarity of the draft and the realism of the stated targets. (Note: informal discussion and feedback are encouraged all the way up to the formal RFP release.) |
| Fall 2016 | RFP release; responses due 30 days later. |
| Early 2017 | Contract signed, NRE begins. |
Sept 30, 2016: The APEX-2020 RFP has been released. Details of the RFP are available on the LANL website.
Nov 24, 2015: v1.0 of the APEX Workflows document has been released on the APEX website.
Nov 5, 2015: Minor revisions have been made to the workload analysis. The current version number is 1.1.
Nov 2, 2015: A draft of the APEX 2020 technical specification is currently available on the APEX website. Please contact the APEX representatives for details.