NERSC: Powering Scientific Discovery Since 1974

Erich Strohmaier

ASCR Requirements Worksheet

1.1. Project Information - Performance Characterization and Benchmarking of HPC Systems

Document Prepared By

Erich Strohmaier

Project Title

Performance Characterization and Benchmarking of HPC Systems

Principal Investigator

Erich Strohmaier

Participating Organizations

tt1

Funding Agencies

DOE SC DOE NSA NSF NOAA NIH Other:

2. Project Summary & Scientific Objectives for 2011-2014

Please give a brief description of your project, highlighting its computational aspect, and outline its scientific objectives for 2011-2014. Please list one or two specific goals you hope to reach by 2014.

tt2

3. Current HPC Usage and Methods

3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments, e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid).

tt3 

3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?

tt4 

3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate as a starting point for discussions.

Facilities Used or Using

NERSC OLCF ALCF NSF Centers Other:

Architectures Used or Using

Cray XT IBM Power BlueGene Linux Cluster GPUs Other:

Total Computational Hours Used per Year

1 Core-Hours

NERSC Hours Used in 2010

2 Core-Hours

Number of Cores Used in Typical Production Run

3

Wallclock Hours of Single Typical Production Run

4

Total Memory Used per Run

5 GB

Minimum Memory Required per Core

6 GB

Total Data Read & Written per Run

7 GB

Size of Checkpoint File(s)

8 GB

Amount of Data Moved In/Out of NERSC

9 GB per

On-Line File Storage Required (For I/O from a Running Job)

10 TB and 11 Files

Off-Line Archival Storage Required

12 TB and 13 Files
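The compute and memory entries in the table above are linked by simple arithmetic: core-hours per run are the number of cores times the wallclock hours, and memory per core is total memory divided by the number of cores. The following is a minimal Python sketch for cross-checking estimates before entering them. The function names and the example values are hypothetical and are not taken from this worksheet.

```python
def core_hours_per_run(cores: int, wallclock_hours: float) -> float:
    """Core-hours consumed by one run: cores times wallclock hours."""
    return cores * wallclock_hours


def runs_per_year(total_core_hours: float, cores: int, wallclock_hours: float) -> float:
    """Number of typical runs that fit into a yearly core-hour total."""
    return total_core_hours / core_hours_per_run(cores, wallclock_hours)


def memory_per_core_gb(total_memory_gb: float, cores: int) -> float:
    """Average memory per core (GB), assuming memory is spread evenly."""
    return total_memory_gb / cores


# Hypothetical example values, for illustration only.
example_cores = 1024
example_wallclock_hours = 12.0
example_total_memory_gb = 2048.0
example_yearly_core_hours = 1_000_000.0

print("core-hours per run:", core_hours_per_run(example_cores, example_wallclock_hours))
print("runs per year:", runs_per_year(example_yearly_core_hours, example_cores, example_wallclock_hours))
print("memory per core (GB):", memory_per_core_gb(example_total_memory_gb, example_cores))
```

An even split of memory across cores is an assumption; codes with uneven memory use per process may need a higher per-core minimum than this average suggests.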

Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.

tt5 

4. HPC Requirements in 2014

4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate as a starting point for discussions at the workshop.

Computational Hours Required per Year

14

Anticipated Number of Cores to be Used in a Typical Production Run

15

Anticipated Wallclock Hours to be Used in a Typical Production Run Using the Number of Cores Given Above

16

Anticipated Total Memory Used per Run

17 GB

Anticipated Minimum Memory Required per Core

18 GB

Anticipated Total Data Read & Written per Run

19 GB

Anticipated Size of Checkpoint File(s)

20 GB

Anticipated Amount of Data Moved In/Out of NERSC

21 GB per 22

Anticipated On-Line File Storage Required (For I/O from a Running Job)

23 TB and 24 Files

Anticipated Off-Line Archival Storage Required

25 TB and 26 Files

4b. What changes to codes, mathematical methods, and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years?

tt6

4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 1 μs).

tt7

4d. Please list any new software, services, or infrastructure support you will need through 2014.

tt8 

4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.

tt9 

New Science With New Resources

To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve by 2014 with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?

Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).

tt10