NERSC - Powering Scientific Discovery for 50 Years

Stephen Jardin

FES Requirements Worksheet

1.1. Project Information - Title

Document Prepared By

Stephen Jardin

Project Title

Title

Principal Investigator

Stephen Jardin

Participating Organizations

PPPL, MIT, NYU

Funding Agencies

 DOE SC  DOE NSA  NSF  NOAA  NIH  Other:

2. Project Summary & Scientific Objectives for the Next 5 Years

Please give a brief description of your project - highlighting its computational aspect - and outline its scientific objectives for the next 3-5 years. Please list one or two specific goals you hope to reach in 5 years.

We are simulating device-scale instabilities in existing tokamaks and in ITER. The instabilities and related phenomena we are simulating include: internal kink modes (sawtooth oscillations); neoclassical tearing modes (NTMs) and the interaction of island chains; edge localized modes (ELMs); disruption forces, runaway electrons, and heat loads during disruptions and vertical displacement events (VDEs); mass redistribution after pellet injection; and energetic particle modes.

3. Current HPC Usage and Methods

3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)

Our primary codes are M3D, M3D-C1, and NIMROD. All use MPI, although we have development versions that use OpenMP. The codes solve the MHD equations implicitly. Simulations can have up to 10^8 degrees of freedom (DOF). Most of the run time is spent solving the sparse matrix equations needed to advance to the next timestep.
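As a rough illustration of why the sparse solve dominates an implicit code, the sketch below advances a 1D diffusion equation with backward Euler. This is a much simpler stand-in for the implicit MHD systems, not an excerpt from M3D, M3D-C1, or NIMROD; the model problem, the function names, and the tridiagonal (Thomas) solver are all assumptions chosen for illustration. The point is only that every timestep requires a sparse linear solve.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm. lower[i] multiplies x[i-1] and upper[i] multiplies
    x[i+1] in row i; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def backward_euler_step(u, dt, dx):
    """One implicit step of u_t = u_xx with u = 0 at both boundaries:
    solve (I - dt * Laplacian) u_new = u_old."""
    r = dt / dx ** 2
    n = len(u)
    return solve_tridiagonal([-r] * n, [1.0 + 2.0 * r] * n, [-r] * n, u)

def run(n_interior=99, dt=0.01, steps=10):
    """Advance an initial sine profile; the sparse solve is the only real work."""
    dx = 1.0 / (n_interior + 1)
    u = [math.sin(math.pi * (i + 1) * dx) for i in range(n_interior)]
    for _ in range(steps):
        u = backward_euler_step(u, dt, dx)
    return u
```

With the default grid, dt = 0.01 is about 200 times the explicit stability limit dx^2/2 = 5e-5, which is the usual motivation for paying the cost of a linear solve per step.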

3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?

We would benefit from more memory per processor, since some of the codes use SuperLU to perform a sparse matrix factorization as a preconditioner.
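One reason a direct factorization is memory-hungry is fill-in: the L and U factors of a sparse matrix can hold many more nonzeros than the matrix itself, and how many depends on the elimination order. The toy sketch below is not SuperLU and does not use its API. It is a dense-storage LU without pivoting applied to a hypothetical "arrow" matrix (one dense row and column plus a diagonal), chosen only because its fill-in is easy to reason about; the function names are illustrative.

```python
def arrow_matrix(n, dense_first):
    """Diagonally dominant matrix with one dense row and column,
    placed first (index 0) or last (index n-1)."""
    p = 0 if dense_first else n - 1
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[p][i] = 1.0
        a[i][p] = 1.0
    for i in range(n):
        a[i][i] = float(n + 1)
    return a

def count_nonzeros(a, tol=1e-12):
    return sum(1 for row in a for v in row if abs(v) > tol)

def lu_nonzeros(a):
    """Factor a copy of `a` as L*U without pivoting (safe here because the
    matrix is diagonally dominant) and count nonzeros in the combined factors."""
    n = len(a)
    a = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            if a[i][k] == 0.0:
                continue  # nothing to eliminate in this row
            a[i][k] /= a[k][k]  # multiplier, stored in the L part
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return count_nonzeros(a)

n = 50
nnz_matrix = count_nonzeros(arrow_matrix(n, dense_first=True))
nnz_bad_order = lu_nonzeros(arrow_matrix(n, dense_first=True))
nnz_good_order = lu_nonzeros(arrow_matrix(n, dense_first=False))
```

Eliminating the dense row and column first fills the factors completely (n^2 entries), while eliminating them last produces no fill-in at all (3n - 2 entries, the same as the matrix). Real sparse direct solvers apply fill-reducing orderings for this reason, but for 3D problems the factors still typically need considerably more memory than the original matrix.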

3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions.

Facilities Used or Using

 NERSC  OLCF  ALCF  NSF Centers  Other: PPPL 

Architectures Used

 Cray XT  IBM Power  BlueGene  Linux Cluster  Other:  PPPL shared memory machine with 130G

Total Computational Hours Used per Year

3,000,000 Core-Hours

NERSC Hours Used in 2009

 2,000,000 Core-Hours

Number of Cores Used in Typical Production Run

500

Wallclock Hours of Single Typical Production Run

100-1000

Total Memory Used per Run

not known

Minimum Memory Required per Core

 not known

Total Data Read & Written per Run

 30 GB

Size of Checkpoint File(s)

0.5 GB

Amount of Data Moved In/Out of NERSC

 1 GB per day

On-Line File Storage Required (For I/O from a Running Job)

 TB and  Files

Off-Line Archival Storage Required

 1.5 TB and 30,000 Files

Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.

NA 

4. HPC Requirements in 5 Years

4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions at the workshop.

Computational Hours Required per Year

40,000,000

Anticipated Number of Cores to be Used in a Typical Production Run

10,000

Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above

500

Anticipated Total Memory Used per Run

 GB

Anticipated Minimum Memory Required per Core

 8-20 GB

Anticipated total data read & written per run

 GB

Anticipated size of checkpoint file(s)

 10 GB

Anticipated Amount of Data Moved In/Out of NERSC

 10 GB per day

Anticipated On-Line File Storage Required (For I/O from a Running Job)

 TB and  Files

Anticipated Off-Line Archival Storage Required

 20 TB and 30,000 Files
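As a consistency check on the tables in sections 3c and 4a, the sketch below converts the stated core counts and wallclock times into core-hours per run and runs per year. It assumes the "Anticipated Wallclock" entry of 500 is in hours, as in section 3c, and that a year's allocation is spent entirely on typical production runs. The function names are illustrative, and the derived figures are implications of the worksheet numbers, not values stated by the author.

```python
def core_hours_per_run(cores, wallclock_hours):
    """Cost of one production run in core-hours."""
    return cores * wallclock_hours

def runs_per_year(annual_core_hours, cores, wallclock_hours):
    """How many typical runs fit in a yearly allocation."""
    return annual_core_hours / core_hours_per_run(cores, wallclock_hours)

# Section 3c: 500 cores, 100-1000 wallclock hours, 3,000,000 core-hours per year.
current_run_cost = (core_hours_per_run(500, 100), core_hours_per_run(500, 1000))
current_runs = (runs_per_year(3_000_000, 500, 1000), runs_per_year(3_000_000, 500, 100))

# Section 4a: 10,000 cores, 500 wallclock hours (assumed), 40,000,000 core-hours per year.
future_run_cost = core_hours_per_run(10_000, 500)
future_runs = runs_per_year(40_000_000, 10_000, 500)
```

Under these assumptions a current run costs 50,000 to 500,000 core-hours (6 to 60 runs per year), while an anticipated run costs 5,000,000 core-hours, so the requested 40,000,000 core-hours would cover about 8 such runs per year.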

4b. What changes to codes, mathematical methods and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years.

Improvements in data transfer, sparse matrix solvers and their associated preconditioners, and the use of GPUs for evaluating matrix elements.

4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 1 μs).

We would benefit from more memory per core and from reduced interconnect latency.

4d. Please list any new software, services, or infrastructure support you will need over the next 5 years.

 

4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.

Not really. We rely on PETSc for our sparse solvers, and need to align our data structures with those used by PETSc. 

New Science With New Resources

To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve over the next 5 years with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?

Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).

All of our calculations of ITER-sized tokamaks are now under-resolved. We are increasing our resolution capabilities through both improved algorithms and improved hardware. Both are essential.