

FES Requirements Worksheet

1.1. Project Information - Center for Integrated Computation and Analysis of Reconnection and Turbulence

Document Prepared By

Kai Germaschewski

Project Title

Center for Integrated Computation and Analysis of Reconnection and Turbulence

Principal Investigator

Amitava Bhattacharjee

Participating Organizations

University of New Hampshire 
Dartmouth College

Funding Agencies

 DOE SC  DOE NNSA  NSF  NOAA  NIH  Other: NASA

2. Project Summary & Scientific Objectives for the Next 5 Years

Please give a brief description of your project - highlighting its computational aspect - and outline its scientific objectives for the next 3-5 years. Please list one or two specific goals you hope to reach in 5 years.

The Center for Integrated Computation and Analysis of Reconnection and Turbulence (CICART) has a dual mission in research: it seeks fundamental advances in physical understanding, and it pursues these advances through innovations in computer simulation methods and theoretical models, validated by comparison with laboratory experiments and space observations. Our research program has two elements: niche areas in the physics of magnetic reconnection and turbulence that build on past accomplishments of the CICART group and to which the group is well positioned to contribute, and the high-performance computing tools needed to address these topics. The proposed research program of CICART is organized around the following six topics:
 
MAGNETIC RECONNECTION: 
A. Reconnection and secondary instabilities in large, high-Lundquist-number plasmas 
B. Particle acceleration in the presence of multiple magnetic islands 
C. Gyrokinetic reconnection: comparison with fluid and particle-in-cell models 
TURBULENCE: 
D. Imbalanced turbulence 
E. Ion heating 
F. Turbulence in laboratory (including fusion-relevant) experiments 
 
Specific goals: (1) Understand the onset of fast reconnection and its long-time behavior, in particular the representation of kinetic effects in fluid codes through closure relations, which aids global modeling of laboratory and space plasmas. 
(2) Quantitatively predict the results of a novel regime of reconnection in high-density, laser-driven plasmas in the presence of extremely high magnetic fields. 

3. Current HPC Usage and Methods

3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)

A. The Magnetic Reconnection Code (MRC) integrates the fully compressible 3D extended MHD (XMHD) equations in arbitrary curvilinear geometry. A generalized Ohm's law includes the Hall term and the electron pressure gradient, giving rise to dispersive whistler and kinetic Alfvén waves. The equations are discretized using finite-volume / finite-difference methods on a multi-block structured computational grid. 
The MRC uses PETSc for time integration and provides the Jacobian of the right-hand side, and hence can be run explicitly (RK4/5) or implicitly (Crank-Nicolson) with Newton-direct and Newton-Krylov solvers. 
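As a rough illustration of this PETSc-based setup (a sketch, not the actual MRC source; RHSFunction, RHSJacobian, and integrate_xmhd are placeholder names), a driver of this kind might look like:

/* Minimal sketch of the PETSc TS setup described above: the XMHD
 * right-hand side and its Jacobian are supplied, and the integrator can
 * then be run explicitly (Runge-Kutta) or implicitly (Crank-Nicolson
 * with Newton-direct or Newton-Krylov solves).  Not MRC source code. */
#include <petscts.h>

extern PetscErrorCode RHSFunction(TS, PetscReal, Vec, Vec, void *);
extern PetscErrorCode RHSJacobian(TS, PetscReal, Vec, Mat, Mat, void *);

PetscErrorCode integrate_xmhd(Vec X, Mat J, PetscReal t_max)
{
  TS ts;

  TSCreate(PETSC_COMM_WORLD, &ts);
  TSSetRHSFunction(ts, NULL, RHSFunction, NULL);
  TSSetRHSJacobian(ts, J, J, RHSJacobian, NULL);
  TSSetType(ts, TSRK);        /* explicit Runge-Kutta by default ...      */
                              /* ... or TSCN for implicit Crank-Nicolson  */
  TSSetMaxTime(ts, t_max);
  TSSetFromOptions(ts);       /* -ts_type, -snes_type, -ksp_type, -pc_type
                                 switch between explicit, Newton-direct,
                                 and Newton-Krylov runs at run time       */
  TSSolve(ts, X);
  TSDestroy(&ts);
  return 0;
}

With TSSetFromOptions, the explicit/implicit choice and the linear solver (direct factorization vs. Krylov) can be selected at run time rather than at compile time.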
 
B. The Particle Simulation Code (PSC) is a fully electromagnetic, massively parallel 3D particle-in-cell (PIC) code that includes a collision operator. It offers a variety of boundary conditions (periodic, open, wall, perfectly matched layers). Time integration is explicit; fields are evolved using Faraday's and Ampère's laws. 
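As a minimal illustration of this explicit field advance (a 1-D, normalized-units sketch, not PSC source; boundary and ghost cells are assumed to be handled by the chosen boundary conditions):

/* Illustrative 1-D slice of the explicit leapfrog field update used in
 * electromagnetic PIC codes, normalized units (c = 1): E is advanced
 * with Ampere's law using the current density gathered from the
 * particles, B with Faraday's law, on a staggered (Yee-type) grid. */
#include <stddef.h>

void push_fields_1d(size_t nx, double dt, double dx,
                    double *ey, double *bz, const double *jy)
{
  /* Ampere's law: dE/dt = curl B - j   (here d(ey)/dt = -d(bz)/dx - jy) */
  for (size_t i = 1; i < nx; i++)
    ey[i] += dt * (-(bz[i] - bz[i - 1]) / dx - jy[i]);

  /* Faraday's law: dB/dt = -curl E     (here d(bz)/dt = -d(ey)/dx) */
  for (size_t i = 0; i + 1 < nx; i++)
    bz[i] -= dt * (ey[i + 1] - ey[i]) / dx;

  /* Boundary points are left to the selected boundary condition. */
}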
 
Both codes use MPI as the basis for parallelization; work on SIMD (SSE2, Cell) and SIMT (GPU) parallelism is underway and shows promising results. 
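For illustration only, the ghost-cell (halo) exchange underlying this kind of MPI-parallel structured-grid code might look as follows (a generic 1-D-decomposition sketch, not code taken from MRC or PSC):

/* Generic halo exchange for a 1-D domain decomposition: each rank swaps
 * one layer of ghost cells with its left and right neighbours before the
 * next stencil/field update.  f has nlocal interior cells plus two ghosts. */
#include <mpi.h>

void exchange_ghosts(double *f, int nlocal, MPI_Comm comm)
{
  int rank, size;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &size);

  int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
  int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

  /* f[0] and f[nlocal+1] are ghost cells; f[1..nlocal] is interior. */
  MPI_Sendrecv(&f[1],          1, MPI_DOUBLE, left,  0,
               &f[nlocal + 1], 1, MPI_DOUBLE, right, 0,
               comm, MPI_STATUS_IGNORE);
  MPI_Sendrecv(&f[nlocal],     1, MPI_DOUBLE, right, 1,
               &f[0],          1, MPI_DOUBLE, left,  1,
               comm, MPI_STATUS_IGNORE);
}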
 
 

3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?

Fully 3D cylindrical/toroidal XMHD runs currently cannot reach the resolution needed to model realistic physical regimes. Due to the geometry, explicit runs suffer from an extremely stringent CFL condition (Δt ∝ Δx^4), no scalable iterative preconditioner exists, and direct factorizations do not scale well to large parallel problems. 
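Taking the stated Δt ∝ Δx^4 restriction at face value, the cost implication for explicit 3D runs can be written out (an illustrative estimate only):

\[
  N_{\text{steps}} \;\propto\; \frac{T}{\Delta t} \;\propto\; \Delta x^{-4},
  \qquad
  \text{cost} \;\propto\; N_{\text{cells}}\, N_{\text{steps}}
              \;\propto\; \Delta x^{-3}\,\Delta x^{-4} \;=\; \Delta x^{-7},
\]

so each halving of the grid spacing multiplies the total work by roughly 2^7 ≈ 128, which is what motivates the scalable implicit integration discussed in Section 4b.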

3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions.

Facilities Used or Using

 NERSC  OLCF  ALCF  NSF Centers  Other:  

Architectures Used

 Cray XT  IBM Power  BlueGene  Linux Cluster  Other:  Cell

Total Computational Hours Used per Year

1,500,000 Core-Hours

NERSC Hours Used in 2009

 500,000 Core-Hours

Number of Cores Used in Typical Production Run

 2048

Wallclock Hours of Single Typical Production Run

 20

Total Memory Used per Run

 1024 GB

Minimum Memory Required per Core

 1 GB

Total Data Read & Written per Run

 1100 GB

Size of Checkpoint File(s)

 1024 GB

Amount of Data Moved In/Out of NERSC

 GB per  

On-Line File Storage Required (For I/O from a Running Job)

 TB and  Files

Off-Line Archival Storage Required

 TB and  Files

Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.

 

4. HPC Requirements in 5 Years

4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions at the workshop.

Computational Hours Required per Year

 50,000,000

Anticipated Number of Cores to be Used in a Typical Production Run

 50,000

Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above

 100

Anticipated Total Memory Used per Run

 50,000 GB

Anticipated Minimum Memory Required per Core

1 GB

Anticipated total data read & written per run

 55,000 GB

Anticipated size of checkpoint file(s)

50,000 GB

Anticipated Amount of Data Moved In/Out of NERSC

 1000 GB per  month

Anticipated On-Line File Storage Required (For I/O from a Running Job)

 50 TB and 1,000,000 Files

Anticipated Off-Line Archival Storage Required

100 TB and 10,000 Files

4b. What changes to codes, mathematical methods and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years.

MRC will need an algorithmically scalable implicit time integration method in order to approach realistic parameter regimes for 3D runs, that is, an efficient preconditioner for the Newton-Krylov method. We have recently implemented a multi-block grid in the underlying framework and are working on porting the discretization to the "butterfly" grid.
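As a sketch of where such a preconditioner would attach in the existing PETSc infrastructure (the field-split choice below is only one hypothetical candidate, not the solution itself):

/* Hypothetical sketch: pull the SNES/KSP out of the PETSc TS used for
 * implicit runs and attach a candidate preconditioner instead of a
 * direct factorization.  Illustrates the PETSc API only, not the
 * (yet to be developed) scalable preconditioner. */
#include <petscts.h>

PetscErrorCode setup_implicit_solver(TS ts)
{
  SNES snes;
  KSP  ksp;
  PC   pc;

  TSGetSNES(ts, &snes);           /* Newton loop for Crank-Nicolson      */
  SNESGetKSP(snes, &ksp);         /* Krylov solver for each Newton step  */
  KSPSetType(ksp, KSPFGMRES);
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCFIELDSPLIT);    /* one candidate; PCMG, PCASM, ... are
                                     other options selectable via -pc_type */
  return 0;
}

In practice the choice of preconditioner would be driven by what actually scales for the stiff whistler/Hall terms in the XMHD system.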

4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 1 μs).

4d. Please list any new software, services, or infrastructure support you will need over the next 5 years.

Parallel I/O is a major issue. 

4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.

For fluid codes, we have developed an automatic code generation framework that takes a description of the discretized equations and generates tailored code for (currently) plain C, C + SSE2, and the Cell processor, and (in the future) GPUs. 
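As a hand-written illustration of the kind of output such a framework targets (not actual generator output; the trivial update f[i] += dt * rhs[i] stands in for the real discretized equations):

/* Two variants of one trivial update, as a code generator might emit
 * them: a plain C loop and an SSE2 version processing two doubles per
 * instruction. */
#include <emmintrin.h>   /* SSE2 intrinsics */

void update_plain(int n, double dt, double *f, const double *rhs)
{
  for (int i = 0; i < n; i++)
    f[i] += dt * rhs[i];
}

void update_sse2(int n, double dt, double *f, const double *rhs)
{
  __m128d vdt = _mm_set1_pd(dt);
  int i;
  for (i = 0; i + 1 < n; i += 2) {               /* 2 doubles per step */
    __m128d vf = _mm_loadu_pd(&f[i]);
    __m128d vr = _mm_loadu_pd(&rhs[i]);
    _mm_storeu_pd(&f[i], _mm_add_pd(vf, _mm_mul_pd(vdt, vr)));
  }
  for (; i < n; i++)                             /* scalar remainder */
    f[i] += dt * rhs[i];
}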
 
For particle-in-cell, we are working on Cell and GPU implementations with promising preliminary results. However, it seems unlikely that there is a generic solution for this class of problems; architecture-specific code needs to be written. 
 

5. New Science With New Resources

To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve over the next 5 years with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?

Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).

(1) Enable the development of scaling relations in 2D for fast reconnection as a function of the Lundquist number, electron-to-ion mass ratio, plasma beta, and system size 
 
(2) Identify 3D secondary instabilities that can qualitatively alter the predictions of 2D theory for the range of plasma parameters considered in item (1) 
 
(3) Test closure relations for the pressure tensor that will enable the parameterization of kinetic effects in fluid codes 
 
(4) Test the predictive capabilities of both the PSC and the HMHD codes in quantitatively reproducing the results of magnetic reconnection experiments in a novel, laser-driven, high-density plasma regime