

FES Requirements Worksheet

1.1. Project Information - Large Scale Particle-in-Cell Simulations of Laser Plasma Interactions Relevant to Inertial Fusion Energy

Document Prepared By

Frank Tsung

Project Title

Large Scale Particle-in-Cell Simulations of Laser Plasma Interactions Relevant to Inertial Fusion Energy

Principal Investigator

Frank Tsung

Participating Organizations

UCLA

Funding Agencies

 DOE SC  DOE NNSA  NSF  NOAA  NIH  Other:

2. Project Summary & Scientific Objectives for the Next 5 Years

Please give a brief description of your project - highlighting its computational aspect - and outline its scientific objectives for the next 3-5 years. Please list one or two specific goals you hope to reach in 5 years.

The goal of this project is to use state-of-the-art particle-in-cell tools (such as OSIRIS and UPIC) to study parametric instabilities under conditions relevant to inertial fusion energy (IFE). These instabilities can absorb, deflect, or reflect the laser, and generate hot electrons which can degrade compression. However, it is not enough simply to eliminate these interactions, because in some exotic schemes, such as shock ignition, the fast electrons create a shock which can trigger ignition and enhance gain. It is therefore critical to gain a thorough understanding of these instabilities rather than simply eliminating them. Because of the highly nonlinear nature of these instabilities (which involve interactions between waves and particles as well as between waves and other waves), particle-in-cell codes, which are based on first principles, are best suited to study them.
 
The UCLA computer simulation group has a long history of expertise in particle-in-cell simulations as well as parallel computing. In the past few years, we have applied this expertise to the study of laser plasma interactions. Some of our past accomplishments include:
 
(i) Used the parallel PIC code OSIRIS to observe (for the first time) the high frequency hybrid instability (HFHI).

(ii) Identified the importance of convective modes in two-plasmon decay.

(iii) Shown the importance of plasma wave convection in the recurrence of SRS.

(iv) Found that multi-dimensional plasma waves become localized due to wave-particle effects even in the absence of plasma wave self-focusing.
 
With NIF (the National Ignition Facility) coming online, this is the perfect time to apply both the expertise of the UCLA group and the HPC resources of NERSC to study the various laser plasma interactions (LPIs) that can occur under IFE-relevant conditions. In the next 3-5 years, we plan to tackle the following problems at NERSC:
 
(i) 2D simulations of SRS involving multiple speckles or multiple laser beams.

(ii) Effects of overlapping laser beams on the two-plasmon/HFHI instabilities near the quarter-critical surface.

(iii) Two-dimensional studies of the SRS/2ωp instability under shock-ignition-relevant conditions.

3. Current HPC Usage and Methods

3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)

OSIRIS is a fully explicit, multi-dimensional, fully relativistic, parallelized PIC code. It is written in Fortran 95 and takes advantage of advanced object-oriented programming techniques. This compartmentalization allows for a highly optimized core code and simplifies modifications while maintaining full parallelization, which is done using domain decomposition with MPI. There are 1D, 2D, and 3D versions that can be selected at compile time. In addition, one of OSIRIS's strongest attributes is its sophisticated array of diagnostic and visualization packages with interactive GUIs that can rapidly process large datasets (c.f. visualization section). These tools can also be used to analyze data generated from our PIC codes.
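To make the domain-decomposition idea concrete, here is a minimal, illustrative 1D sketch in Python with mpi4py. It is not OSIRIS code (OSIRIS is Fortran 95 + MPI), and a production code would exchange particles only with nearest-neighbor domains; the object-based all-to-all and all numerical values below are assumptions chosen for brevity.

```python
# Minimal 1D domain-decomposition sketch (illustrative only, not OSIRIS).
# Each rank owns a slab of a periodic box, advances its particles, and then
# routes particles that left the slab to the rank that now owns them.
# Run with e.g.: mpirun -n 4 python domain_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

L = 1.0                          # total (periodic) box length, arbitrary units
dx_rank = L / size               # width of the slab owned by each rank

rng = np.random.default_rng(rank)
x = (rank + rng.random(10000)) * dx_rank     # positions inside the local slab
v = rng.standard_normal(10000)               # toy velocities

def push_and_exchange(x, v, dt):
    """Advance positions, then route each particle to the rank owning its slab."""
    x = (x + v * dt) % L
    owner = np.minimum((x // dx_rank).astype(int), size - 1)
    outgoing = [(x[owner == r], v[owner == r]) for r in range(size)]
    incoming = comm.alltoall(outgoing)       # one (x, v) bundle from each rank
    x = np.concatenate([xi for xi, _ in incoming])
    v = np.concatenate([vi for _, vi in incoming])
    return x, v

x, v = push_and_exchange(x, v, dt=1e-3)
```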
 
Recently, we have added dynamic load balancing, perfectly matched layer absorbing boundary conditions [vay:02], and an optimized version of higher-order particle shapes [esirkepov:01]. The use of higher-order shape functions combined with current smoothing and compensation can dramatically reduce numerical heating and improve energy conservation without modifying the dispersion relation of plasma waves.
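As a rough illustration of what a higher-order particle shape does, the sketch below deposits charge in 1D with a quadratic (3-point) B-spline, so each particle contributes to three neighboring cells instead of two. This is not the optimized OSIRIS kernel, it omits the current smoothing and compensation mentioned above, and the grid size and particle count are arbitrary.

```python
# Illustrative 1D charge deposition with a quadratic-spline particle shape.
import numpy as np

def deposit_quadratic(x, q, nx, dx):
    """Deposit charge q at positions x onto a periodic grid of nx cells
    using the quadratic (3-point) B-spline shape function."""
    rho = np.zeros(nx)
    xg = x / dx                      # position in grid units
    i0 = np.rint(xg).astype(int)     # nearest grid point
    d = xg - i0                      # offset from that point, in [-0.5, 0.5]
    w_left  = 0.5 * (0.5 - d) ** 2   # weights sum to 1 for every particle
    w_mid   = 0.75 - d ** 2
    w_right = 0.5 * (0.5 + d) ** 2
    for shift, w in ((-1, w_left), (0, w_mid), (1, w_right)):
        np.add.at(rho, (i0 + shift) % nx, q * w)
    return rho / dx

# Example: 10,000 unit-charge particles on a 128-cell periodic grid
rng = np.random.default_rng(0)
nx, dx = 128, 1.0
x = rng.random(10000) * nx * dx
rho = deposit_quadratic(x, q=1.0, nx=nx, dx=dx)
```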
 
OSIRIS also has packages for including physics beyond the standard PIC algorithm. These include tunnel and impact ionization as well as a binary collision operator. There are two field ionization models, the ADK model and a simple barrier suppression model. These algorithms could also be used to model electron-positron pair creation.

Due to the presence of the grid (cells), particles in PIC codes have a finite size, and collisions are therefore modified from point-particle collisions, especially when the impact parameter is comparable to the cell size, typically a Debye length. For smaller impact parameters, the effects of collisions are greatly reduced in PIC codes. In order to study the effects of collisions at absolute, rather than normalized, plasma densities and temperatures, it is therefore useful to explicitly add a Coulomb collision model to the PIC algorithm. We have implemented a binary collision module for OSIRIS using both the method of T. Takizuka and H. Abe [takizuka:77] and that of Nanbu [nanbu:97]. We have generalized these methods to relativistic temperatures and extended them to handle particles of different weights (useful, for instance, in a density gradient). The algorithm has been tested by comparing the relaxation times obtained from simulations of a two-species plasma out of equilibrium, and it has also been extensively tested to guarantee that the proper Jüttner distribution functions are reached in equilibrium at relativistic temperatures.
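For readers unfamiliar with the Takizuka-Abe approach, the sketch below shows the basic pairing-and-scattering step in a deliberately simplified form: non-relativistic, single species, equal masses and equal weights, with all physical constants (density, charges, Coulomb logarithm, etc.) folded into one assumed collisionality parameter. The OSIRIS module generalizes this to relativistic temperatures and unequal weights.

```python
# Highly simplified, non-relativistic sketch of Takizuka-Abe-style binary
# collisions for one species with equal masses and weights (not OSIRIS code).
import numpy as np

def collide_cell(v, nu_dt, rng):
    """Scatter randomly paired particles within one cell.

    v      : (N, 3) array of particle velocities in the cell
    nu_dt  : lumped, dimensionless collisionality ~ (collision freq.) * dt,
             absorbing density, charges, Coulomb log, and constants
    """
    n = len(v) - len(v) % 2                     # pair up an even number
    pairs = rng.permutation(len(v))[:n].reshape(-1, 2)
    for a, b in pairs:
        u = v[a] - v[b]                         # relative velocity
        umag = np.linalg.norm(u)
        if umag == 0.0:
            continue
        # delta = tan(theta/2), Gaussian with variance ~ nu_dt / u^3
        delta = rng.normal(0.0, np.sqrt(nu_dt / umag**3))
        sin_t = 2 * delta / (1 + delta**2)
        one_m_cos = 2 * delta**2 / (1 + delta**2)
        phi = rng.uniform(0.0, 2 * np.pi)
        uperp = np.hypot(u[0], u[1])
        if uperp > 0:                           # rotate u by (theta, phi)
            du = np.array([
                (u[0]/uperp)*u[2]*sin_t*np.cos(phi) - (u[1]/uperp)*umag*sin_t*np.sin(phi) - u[0]*one_m_cos,
                (u[1]/uperp)*u[2]*sin_t*np.cos(phi) + (u[0]/uperp)*umag*sin_t*np.sin(phi) - u[1]*one_m_cos,
                -uperp*sin_t*np.cos(phi) - u[2]*one_m_cos,
            ])
        else:                                   # u along z: rotation simplifies
            du = np.array([umag*sin_t*np.cos(phi), umag*sin_t*np.sin(phi), -u[2]*one_m_cos])
        v[a] += 0.5 * du                        # equal masses and weights:
        v[b] -= 0.5 * du                        # momentum and energy conserved
    return v

rng = np.random.default_rng(0)
v = rng.standard_normal((1000, 3))
v = collide_cell(v, nu_dt=1e-3, rng=rng)
```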
 
The code is highly optimized on a single processor, scales very efficiently on massively parallel computers, and is easily portable between different compilers and hardware architectures. To date, it has been ported to Intel, AMD, IBM PowerPC, and BlueGene processors running a large variety of operating systems (Mac OS X, AIX, Linux, among others). On each of these platforms, the parallel scalability has been good regardless of the network configuration. On the Atlas machine at LLNL, 80% efficiency was achieved on 4,096 CPUs for a fixed-size problem (strong scaling) with significant communication overhead (only 512x512x256 cells and 1 billion particles). More recently, OSIRIS was ported to the Argonne BlueGene Intrepid cluster (8,192 quad-core nodes, 32,768 processors - www.alcf.anl.gov). The code is 97% efficient on 32,768 CPUs with weak scaling and 86% efficient with strong scaling.
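For reference, the quoted percentages follow the usual definitions of strong- and weak-scaling efficiency; the timings in the snippet below are hypothetical and serve only to show the arithmetic, not measured OSIRIS numbers.

```python
# Hypothetical timings only -- illustrating how scaling efficiencies are defined.
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Fixed total problem size: ideal run time scales as 1/N cores."""
    return (t_ref * n_ref) / (t_n * n)

def weak_scaling_efficiency(t_ref, t_n):
    """Problem size grows with core count: ideal run time stays constant."""
    return t_ref / t_n

# e.g. a run taking 100 s on 4,096 cores and 15.6 s on 32,768 cores
print(strong_scaling_efficiency(t_ref=100.0, n_ref=4096, t_n=15.6, n=32768))  # ~0.80
```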
 
Another code, UPIC, developed by Dr. Viktor Decyk of the UCLA simulation group, is being used as a testbed for the GPU platform. The UCLA Parallel PIC Framework (UPIC) is a unified environment for the rapid construction of new parallel PIC codes. It provides trusted components from UCLA's long history of PIC development, in an easily accessible form, as well as a number of sample main codes to illustrate how to build various kinds of codes. UPIC contains support for electrostatic, Darwin, and fully electromagnetic plasma models, as well as relativistic particles.
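As a rough illustration of the kind of field solve an electrostatic PIC code performs each step, the sketch below solves the 1D periodic Poisson equation with FFTs. This is not UPIC's actual interface or data layout; the grid parameters and the analytic check are assumptions for the example.

```python
# Illustrative FFT-based periodic Poisson solve for an electrostatic PIC step.
import numpy as np

def poisson_periodic_1d(rho, dx):
    """Solve d^2(phi)/dx^2 = -rho on a periodic grid; return phi and E = -dphi/dx."""
    nx = len(rho)
    k = 2 * np.pi * np.fft.fftfreq(nx, d=dx)      # angular wavenumbers
    rho_k = np.fft.fft(rho)
    phi_k = np.zeros_like(rho_k)
    nonzero = k != 0                              # drop the mean (gauge choice)
    phi_k[nonzero] = rho_k[nonzero] / k[nonzero] ** 2   # -k^2 phi_k = -rho_k
    e_k = -1j * k * phi_k                         # E = -d(phi)/dx
    return np.fft.ifft(phi_k).real, np.fft.ifft(e_k).real

# Quick check on a single mode: rho = cos(k0 x) gives phi = cos(k0 x) / k0^2
nx, dx = 256, 0.1
x = dx * np.arange(nx)
k0 = 2 * np.pi / (nx * dx)
phi, E = poisson_periodic_1d(np.cos(k0 * x), dx)
assert np.allclose(phi, np.cos(k0 * x) / k0**2, atol=1e-10)
```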

3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?

OSIRIS scales at better than 60% efficiency on more than 64k cores of the Cray XT5 Jaguar, so we see no significant bottlenecks at this point.

3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions.

Facilities Used or Using

 NERSC  OLCF  ALCF  NSF Centers  Other:  LLNL/Atlas

Architectures Used

 Cray XT  IBM Power  BlueGene  Linux Cluster  Other:  

Total Computational Hours Used per Year

 3,250,000 Core-Hours

NERSC Hours Used in 2009

 0 Core-Hours

Number of Cores Used in Typical Production Run

2048

Wallclock Hours of Single Typical Production Run

100

Total Memory Used per Run

 1200 GB

Minimum Memory Required per Core

 0.6 GB

Total Data Read & Written per Run

 4000 GB

Size of Checkpoint File(s)

 1200 GB

Amount of Data Moved In/Out of NERSC

 GB per  

On-Line File Storage Required (For I/O from a Running Job)

 TB and  Files

Off-Line Archival Storage Required

 TB and  Files

Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.

 

4. HPC Requirements in 5 Years

4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions at the workshop.

Computational Hours Required per Year

40,000,000

Anticipated Number of Cores to be Used in a Typical Production Run

100,000

Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above

400

Anticipated Total Memory Used per Run

 600,000 GB

Anticipated Minimum Memory Required per Core

 6 GB

Anticipated total data read & written per run

 500,000 GB

Anticipated size of checkpoint file(s)

 600,000 GB

Anticipated Amount of Data Moved In/Out of NERSC

 600,000 GB per month

Anticipated On-Line File Storage Required (For I/O from a Running Job)

 TB and  Files

Anticipated Off-Line Archival Storage Required

 TB and  Files

4b. What changes to codes, mathematical methods and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years.

In order to perform the calculations described here, subcycling of ions may be required to save CPU time.
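As a toy illustration of what ion subcycling means, the runnable sketch below pushes electrons every step but pushes the much heavier ions only every nsub steps with a correspondingly larger time step. It is schematic Python, not OSIRIS code; the frozen sinusoidal field, the subcycling interval, and all parameters are assumptions chosen so the example runs on its own (a real PIC loop would deposit current and advance the fields self-consistently).

```python
# Toy, runnable sketch of ion subcycling (not OSIRIS code).
import numpy as np

nx, dx, dt, nsteps, nsub = 64, 1.0, 0.05, 200, 10   # nsub = 10 is an assumed value
mass_e, mass_i = 1.0, 1836.0
rng = np.random.default_rng(1)

def E_at(x):
    """Frozen toy field, standing in for a real field solve."""
    return 0.1 * np.sin(2 * np.pi * x / (nx * dx))

def push(x, v, qm, dt):
    """Accelerate in the toy field, then drift, on a periodic box."""
    v = v + qm * E_at(x) * dt
    x = (x + v * dt) % (nx * dx)
    return x, v

xe, ve = rng.random(5000) * nx * dx, rng.standard_normal(5000)
xi, vi = rng.random(5000) * nx * dx, np.zeros(5000)

for step in range(nsteps):
    xe, ve = push(xe, ve, qm=-1.0 / mass_e, dt=dt)        # electrons every step
    if step % nsub == 0:                                  # ions subcycled
        xi, vi = push(xi, vi, qm=+1.0 / mass_i, dt=nsub * dt)
```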

4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 1 μs).

The OSIRIS code has shown excellent scaling on more than 100,000 cores, and we do not expect any new architectural requirements in the near future. One complication is that for simulations whose state exceeds 100 TB, it may be impossible to checkpoint, so queueing policies will need to change.

4d. Please list any new software, services, or infrastructure support you will need over the next 5 years.

Due to the large memory requirements of future simulations, a higher-bandwidth file server for I/O and checkpointing will be needed.

4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.

Viktor Decyk of our group has ported his code UPIC to the GPU. This work, which relies on streaming of data, will also improve performance on other advanced architectures. 
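The sketch below illustrates the data-layout idea behind streaming PIC on GPUs and other bandwidth-limited architectures: keeping particles grouped by the cell or tile they occupy, so field gathers and current deposits touch contiguous memory. It is a conceptual illustration only, not Dr. Decyk's GPU implementation, and the tile size and arrays are assumptions for the example.

```python
# Conceptual sketch of tile-sorted particle storage for data streaming.
import numpy as np

def sort_particles_by_tile(x, vx, dx, cells_per_tile):
    """Reorder particle data by tile index; return data plus tile offsets."""
    tile = (x // (cells_per_tile * dx)).astype(int)   # tile of each particle
    order = np.argsort(tile, kind="stable")           # group tile by tile
    tile_sorted = tile[order]
    # offsets[i]:offsets[i+1] indexes the particles of tile i after reordering
    offsets = np.searchsorted(tile_sorted, np.arange(tile_sorted.max() + 2))
    return x[order], vx[order], offsets

rng = np.random.default_rng(2)
x, vx = rng.random(100000) * 64.0, rng.standard_normal(100000)
xs, vs, offsets = sort_particles_by_tile(x, vx, dx=1.0, cells_per_tile=4)
x_tile3 = xs[offsets[3]:offsets[4]]   # particles of tile 3, now contiguous
```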

5. New Science With New Resources

To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve over the next 5 years with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?

Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).

With a one-order-of-magnitude increase, we can study the effects of multiple (in this case, more than two) beams on the excitation of SRS/2ωp instabilities in NIF-relevant regimes. With a two-order-of-magnitude increase, however, we could finally perform full 3D simulations of parametric instabilities using parameters relevant to NIF. Higher-dimensional effects, such as side loss or wave-front bending, could then be investigated in full 3D geometry.