Frank Tsung
Case Study Worksheet
1.1. Project Information - Continuing studies of plasma based accelerators
| Document Prepared By | Frank Tsung |
|---|---|
| Project Title | Continuing studies of plasma based accelerators |
| Principal Investigator | Warren Mori |
| Participating Organizations | University of California, Los Angeles |
| Funding Agencies | DOE SC DOE NNSA NSF NOAA NIH Other: |
2. Project Summary & Scientific Objectives for the Next 5 Years
Please give a brief description of your project - highlighting its computational aspect - and outline its scientific objectives for the next 3-5 years. Please list one or two specific goals you hope to reach in 5 years.
For the past 80 years, the tool of choice in experimental high-energy physics has been the particle accelerator. The Large Hadron Collider (LHC) at CERN came online in 2008. The construction cost of the LHC machine alone is nearly 10 billion dollars, and it is clear that if the same technology is used, the world's next "atom smasher" will cost at least several times that in today's dollars. The long-term future of experimental high-energy physics research using accelerators depends on the successful development of novel ultra-high-gradient acceleration methods. New acceleration techniques using lasers and plasmas have already been shown to exhibit gradients and focusing forces more than 1000 times greater than conventional technology, raising the possibility of ultra-compact accelerators for applications in science, industry, and medicine.
In plasma-based acceleration, the Coulomb force of a particle beam or the radiation pressure of a laser beam pushes (or pulls) plasma electrons to create a plasma wake that moves at nearly the speed of light. The accelerating gradients in plasma wakefields are more than 1000 times higher than in conventional accelerators. Properly placed particles surf these wakes to ultra-high energies.
Plasma-based acceleration has been a fast-growing field owing to a combination of breakthrough experiments, parallel code development, and a deeper understanding of the underlying physics of nonlinear wake excitation in the so-called blowout regime. In a recent plasma wakefield accelerator (PWFA) experiment at SLAC, electrons in the tail of a 42 GeV electron beam were accelerated to ~80 GeV in only 80 cm. This corresponds to an energy gain of more than 40 GeV/m sustained over nearly one meter! In recent laser wakefield accelerator (LWFA) experiments at LBNL, monoenergetic electron beams at 1 GeV have been reported (1 GeV beams have also been observed in a recent experiment by scientists from UCLA and LLNL). In each case the wakefield was excited in the nonlinear regime, in which plasma electrons are radially expelled.
Additionally, in the past few years, parallel simulation tools for plasma based acceleration have been verified against each other, against experiment, and against theory.
Based on this progress in experiment, theory, and simulation, linear collider concepts using wakefields have been developed and two facilities have been approved. One facility is FACET (at SLAC), which will provide 25 GeV electron and positron beams. The other is BELLA (at LBNL), which will provide a 30 J / 30 fs laser. The goal for each facility is to experimentally test key aspects of a single cell within the collider concepts. Furthermore, there are other lasers in the US, Europe, and Asia that currently can, or soon will be able to, study LWFA experimentally in nonlinear regimes.
While some simulations will be conducted to help design and interpret near-term experiments, another goal of this proposal is to use our advanced simulation tools to study parameters in regimes that will not be accessible to those experiments. We will therefore dramatically advance the rate of discovery and progress in plasma-based accelerator research. We are the only group in the world with both a three-dimensional full particle-in-cell (PIC) code (OSIRIS) and a three-dimensional quasi-static PIC code (QuickPIC). The quasi-static algorithm reduces computer time by a factor of 100-10,000 without loss of accuracy. Because much of the physics involved in plasma-based acceleration is nonlinear, so that fluid approaches are not appropriate, PIC modeling is generally necessary. The tools and experience of our group put us in a unique position to make an impact in this field.
3. Current HPC Usage and Methods
3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)
This project uses two codes. The first is OSIRIS, a fully explicit electromagnetic (EM) PIC code that uses a finite-difference time-domain (FDTD) solver for the fields and a Boris pusher for the particles. It is used to model LWFA problems in the lab frame and the boosted frame, and some PWFA problems, primarily in (r,z) cylindrical geometry. The second is QuickPIC, a PIC code with a quasi-static field model for the grid quantities. QuickPIC is used primarily for PWFA problems, and it can also be used for LWFA problems in systems where self-injection is not an issue (self-injection is addressed later).
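For readers unfamiliar with the Boris pusher mentioned above, the following is a minimal single-particle sketch of the standard relativistic Boris scheme in plain Python. It is not taken from OSIRIS; the function name `boris_push`, the SI-like convention with momentum per unit mass u = γv, and the default c = 1 are all assumptions made for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def boris_push(x, u, E, B, q_over_m, dt, c=1.0):
    """Advance position x and u = gamma*v by one step dt (hypothetical helper)."""
    h = 0.5 * q_over_m * dt
    # First half of the electric-field kick.
    u_minus = tuple(ui + h * Ei for ui, Ei in zip(u, E))
    gamma = math.sqrt(1.0 + sum(v * v for v in u_minus) / c**2)
    # Magnetic rotation, built from the vectors t and s.
    t = tuple(h * Bi / gamma for Bi in B)
    t2 = sum(v * v for v in t)
    s = tuple(2.0 * ti / (1.0 + t2) for ti in t)
    u_prime = tuple(a + b for a, b in zip(u_minus, cross(u_minus, t)))
    u_plus = tuple(a + b for a, b in zip(u_minus, cross(u_prime, s)))
    # Second half of the electric-field kick.
    u_new = tuple(a + h * Ei for a, Ei in zip(u_plus, E))
    gamma_new = math.sqrt(1.0 + sum(v * v for v in u_new) / c**2)
    # Move the particle with the updated velocity v = u / gamma.
    x_new = tuple(xi + ui / gamma_new * dt for xi, ui in zip(x, u_new))
    return x_new, u_new
```

With E = 0 the update is a pure rotation of u, so the magnitude of the momentum is preserved to round-off. A production PIC code applies this update to every particle each step, using fields interpolated from the grid.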
3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?
Both OSIRIS and QuickPIC are known to scale well on fewer than 2,000 CPUs. OSIRIS has also recently been ported to the JUGENE computer in Germany, where >80% efficiency has been demonstrated on ~300,000 CPUs.
3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions.
| Facilities Used or Using | NERSC OLCF ALCF NSF Centers Other: |
|---|---|
| Architectures Used | Cray XT IBM Power BlueGene Linux Cluster Other: |
| Total Computational Hours Used per Year | Core-Hours |
| NERSC Hours Used in 2009 | Core-Hours |
| Number of Cores Used in Typical Production Run | |
| Wallclock Hours of Single Typical Production Run | |
| Total Memory Used per Run | GB |
| Minimum Memory Required per Core | GB |
| Total Data Read & Written per Run | GB |
| Size of Checkpoint File(s) | GB |
| Amount of Data Moved In/Out of NERSC | GB per |
| On-Line File Storage Required (For I/O from a Running Job) | GB and Files |
| Off-Line Archival Storage Required | GB and Files |
Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.
4. HPC Requirements in 5 Years
4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions at the workshop.
| Computational Hours Required per Year | |
|---|---|
| Anticipated Number of Cores to be Used in a Typical Production Run | |
| Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above | |
| Anticipated Total Memory Used per Run | GB |
| Anticipated Minimum Memory Required per Core | GB |
| Anticipated total data read & written per run | GB |
| Anticipated size of checkpoint file(s) | GB |
| Anticipated On-Line File Storage Required (For I/O from a Running Job) | GB and Files |
| Anticipated Amount of Data Moved In/Out of NERSC | GB per |
| Anticipated Off-Line Archival Storage Required | GB and Files |
4b. What changes to codes, mathematical methods and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years.
We feel that multi-core architectures will be important; our work on GPUs is discussed later.
4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 3 μs).
No.
4d. Please list any new software, services, or infrastructure support you will need over the next 5 years.
better parallel I/O,
either on-site visualization or higher speed connection to
4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.
High Performance Computing (HPC) has been dominated for the last 15 years by distributed-memory parallel computers and the Message-Passing Interface (MPI) programming paradigm. The computational nodes have been relatively simple, with only a few processing cores per node. This computational model appears to be reaching a limit, with several hundred thousand simple cores in the IBM Blue Gene. The future computational paradigm will likely consist of much more complex nodes, such as Graphics Processing Units (GPUs) or Cell processors, which can have hundreds of processing cores, with different and still-evolving programming paradigms, such as CUDA. One anticipates that new HPC computers, unlike Blue Gene, will consist of a relatively small number (<1,000) of nodes, each of which will contain hundreds of cores. Obtaining high performance on the node will in most cases require new algorithms. Between nodes, however, it is likely that MPI will continue to be effective.
New Science With New Resources
To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve over the next 5 years with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?
Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).
The challenge of laser wakefield accelerator modeling lies in the scale gap between the laser wavelength (about 1 micron) and the plasma length (from a few millimeters up to several meters). Phenomenological models indicate that increasingly longer plasma columns are required to achieve higher efficiencies and thus higher final output energies of the accelerated particles. As an example, for a 300 J laser, expected to be available within the next few years, up to 5 meters of plasma are needed to reach output beams of 40 GeV. Simulating these scenarios is now possible with the Boosted Frame scheme, which strongly reduces the computational requirements. Although the scheme yields large computational savings, simulating the next generation of lasers is still very demanding, and complete parameter scans have not yet been performed.
Increased HPC resources will not only enable the parameter scans necessary for effective experiment design and interpretation, but will also open the possibility of simulating more advanced regimes and configurations. These computational experiments would focus on reaching higher-quality output beams (smaller emittance and energy spread) and even higher output energies.
As mentioned above, the challenge lies in the scale gap, which is reflected in the large number of algorithm iterations. Therefore, more CPU hours are the most important resource for this project.
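To make the scale gap concrete, the short estimate below counts the time steps a lab-frame explicit simulation would need for the laser to cross the plasma. It is only an order-of-magnitude sketch: the function name `lab_frame_steps`, the resolution of 20 cells per laser wavelength, and the choice of one cell crossed per time step (roughly the Courant limit of an explicit solver) are assumptions, not parameters of our production runs.

```python
def lab_frame_steps(plasma_length_m, laser_wavelength_m, cells_per_wavelength=20):
    """Rough number of time steps for light to cross the plasma in the lab frame.

    Assumes the laser wavelength is resolved with `cells_per_wavelength` cells
    and that light advances about one cell per step (hypothetical values).
    """
    dz = laser_wavelength_m / cells_per_wavelength  # longitudinal cell size
    return plasma_length_m / dz

# 5 m of plasma and a 1 micron laser wavelength, as in the example above.
steps = lab_frame_steps(5.0, 1.0e-6)
print(f"about {steps:.1e} time steps")
```

Under these assumptions the 5 m case requires on the order of 10^8 time steps, before any parameter scan is considered, which is why the iteration count rather than memory dominates the cost.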


