Panagiotis Spentzouris
HEP Case Study Worksheet
1.1. Project Information - Community Petascale Project for Accelerator Science and Simulation (ComPASS)
Document Prepared By | Panagiotis Spentzouris
Project Title | Community Petascale Project for Accelerator Science and Simulation (ComPASS)
Principal Investigator | Panagiotis Spentzouris
Participating Organizations | Argonne National Laboratory,
Funding Agencies | DOE SC, DOE NSA, NSF, NOAA, NIH, Other:
2. Project Summary & Scientific Objectives for the Next 5 Years
Please give a brief description of your project - highlighting its computational aspect - and outline its scientific objectives for the next 3-5 years. Please list one or two specific goals you hope to reach in 5 years.
Particle accelerators are critical to scientific discovery, both in the DOE program in the United States and worldwide. The development and optimization of accelerators is essential for advancing our understanding of the fundamental properties of matter, energy, space, and time, and for enabling research in materials sciences, chemistry, geosciences, and aspects of biosciences.
The High Energy Physics (HEP) program uses accelerators to answer fundamental questions about nature such as the origin of mass and the asymmetry between matter and antimatter and to search for new particles, new symmetries, and possible extra dimensions of space. In the DOE 15-year plan for HEP, the first two action items call for full support of the program of the Large Hadron Collider (LHC) at CERN and for the establishment of leadership in the R&D effort to design and build the proposed International Linear Collider (ILC) on U.S. soil. Even with the current HEP budget difficulties, the recent report of the Particle Physics Project Prioritization Panel (P5) emphasizes in its recommendations the need to maintain leadership in both the energy and the intensity frontier of accelerator science. At the same time, it is imperative to maximize the physics reach of the ongoing DOE/HEP program, and that involves the performance optimization of the Fermilab Tevatron. Furthermore, DOE/HEP is supporting a world-class R&D program to develop new accelerator technologies including laser wakefield and plasma wakefield accelerators, as well as other types of advanced accelerator concepts.
Under SciDAC1, AST, the predecessor project to ComPASS, produced a powerful suite of parallel simulation tools representing a paradigm shift in computational accelerator science. Simulations that used to take weeks or more now take hours, and simulations once thought impossible are now performed routinely. Many of these successful applications used NERSC facilities, and their development benefited from the NERSC infrastructure.
Because of the complexity, precision, and beam intensity requirements of next generation accelerators, our paradigm has to change from single machine, single-component simulations to end-to-end, multi-physics simulations. In FY09, ComPASS will continue to develop applications in a comprehensive, integrated accelerator simulation environment. These applications include large-scale electromagnetic modeling of SRF cavities (ILC design) for the Fermilab proton driver (Project-X) design, with realistic cavity shapes and misalignments; assessment of the impact of wakefields on beam dynamics; and multiphysics, multi-bunch modeling of the Fermilab Main Injector and Booster, for performance optimization under the current and Project-X operating conditions. We will also focus on design optimization of accelerator components with complicated geometries such as the LHC crab cavity, which includes couplers with very fine features. We will also perform beam-beam and electron-cloud simulations to help understand and optimize LHC machine performance. Our applications emphasize the interaction of beam dynamics and electromagnetics codes. In addition, the project will assist the development of advanced accelerator concepts. We will provide real-time or near-real-time feedback between simulation and advanced accelerator experiments.
3. Current HPC Usage and Methods
3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)
The ComPASS project funds the development of three broad categories of codes: (a) machine design and optimization, (b) component design and optimization, and (c) support of new accelerator techniques and technologies. The last two categories are covered in the other case study sheets, so here we focus on (a). The main codes are Synergia and ML/Impact (both multi-physics frameworks), and BeamBeam3D and NIMZOVICH (single-purpose codes). The codes use an electrostatic particle-in-cell model with structured grids, with different strategies and solver implementations:
a. Decomposition. Depending on the physics of the problem, the codes may use domain decomposition, particle decomposition, or hybrid decomposition. There may be communication of particle data, grid data, or both. Particle movement between Poisson solves may be slight or large, so some codes use a particle manager and others do not.
b. Solvers. Our codes use spectral, finite-difference, and hybrid discretizations with FFT-based and multigrid-based solvers.
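As an illustration of the spectral approach mentioned in (b), the following is a minimal sketch of a Fourier-space Poisson solve on a 1D periodic grid. It is not taken from any of the codes named above: the function names, the periodic boundary condition, the choice of units (epsilon_0 = 1), and the naive O(N^2) transform standing in for an FFT library are all assumptions made to keep the example self-contained.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for j, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * j / n)
        # The inverse transform carries the 1/N normalisation.
        out.append(acc / n if inverse else acc)
    return out


def solve_poisson_periodic(rho, length):
    """Solve d2(phi)/dx2 = -rho on a periodic 1D grid (units with epsilon_0 = 1)."""
    n = len(rho)
    rho_hat = dft(rho)
    # The k = 0 mode is left at zero, which fixes the arbitrary constant in phi.
    phi_hat = [0j] * n
    for m in range(1, n):
        # Map the DFT index to a signed mode number, then to a wavenumber.
        mode = m if m <= n // 2 else m - n
        k = 2.0 * math.pi * mode / length
        phi_hat[m] = rho_hat[m] / (k * k)
    return [v.real for v in dft(phi_hat, inverse=True)]


# Usage: a single cosine mode of charge density on a grid of 16 points.
n_points = 16
box_length = 2.0 * math.pi
density = [math.cos(2.0 * math.pi * j / n_points) for j in range(n_points)]
potential = solve_poisson_periodic(density, box_length)
```

With a box of length 2*pi the chosen mode has wavenumber 1, so the potential should reproduce the density itself. In a particle-in-cell step a solve of this kind would sit between charge deposition onto the grid and interpolation of the field back to the particles.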
Depending on the type of algorithm, we have different grid size limitations (due to memory requirements): a typical large grid is 1024^3 for the first scheme (both particles and grids distributed) and 256^3 for the second (just grid). This results in requirements of 10M to 100M macroparticles, depending on the type of simulation. Parallelism is expressed using MPI.
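To give a feel for the memory figures behind these grid sizes, the sketch below estimates storage for one scalar grid field and for the macroparticle arrays. The assumptions are ours and not stated in the worksheet: double precision (8 bytes per value), one scalar field per grid, and six phase-space coordinates per macroparticle.

```python
BYTES_PER_DOUBLE = 8  # assumption: double-precision storage
GIB = 2 ** 30


def grid_bytes(points_per_side, fields=1):
    """Bytes needed for `fields` scalar fields on a cubic grid."""
    return points_per_side ** 3 * fields * BYTES_PER_DOUBLE


def particle_bytes(n_particles, coords=6):
    """Bytes needed for n_particles, each with `coords` phase-space coordinates."""
    return n_particles * coords * BYTES_PER_DOUBLE


for side in (256, 1024):
    print(f"{side}^3 grid, one scalar field: {grid_bytes(side) / GIB:.3f} GiB")
for count in (10_000_000, 100_000_000):
    print(f"{count} macroparticles: {particle_bytes(count) / GIB:.2f} GiB")
```

Under these assumptions a single 1024^3 scalar field occupies 8 GiB in total, compared with 0.125 GiB for a 256^3 field; real codes hold additional work arrays, so these numbers are lower bounds.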
3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?
Support for shared libraries (needed by framework applications).
3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions.
Facilities Used or Using | NERSC, OLCF, ALCF, NSF Centers, Other: Local Development Clusters
Architectures Used | Cray XT, IBM Power, BlueGene, Linux Cluster, Other:
Total Computational Hours Used per Year | 10M Core-Hours
NERSC Hours Used in 2009 | 2.5M Core-Hours
Number of Cores Used in Typical Production Run | 1k-10k (application dependent)
Wallclock Hours of Single Typical Production Run | 48
Total Memory Used per Run | 1.5 to 16 GB
Minimum Memory Required per Core | 0.5 to 2 GB
Total Data Read & Written per Run | 100 GB
Size of Checkpoint File(s) | 1 GB
Amount of Data Moved In/Out of NERSC | 100 GB per week
On-Line File Storage Required (For I/O from a Running Job) | 1 GB and 100000 Files
Off-Line Archival Storage Required | 10 GB and 200000 Files
Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.
Functional parallel HDF5; shared-library versions of standard libraries such as FFTW, HDF5, etc.
4. HPC Requirements in 5 Years
4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions at the workshop.
Computational Hours Required per Year | 3M
Anticipated Number of Cores to be Used in a Typical Production Run | 10k-100k
Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above | 48-72
Anticipated Total Memory Used per Run | 2 to 32 GB
Anticipated Minimum Memory Required per Core | 0.5 to 2 GB
Anticipated Total Data Read & Written per Run | 200 GB
Anticipated Size of Checkpoint File(s) | 2 GB
Anticipated On-Line File Storage Required (For I/O from a Running Job) | 2 GB and 10000 Files
Anticipated Amount of Data Moved In/Out of NERSC | 200 GB per week
Anticipated Off-Line Archival Storage Required | 20 GB and 200000 Files
4b. What changes to codes, mathematical methods and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years.
Algorithmic improvements to allow longer time steps; improvements in solver strong scaling.
4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 3 μs).
2 GB memory per core or more.
4d. Please list any new software, services, or infrastructure support you will need over the next 5 years.
Shared libraries; better queue organization for development and test jobs (to reduce dependence on local development resources). Ensemble runs driven by a workflow for design parameter optimization will require support for error detection and recovery.
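The kind of error detection and recovery meant here can be sketched as a driver that retries failed ensemble members and records the ones that never succeed. This is only an illustration: `run_ensemble`, `make_flaky_simulation`, the `strength` parameter, and the toy figure of merit are hypothetical and do not correspond to any existing workflow tool or to our codes.

```python
def run_ensemble(simulate, parameter_sets, max_attempts=3):
    """Run simulate(params) for each ensemble member, retrying failed members."""
    results, failures = {}, {}
    for index, params in enumerate(parameter_sets):
        for attempt in range(1, max_attempts + 1):
            try:
                results[index] = simulate(params)
                failures.pop(index, None)  # recovered: clear any earlier failure
                break
            except RuntimeError as error:  # error detection
                failures[index] = f"attempt {attempt}: {error}"
    return results, failures


def make_flaky_simulation():
    """Hypothetical stand-in for a simulation job; fails once for strength == 2."""
    already_failed = set()

    def simulate(params):
        strength = params["strength"]
        if strength == 2 and strength not in already_failed:
            already_failed.add(strength)
            raise RuntimeError("simulated node failure")
        if strength < 0:
            raise RuntimeError("unphysical parameter")
        return (strength - 1.5) ** 2  # toy figure of merit to minimise

    return simulate


ensemble = [{"strength": s} for s in (-1, 1, 2, 3)]
results, failures = run_ensemble(make_flaky_simulation(), ensemble)
best_member = min(results, key=results.get)
print("completed members:", sorted(results))
print("failed members:", failures)
print("best member:", ensemble[best_member])
```

In this toy run the transient failure is recovered on the second attempt, while the member with an invalid parameter is reported as failed after all attempts, so the optimization can continue with the members that completed.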
4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.
We are starting a research program to understand how to use GPUs effectively. Our applications (for machine design and optimization) have two main components: particle tracking and field solves. Our efforts to date have demonstrated that we can do efficient tracking with high-order optics on GPUs. We are investigating field solves on GPUs and hybrid schemes involving a mixture of conventional processors and GPUs. We will need more information on the architecture of future machines incorporating GPUs in order to design efficient multi-level parallelism schemes.
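To show the structure of the tracking component, here is a minimal sketch in plain Python, not our GPU implementation. It assumes a toy one-turn map made of a thin second-order (sextupole-like) kick followed by a linear rotation in one phase-space plane; the function names, the tune value, and the kick strength are illustrative assumptions. Without collective effects each particle is updated independently of the others, which is the property that suggests a one-particle-per-thread mapping.

```python
import math


def track_turn(x, px, tune, sextupole_strength):
    """One turn: thin second-order kick, then a linear phase-space rotation."""
    px = px + sextupole_strength * x * x  # nonlinear (second-order) kick
    angle = 2.0 * math.pi * tune
    c, s = math.cos(angle), math.sin(angle)
    return c * x + s * px, -s * x + c * px  # linear optics as a rotation


def track(particles, turns, tune=0.31, sextupole_strength=0.0):
    """Track a list of (x, px) pairs; each particle is updated independently."""
    for _ in range(turns):
        particles = [
            track_turn(x, px, tune, sextupole_strength) for x, px in particles
        ]
    return particles


# Usage: track two particles for 100 turns with a weak nonlinear kick.
bunch = [(1.0e-3, 0.0), (5.0e-4, -2.0e-4)]
final = track(bunch, turns=100, sextupole_strength=0.5)
```

With the kick switched off the map is a pure rotation, so x^2 + px^2 is conserved for every particle; this gives a simple correctness check before higher-order terms are added.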
New Science With New Resources
To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve over the next 5 years with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?
Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).
Access to 50X resources will allow us to:
a) deploy multi-scale, multi-physics beam dynamics simulations to predict beam loss and the resulting activation in Intensity Frontier accelerators, covering the full range of scales relevant to the problem: from 10^-3 m beams, to 10 m wakefields, to propagation over many 10^3 m. Such simulations will be of significant importance for design and operation under FNAL's short- and mid-term plans.
b) deploy multi-scale, multi-physics beam dynamics simulations to help maximize luminosity in Energy Frontier accelerators. Such simulations will be important for maximizing the output of the last years of the Tevatron, diagnosing potential LHC problems, and contributing to the design of the next-generation lepton collider.
Importance of "expanded HPC resources": more CPU hours, more throughput for small jobs, more memory.


