NERSC: Powering Scientific Discovery Since 1974

Peter Cummings

BES Requirements Worksheet

1.1. Project Information - Molecular-Based Simulation of Complex and Nanostructured Fluids

Document Prepared By

Peter Cummings

Project Title

Molecular-Based Simulation of Complex and Nanostructured Fluids

Principal Investigator

Peter Cummings

Participating Organizations

Vanderbilt University

Funding Agencies

 DOE SC  DOE NSA  NSF  NOAA  NIH  Other: DOE/EERE

2. Project Summary & Scientific Objectives for the Next 5 Years

Please give a brief description of your project - highlighting its computational aspect - and outline its scientific objectives for the next 3-5 years. Please list one or two specific goals you hope to reach in 5 years.

We perform primarily molecular dynamics simulations of nanostructured materials and of water/aqueous solutions in bulk, adsorbed on surfaces, and under nanoconfinement. Our systems are mid-size – large enough to be challenging on typical local clusters, but not grand-challenge size. We increasingly turn to first principles methods (quantum chemistry, ab initio molecular dynamics) to obtain the parameters we need for the force fields that are the major input to a molecular dynamics simulation. We are also becoming increasingly interested in modeling reactions at surfaces, either by hybrid methods (combined molecular dynamics and first principles methods) or using classical molecular dynamics with reactive force fields. In the next 5 years we would like to be confident that reactive force field simulations done in our group will be as accurate as the far more time-consuming first principles methods.

3. Current HPC Usage and Methods

3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid).

We primarily run the open source molecular dynamics codes LAMMPS, NAMD, and DL_POLY. System sizes vary from ~20,000 to 150,000 atoms.

3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?

The biggest bottleneck is throughput. Our jobs often spend days waiting in the queue.  

3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions.

Facilities Used or Using

 NERSC  OLCF  ALCF  NSF Centers  Other: NCCS

Architectures Used

 Cray XT  IBM Power  BlueGene  Linux Cluster  Other:  

Total Computational Hours Used per Year

 3,500,000 Core-Hours

NERSC Hours Used in 2009

 1M Core-Hours

Number of Cores Used in Typical Production Run

256

Wallclock Hours of Single Typical Production Run

24

Total Memory Used per Run

 2 GB

Minimum Memory Required per Core

 0.01 GB

Total Data Read & Written per Run

 2 GB

Size of Checkpoint File(s)

 1 GB

Amount of Data Moved In/Out of NERSC

2 GB per day

On-Line File Storage Required (For I/O from a Running Job)

1 GB and  Files

Off-Line Archival Storage Required

 2 GB and  Files

Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.

Debugging, installation of codes 

4. HPC Requirements in 5 Years

4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions at the workshop.

Computational Hours Required per Year

2,000,000

Anticipated Number of Cores to be Used in a Typical Production Run

1000

Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above

24

Anticipated Total Memory Used per Run

 10 GB

Anticipated Minimum Memory Required per Core

 0.1 GB

Anticipated total data read & written per run

200 GB

Anticipated size of checkpoint file(s)

 20 GB

Anticipated On-Line File Storage Required (For I/O from a Running Job)

 1 GB and 50 Files

Anticipated Amount of Data Moved In/Out of NERSC

200 GB per day

Anticipated Off-Line Archival Storage Required

 20 GB and 500 Files
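As a rough cross-check of the figures in the tables of 3c and 4a, the short sketch below converts cores and wallclock hours into core-hours per run and the number of typical runs that an annual allocation would cover. It assumes, purely for illustration, that every run matches the "typical production run" and that the whole allocation goes to such runs; the function names are hypothetical and not part of any NERSC tool.

```python
# Back-of-the-envelope arithmetic using numbers quoted in the worksheet tables.
# Assumption (not from the worksheet): all hours are spent on "typical" runs.

def core_hours_per_run(cores: int, wallclock_hours: float) -> float:
    """Core-hours consumed by a single production run."""
    return cores * wallclock_hours


def runs_per_year(annual_core_hours: float, cores: int, wallclock_hours: float) -> float:
    """Number of typical runs an annual allocation would cover."""
    return annual_core_hours / core_hours_per_run(cores, wallclock_hours)


# Current usage (3c): 256 cores for 24 hours, 1M NERSC core-hours in 2009.
current_cost = core_hours_per_run(256, 24)            # 6144 core-hours
current_runs = runs_per_year(1_000_000, 256, 24)

# Anticipated usage (4a): 1000 cores for 24 hours, 2,000,000 core-hours per year.
future_cost = core_hours_per_run(1000, 24)            # 24000 core-hours
future_runs = runs_per_year(2_000_000, 1000, 24)

print(f"current: {current_cost:.0f} core-hours/run, about {current_runs:.0f} runs/year")
print(f"future:  {future_cost:.0f} core-hours/run, about {future_runs:.0f} runs/year")
```

Under these assumptions each anticipated run costs roughly four times as many core-hours as a current one, so the anticipated allocation covers fewer, larger runs.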

4b. What changes to codes, mathematical methods and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years?

We expect to need more first principles calculations, which are far more demanding in computation time and memory than molecular dynamics.

4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 3 µs).

The hardware requirements for molecular dynamics are not likely to drive any particular architectural features of future machines. If a machine is architected to perform global climate simulations and first principles calculations, it is likely to have memory per core, interconnect speed, and I/O capabilities that exceed the requirements of molecular dynamics applications.

4d. Please list any new software, services, or infrastructure support you will need over the next 5 years.

We need NERSC to keep updating and implementing community-based codes with good scaling properties (e.g., LAMMPS for molecular dynamics) so that they remain efficient on future architectures. 

4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.

My strategy is to get help from my colleague, Jack Dongarra! 

New Science With New Resources

To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve over the next 5 years with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?

Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).

Right now we are more cycle-limited than anything else. If we could get 50X as much wallclock time, with the throughput through queues that this would imply, we would be able to be far more productive.