Teresa Head-Gordon
Case Study Worksheet
Project Information - Advanced Theoretical Models to Characterize the Alzheimers Abeta Peptide
| Document Prepared By | Teresa Head-Gordon |
|---|---|
| Project Title | Advanced Theoretical Models to Characterize the Alzheimers Abeta Peptide |
| Principal Investigator | Teresa Head-Gordon |
| Participating Organizations | LBNL UC-Berkeley |
| Science Category | Climate Environmental Science Biological Sciences |
| Funding Agencies | DOE SC DOE NSA NSF NOAA NIH Other: |
Project Summary (Scientific Objectives)
Please give a brief description of your project and its scientific objectives for the next 3-5 years.
Alzheimer's is a neurodegenerative disease linked to the aggregation and amyloid fibril formation of a set of short ~40 residue peptides, amyloid beta, which are known to be highly prone to fibrilization in vitro and in vivo. Although early attention focused on the toxicity of the amyloid fibrils as the cause of disease, it is now hypothesized that oligomers (on the order of ~6 peptides) formed during early aggregation are actually the major toxic species. Thus there is a need to develop an understanding of the entire aggregation process that ultimately leads to the specific structure of the final amyloid fibril, starting with the monomer through to these oligomer structures.
Given the possible toxicity of the earlier protofibril states, the focus is now to understand which structural aspects of the ordered fibril are prevalent in the monomer, and ultimately how the Abeta monomers assemble in the early phases, as proposed by photo-induced cross-linking experiments, and then into the highly ordered mesoscopic fibril suggested by solid-state NMR experimental models. Our preliminary results using a coarse-grained model have allowed us to explore many interesting aspects of the mesoscopic protofibril and its critical nucleus. However, the coarse-grained model is not adequate for addressing some molecular questions posed by experiments. We will pursue a first phase of study to determine which sequence attributes and specific molecular interactions stabilize structure in the monomer in aqueous solution.
We propose to use molecular dynamics simulations combined with accelerated convergence algorithms, with the most recent generations of polarizable protein and water force fields, to characterize the structural ensembles and thermodynamics of the amyloid beta monomer. We wish to understand whether structure in the monomeric peptide is well-defined enough to promote ordered, stable oligomers. The proposed work will contribute to our knowledge of the primary sequence and structural factors that ultimately govern the aggregation process in amyloid beta, and should eventually impart the ability to develop new protein engineering strategies for reducing aggregation and therefore disease virulence. Our findings are also expected to impact research in biotechnology, where protein aggregation is a bottleneck in the manufacture of pharmaceutical proteins, and in materials science, where amyloid fibrils are being investigated for use as possible nanomaterials.
Current HPC Usage and Methods
| Facilities Used | | NCCS | ALCF | NSF Centers | Other: |
|---|---|---|---|---|---|
| Architectures Used | | | BlueGene | | Other: |
| Total Computational Hours Used per Year | Core-Hours | NERSC Hours Used per Year | 1.05M Core-Hours | ||
| Number of Cores Used in Typical Production Run | 1,664 | Wallclock Hours of Single Typical Production Run | 8 | ||
| Total Memory Used per Run | GB | Minimum Memory Required per Core | GB | ||
| Total Data Read & Written per Run | GB | Size of Checkpoint File(s) | GB | ||
| Amount of Data Moved In/Out of NERSC | GB | How Often | |||
| On-Line File Storage Required (Directly Accessible from a Running Job) | 0.5 GB | Files | |||
| Off-Line Archival Storage Required | GB | Files | |||
Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.
LAPACK, ScaLAPACK, FFTW
Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)
AMBER 10
The underlying molecular dynamics engine is a particle-based algorithm, and the code is largely written in Fortran77/90 and uses MPI for most basic applications. There is an optimized suite of code which exploits particle mesh Ewald to give O(N log N) scaling of energy and force evaluations, and which is parallelized with MPI to efficiently exploit distributed memory architectures. Overlaid on top of this fine-grained parallelization is another layer of (trivial) coarse-grained parallelization involving the replica exchange sampling algorithm, which runs N independent simulations (each at a different temperature) that involve infrequent communication (on the order of milliseconds) to swap state point information (positions and velocities of all atoms). An earlier version of the code platform, AMBER9.0, is currently available on Franklin at NERSC-LBNL, and thus it is already established that this project will make effective use of the supercomputing facility requested. We have compiled the most recent version, AMBER10.0. NERSC also maintains a page that shows an example of using job steps to accomplish long simulations in smaller blocks on the NERSC queues, and job restarting is well facilitated in AMBER10.0.
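The exchange step in replica exchange can be illustrated with a minimal sketch. It assumes the standard Metropolis criterion for temperature replica exchange and alternating nearest-neighbor swap attempts; the function names, the energy units (kcal/mol), and the use of plain labels in place of full coordinate and velocity arrays are illustrative assumptions, not AMBER's implementation.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); unit choice is an assumption


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of exchanging the configurations of replicas i and j."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swaps(temps, energies, states, rng, offset=0):
    """Try to swap neighboring replica pairs (offset=0: 0-1, 2-3; offset=1: 1-2, 3-4).

    `states` stands in for the per-replica positions and velocities; a real code
    would also rescale velocities to the new temperature (omitted here).
    """
    for i in range(offset, len(temps) - 1, 2):
        p = swap_probability(energies[i], temps[i], energies[i + 1], temps[i + 1])
        if rng.random() < p:
            states[i], states[i + 1] = states[i + 1], states[i]
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
    return states, energies


if __name__ == "__main__":
    rng = random.Random(0)
    temps = [300.0, 320.0, 341.0, 364.0]
    energies = [-100.0, -98.0, -95.0, -97.0]
    print(attempt_swaps(temps, energies, ["a", "b", "c", "d"], rng))
```

Because each replica only needs its neighbor's potential energy to decide on a swap, the communication between the otherwise independent simulations stays small and infrequent.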
Molecular dynamics simulations numerically integrate Newton's equations of motion at very short (~1fs) timesteps in order to evolve a molecular system of interest in time.
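As a minimal illustration of this time stepping, the sketch below integrates one particle in one dimension with the velocity Verlet scheme. Velocity Verlet is chosen here only because it is a common integrator; this is an assumption and not a statement about the scheme AMBER uses. The harmonic force, the reduced units, and all names are likewise hypothetical.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of size dt under force(x)."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v


def harmonic_force(x, k=1.0):
    """Toy force F = -k x standing in for a molecular force field."""
    return -k * x


if __name__ == "__main__":
    x, v = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, n_steps=1000)
    energy = 0.5 * v * v + 0.5 * x * x
    print(x, v, energy)
```

The timestep has to resolve the fastest motion in the system, which is why atomistic simulations are restricted to steps on the order of 1 fs and need very many steps to reach biologically relevant times.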
In addition to the standard double precision arithmetic needed to integrate the equations of motion, the calculation of the long-range electrostatic interactions is achieved via the Particle Mesh Ewald algorithm. This O(N log N) algorithm uses an FFT of atomic partial charges interpolated to grid points to determine the inverse space Coulomb energies and forces.
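The reciprocal-space part of this idea can be sketched as follows. This is a simplified illustration, not the AMBER implementation: it assumes a cubic box, Gaussian units, and nearest-grid-point charge assignment in place of the B-spline interpolation used in production PME, and it computes only the reciprocal-space energy (no forces, real-space sum, or self-energy correction). All names and parameters are hypothetical.

```python
import numpy as np


def reciprocal_energy_fft(positions, charges, box, grid, alpha):
    """Ewald reciprocal-space energy from charges assigned to a grid**3 mesh.

    positions: (N, 3) array inside a cubic box of edge length `box`.
    alpha: Ewald splitting parameter (inverse length).
    """
    # Assign each charge to its nearest grid point (periodic wrap-around).
    mesh = np.zeros((grid, grid, grid))
    idx = np.rint(positions / box * grid).astype(int) % grid
    np.add.at(mesh, (idx[:, 0], idx[:, 1], idx[:, 2]), charges)

    # Approximate structure factor S(k) on the mesh with one 3D FFT.
    s_k = np.fft.fftn(mesh)

    # Reciprocal vectors k = 2*pi*m/box for integer m along each axis.
    m = np.fft.fftfreq(grid, d=1.0 / grid)
    mx, my, mz = np.meshgrid(m, m, m, indexing="ij")
    k2 = (2.0 * np.pi / box) ** 2 * (mx**2 + my**2 + mz**2)
    k2[0, 0, 0] = 1.0  # placeholder to avoid dividing by zero

    weight = np.exp(-k2 / (4.0 * alpha**2)) / k2
    weight[0, 0, 0] = 0.0  # the k = 0 term is excluded from the sum

    volume = box**3
    return (2.0 * np.pi / volume) * np.sum(weight * np.abs(s_k) ** 2)


if __name__ == "__main__":
    pos = np.array([[0.0, 0.0, 0.0], [2.5, 5.0, 7.5]])
    q = np.array([1.0, -1.0])
    print(reciprocal_energy_fft(pos, q, box=10.0, grid=8, alpha=0.5))
```

The FFT replaces an explicit sum over all atoms for every reciprocal vector, which is where the O(N log N) cost comes from; the accuracy then depends on the mesh spacing and the interpolation order.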
Please list the known limitations/obstacles/bottlenecks of currently available HPC systems, and in particular, those at NERSC.
HPC Usage and Methods for the Next 3-5 Years
Anticipated changes to codes, mathematical methods and/or algorithms needed to achieve this project's scientific objectives.
| Computational Hours Required per Year | ||
|---|---|---|
| Anticipated Number of Cores to be Used in a Typical Production Run | ||
| Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above | ||
| Anticipated Total Memory Used per Run | GB | |
| Anticipated Minimum Memory Required per Core | GB | |
| Anticipated total data read & written per run | GB | |
| Anticipated size of checkpoint file(s) | GB | |
| Anticipated On-Line File Storage Required (Directly Accessible from a Running Job) | GB | Files |
| Anticipated Off-Line Archival Storage Required | GB | Files |
Known or Anticipated architectural requirements (e.g., 2 GB memory/core).
Please list any additional required or important software, services, or infrastructure beyond those listed in the previous section.
It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.
What Do You Need from NERSC?
Please tell us what you need from NERSC to meet your project's computing needs over the next 3-5 years. Also please feel free to make any general comments.


