
NERSC Initiative for Scientific Exploration (NISE) 2011 Awards


Summary of the 2011 NISE awards.

NISE is a mechanism for allocating the NERSC reserve (10% of the total allocation). It is a competitive allocation administered by NERSC staff and management. The criteria used in 2011 were:

  • A new research area not covered by the existing ERCAP proposal: this could be a tangential research project or a tightly coupled supplemental research initiative.
  • New programming techniques that take advantage of multicore compute nodes by using OpenMP, Threads, UPC or CAF: this could include modifying existing codes, creating new applications or testing the performance and scalability of multicore programming techniques.
  • Developing new algorithms that increase your ability to do your science (e.g., run at higher scale, incorporate new physics).

The projects awarded in 2011 are listed below.

New potential energy surface for ozone molecule

Dmitri Babikov, Marquette University

Associated NERSC Project: Coherent Control of the Ground Electronic State Molecular Dynamics (m409)

NISE Award: 1,450,000 Hours
Award Date: March 2011

A new, accurate potential energy surface for ozone will enable computational studies of the spectroscopy and dynamics of this environmentally important molecule. In the upper atmosphere (the stratosphere), ozone plays a crucial role in protecting life on Earth from dangerous ultraviolet radiation from the sun. In the lower atmosphere (the troposphere), ozone is the main component of photochemical pollution in industrial areas (smog). We should try to reduce levels of tropospheric ozone while preserving stratospheric ozone.

Very accurate ab initio tools must be employed, such as MRCI-level calculations with CASSCF for the electronic structure of this molecule, with at least six electrons of each oxygen explicitly correlated. Very large basis sets are needed, up to sextuple-zeta, and extrapolation of each point to the complete basis set (CBS) limit should be performed. Without this, the dissociation energy of ozone cannot be reproduced accurately. The van der Waals region of ozone has shallow wells, which are expected to play a major role in the ozone formation dynamics; these wells must be accurately captured by the calculations. Furthermore, the spin-orbit interaction in the asymptotic region must be taken into account: its magnitude is comparable to the depth of the van der Waals wells, so both kinds of interaction need to be treated accurately at the same time.
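The CBS extrapolation mentioned above can be illustrated with a common two-point scheme. The following is a minimal sketch assuming the standard inverse-cubic form E(X) = E_CBS + A/X³ for correlation-consistent basis sets; the energies shown are made up for illustration, and the actual extrapolation used for ozone may differ:

```python
def cbs_extrapolate(e1, e2, x1, x2):
    """Two-point complete-basis-set (CBS) extrapolation assuming the
    inverse-cubic form E(X) = E_CBS + A / X**3, where X is the
    cardinal number of the basis set (5 = quintuple-zeta, etc.)."""
    return (e1 * x1**3 - e2 * x2**3) / (x1**3 - x2**3)

# Example: extrapolate from quintuple-zeta (X=5) and sextuple-zeta (X=6)
# total energies (illustrative numbers, not real ozone data).
e_cbs = cbs_extrapolate(-225.3928, -225.39583, 5, 6)
```

Eliminating the unknown prefactor A between the two energies gives the closed form above; larger X pairs give a more reliable limit, which is why basis sets up to sextuple-zeta are needed.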

The ab initio points are one component of the surface development, but choosing the right strategy for fitting or splining the data is another important issue. The standard splining procedure is too demanding and is out of the question. We have in our toolbox a fitting method based on symmetry-invariant polynomials, which offers very significant advantages for molecules with identical atoms, like ozone. Ozone is a perfect system to treat with this relatively new approach.

Hybrid OpenMP/MPI approach to R-matrix scattering

Connor Ballance, Auburn University

Associated NERSC Project: Computational Atomic Physics for Fusion Energy (m41)
Principal Investigator: Michael Pindzola, Auburn University

NISE Award: 600,000 Hours
Award Date: June 2011

The R-matrix/RMPS method has been employed successfully for a wide range of electron-impact excitation/ionisation cases as well as single-photon ionisation. One of the most time-demanding aspects of each calculation is the formation of the R-matrix itself, which must be carried out for each incident electron/photon energy. This is effectively achieved by use of DGEMM for a user-defined number of slices, with MPI gathering and mapping the result to a global matrix. Currently, this is achieved via multi-level MPI parallelisation, but we would like to investigate a hybrid OpenMP/MPI approach. In this study, the energies of the incoming electron/photon would still be distributed over a number of nodes using MPI, but the formation of the R-matrix would be carried out via OpenMP utilizing the cores of a particular node. It is hoped that confining the intensive communication to the cores within a node will provide an additional speed-up of our code base, allowing us to run at higher scale in future calculations.
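The slice-based construction can be sketched in outline. In standard R-matrix theory the matrix at scattering energy E has the form R_ij ∝ Σ_k w_ik w_jk / (E_k − E), so each slice of poles contributes one dense matrix-matrix product (the DGEMM call); in the proposed hybrid scheme, MPI would distribute the incident energies while the slices within one energy are threaded with OpenMP. A minimal serial NumPy sketch, with illustrative names and channel-radius factors omitted (this is not the production Fortran code):

```python
import numpy as np

def form_r_matrix(W, e_poles, energy, n_slices=4):
    """Assemble R[i,j] = sum_k W[i,k] * W[j,k] / (e_poles[k] - energy)
    slice by slice; each slice is one dense DGEMM-style product that
    could be handled by a separate OpenMP team in a threaded code."""
    n_chan, n_poles = W.shape
    R = np.zeros((n_chan, n_chan))
    for chunk in np.array_split(np.arange(n_poles), n_slices):
        scaled = W[:, chunk] / (e_poles[chunk] - energy)  # scale columns
        R += scaled @ W[:, chunk].T                       # one slice's DGEMM
    return R
```

Because the slices are independent until the final accumulation, the loop parallelizes naturally within a node, while the outer loop over incident energies maps onto MPI ranks.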

Global Effects on the Dynamics of Plasmoids and Flux Ropes During Magnetic Reconnection

Amitava Bhattacharjee, University of New Hampshire

Associated NERSC Project: Center for Integrated Computation and Analysis of Reconnection and Turbulence (m148)

NISE Award: 1,000,000 Hours
Award Date: March 2011

Magnetic reconnection drives some of the most dramatic, explosive energy-releasing processes in the solar system, such as solar flares and coronal mass ejections. These violent events rapidly convert huge amounts of stored magnetic energy into heat and kinetic energy. The same type of reconnection process that governs these explosions is also widely believed to govern the onset of magnetospheric substorms, as well as sawtooth crashes in tokamaks. One of the main focus areas of reconnection research is to answer the following two questions: "How does reconnection happen so rapidly?" and "What triggers the onset of fast reconnection?" Recent progress in this area suggests that a rapidly growing plasmoid instability may play an important role in triggering the onset of fast reconnection. This instability breaks up a long thin current sheet into a chain of plasmoids (or magnetic flux ropes in three dimensions) connected by even thinner current sheets. So far, studies of the plasmoid instability have largely focused on two-dimensional, highly idealized systems, often with periodic boundary conditions. Furthermore, reconnection itself is often considered a localized event, and any feedback from boundary conditions is neglected. As such, the applicability of these results to real systems remains an open question. The principal objective of our proposed research is to bring the study a step closer to reality, by incorporating boundary effects and three-dimensional geometry into simulations. In many physical systems of interest, magnetic field lines are anchored in much denser plasmas such as the solar photosphere or the astrophysical accretion disk, which may be treated as a boundary condition to the reconnecting system. This so-called line-tied boundary condition plays a dual role in magnetic reconnection.
On the one hand, it may facilitate the formation of thin current sheets by shuffling the footpoints of magnetic field lines, thereby enabling magnetic reconnection to occur. On the other hand, the line-tying effect is known to be stabilizing for various plasma instabilities, including the tearing mode, which is the mechanism underlying the plasmoid instability. Consequently, boundary conditions are expected to significantly alter previous results obtained in simple 2D configurations. We plan to study the interplay between the global geometry and reconnection through large-scale 2D and 3D simulations. In particular, we propose to address the following questions: How does the global configuration affect reconnection? And how do the effects of reconnection feed back on the global configuration of the magnetic field? Answers to these questions will allow us to further assess the role of the plasmoid instability in triggering the onset of fast reconnection in the context of systems with real, physical boundaries.

MD Simulations of Liquid Water Films at the Boundary Between Gas Hydrate and Mineral Surfaces

Ian C. Bourg, Lawrence Berkeley National Lab

Associated NERSC Project: Clay Mineral Surface Geochemistry (mp47)
Principal Investigator: Garrison Sposito, Lawrence Berkeley National Lab

NISE Award: 300,000 Hours
Award Date: March 2011

Liquid water films at gas hydrate grain boundaries (between two hydrate grains or between a hydrate grain and a mineral surface) have been hypothesized to strongly influence the kinetics of dissolution and growth in gas hydrate formations. Therefore, by establishing the existence (or absence) of these films and their properties, our simulations will provide important information on the behavior of gas hydrates in ocean sediments (in particular, on the feasibility of sequestering CO2 as gas hydrates in ocean sediments).

The proposed research will investigate the existence and properties of liquid water films at the boundary between a clay mineral surface and a CO2 hydrate clathrate. Such liquid water films (known as “premelted” water films) are known to exist at boundaries between mineral surfaces and ice at temperatures lower than the freezing temperature of ice. Their existence in gas hydrate clathrates has been hypothesized but not yet demonstrated.

In order to determine whether stable liquid water films exist at boundaries between mineral surfaces and gas hydrates, we shall carry out two molecular dynamics (MD) simulations. Our first simulation will consist of placing a CO2 hydrate crystal in contact with a smectite clay mineral surface at a pressure characteristic of gas hydrate formations in ocean sediments (P > 4 MPa [3]) and at a temperature 10 K below the freezing temperature of the gas hydrate. Our second simulation will consist of placing a 6-nm-thick liquid water film (containing dissolved CO2 at the same H2O/CO2 ratio as the gas hydrate) between a CO2 hydrate crystal and a smectite clay mineral surface at the same pressure and temperature as in our first simulation. Based on the known behavior of ice near glass and metal surfaces [1], we expect that the equilibrium thickness of the “premelted” liquid water film in these conditions should be on the order of 1.5 to 3 nm. If this is true, the gas hydrate crystal will melt near the clay mineral surface in our first simulation, but will grow towards the clay surface in our second simulation. Previous simulations of gas hydrate growth or dissolution indicate that simulation times of a few hundred nanoseconds will be sufficient to approach the equilibrium thickness of the liquid water film in our simulations.

The proposed research will use our tested MD simulation methodologies for describing water-clay and water-CO2 interactions. We shall also test water models that accurately describe the freezing temperature of liquid water but for which well-tested mineral-water interaction potentials have not yet been developed, such as the TIP5P or TIP4P/Ice water models.

Systems Biology Knowledge Base

Shane Canon, Lawrence Berkeley National Lab

NISE project kbase

NISE Award: 100,000 Hours
Award Date: September 2011

Kbase will support open community science by serving as a freely available computational environment for sharing and integrating diverse biological data types, accessing and developing software for data analysis, and providing resources for modeling and simulation. It will leverage community-wide capabilities, experimental results, and modeling efforts and bring together research products from many different projects and laboratories to create an extensible, comprehensive cyberinfrastructure focused on DOE scientific objectives related to microbes, plants, and metacommunities (complex communities of organisms).

Sparse Lanczos and GMRES Solvers in Co-Array Fortran (CAF)

Pierre Carrier, University of Minnesota - Twin Cities

Associated NERSC Project: Pseudopotential Algorithm for Real Space Electronic Calculations (m684)
Principal Investigator: Yousef Saad, University of Minnesota

NISE Award: 80,000 Hours
Award Date: March 2011

The goal of this research is to implement a Co-Array Fortran (CAF) sparse Lanczos computer program. A first implementation of the Lanczos algorithm will be applied to eigenvalue problems, the standard application for Lanczos, and will take full advantage of the CAF language. A typical Lanczos algorithm requires sparse matrix-vector operations, partial or full re-orthogonalization of the Lanczos vectors (i.e., Gram-Schmidt), and multiple dot products (implying communication between processors for a reduction).
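The ingredients just listed (sparse matrix-vector products, Gram-Schmidt re-orthogonalization, and dot-product reductions) can be sketched serially. The following is a minimal NumPy version with full re-orthogonalization, not the CAF implementation itself; the comments mark where a distributed code would communicate:

```python
import numpy as np

def lanczos_ritz(A, k, seed=0):
    """Lanczos iteration with full Gram-Schmidt re-orthogonalization;
    returns the Ritz values (eigenvalue estimates) of symmetric A."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    for j in range(k):
        Q[:, j] = q
        w = A @ q                     # sparse mat-vec in a real code
        alpha[j] = q @ w              # dot product -> a global reduction
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)  # full re-orthogonalization
        beta[j] = np.linalg.norm(w)   # another global reduction
        if beta[j] < 1e-12:           # invariant subspace found
            k = j + 1
            break
        q = w / beta[j]
    T = (np.diag(alpha[:k]) + np.diag(beta[:k - 1], 1)
         + np.diag(beta[:k - 1], -1))
    return np.linalg.eigvalsh(T)
```

The re-orthogonalization step is the communication-heavy part: it requires j+1 dot products per iteration, which is why the reduction strategies discussed below matter most there.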

This new algorithm adds to a standard Lanczos algorithm an iterative solver based on preconditioned GMRES, together with a domain decomposition. Domain decomposition is well suited to a parallel CAF implementation of this algorithm.

Once the structure of the standard Lanczos, the new low-rank Lanczos, and the GMRES algorithms are completed in CAF, several reduction schemes [similar to MPI_Reduce(SUM)] will be tested for the dot products (e.g., tree-like reductions, such as the "butterfly" algorithm used by FFTs), which are especially necessary in the more costly re-orthogonalization part of Lanczos.
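A tree-like "butterfly" (recursive-doubling) reduction of the kind mentioned here can be illustrated in a few lines. This sketch simulates the exchange pattern serially for a power-of-two number of images; it is a hypothetical helper for illustration, not the CAF code:

```python
def butterfly_allreduce(vals):
    """Recursive-doubling (butterfly) all-reduce, simulated serially:
    at stage s, image i exchanges its partial sum with image i XOR s,
    so after log2(p) stages every image holds the full sum."""
    p = len(vals)              # number of images; must be a power of two
    vals = list(vals)
    s = 1
    while s < p:
        vals = [vals[i] + vals[i ^ s] for i in range(p)]
        s *= 2
    return vals
```

Compared with a naive gather-to-root-and-broadcast, the butterfly completes in log2(p) stages with every image active at each stage, which is why it is attractive for the dot products inside the re-orthogonalization loop.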

Very few algorithms using CAF have been developed so far. CAF is part of the new Fortran 2008 standard, and it is thus imperative that computer programs based on this language become available to the scientific community. The Lanczos algorithm is used in many fields of quantum physics (e.g., for solving the Schroedinger equation and, recently, the Dyson equation), as well as in computer science, data mining, and pattern recognition. The GMRES algorithm is at the heart of iterative linear system solvers.

Numerical Study of Mode Conversion and Ion Heating in Laboratory-Relevant Plasmas

Paul Cassak, West Virginia University

Associated NERSC Project: Three Dimensional and Diamagnetic Effects on Magnetic Reconnection (m866)

NISE Award: 200,000 Hours
Award Date: March 2011

One effect of sound waves propagating through air is the transport of energy from one location to another. Waves in plasmas (gases that are so hot they are ionized) also transport energy. However, there are additional waves that can be excited in plasmas compared to waves in the air (because of the importance of electromagnetic fields), including so-called Alfven waves. When a wave propagating through a plasma reaches a region with a different plasma density, the wave can “mode convert” into a different type of wave. This mode conversion can lead to nonlinear interactions between modes, which can stimulate wave-particle interactions and ultimately transfer energy from the wave into the ionized particles making up the plasma. This behavior is ubiquitous in naturally occurring and experimental plasmas. Mode conversion and ion heating occur in laboratory studies of fusion in toroidal confinement devices (tokamaks), which makes this topic important for renewable energy production. It is also known or thought to occur in the Earth’s magnetosphere between adjacent ionized layers of differing density, and this type of energy transport has been cited as a possible mechanism for heating the solar atmosphere (the corona) to 200 times the temperature of the solar surface.

Mode conversion and ion heating are currently being studied in laboratory experiments at West Virginia University, the PI's home institution. A high-density helicon source, which maintains a strong density gradient, excites waves in a magnetized plasma. Previous experiments identified that mode conversion and ion heating occur. The proposed simulation study will support these ongoing experimental investigations. We will simulate the mode conversion and heating properties of plasma waves, such as Alfven waves, by simulating their launch into an inhomogeneous plasma. Using the simulations, we will answer the following questions: How do the nature of the mode conversion and the energization of ions depend on the density inhomogeneity? How do they depend on the amplitude and wavelength of the mode launched from the antenna? What is the effect of the geometry of the inhomogeneity?

The experimental device in question is only about 10 times bigger than the radius at which ions gyrate around the device’s magnetic field. For this reason, and because a focus of the present study is on the kinetic heating of ions in the plasma, the magnetohydrodynamic description, which describes a plasma as a fluid, is inadequate. In preliminary simulations with a hybrid code (with electrons described as a fluid and ions described with the particle-in-cell model), it was determined that kinetic electron physics cannot be ignored if an accurate depiction of the ion heating process is desired, which necessitates the use of fully particle-in-cell simulations for the present study. We will use the P3D code, an electromagnetic particle-in-cell code that is already in working order on NERSC computers, to carry out these simulations.

Large Scale Numerical Study of Bootstrap Current in Edge Pedestal Plasma

Choong-Seock Chang, Courant Institute of Mathematical Sciences - New York University

Associated NERSC Project: Center for Plasma Edge Simulation: SciDAC FSP Prototype Center (m499)

NISE Award: 4,000,000 Hours
Award Date: March 2011

Accurate theoretical evaluation of the edge bootstrap current has recently become a central theme in predicting edge localized mode instability and the pedestal equilibrium. Edge localized modes must be controlled in ITER to meet its scientific goal. Accurate evaluation of the edge bootstrap current is also critical for determining the magnetic separatrix configuration and the location of the confined plasma edge in ITER; without this, the experimental edge-physics validation in present tokamak devices and the divertor heat-load prediction for ITER suffer significant uncertainty.

Evaluation of the edge bootstrap current has so far been based entirely on simplified analytic formulas, whose accuracy is largely unknown because of the various approximations required in the analytic formulation and the inadequacy of the assumed magnetic geometry and plasma profile for the trans-collisional pedestal plasma in a diverted magnetic field. Self-consistent ExB drift motion is another issue. Even the most developed analytic formula, by Sauter et al. [Physics of Plasmas, 2834 (1999)], uses drift orderings that are inadequate for the steep pedestal, adopts a “homogeneous” electron collision operator that does not possess the electron-ion momentum conservation property in the collisional edge, assumes a magnetic geometry far from the magnetic separatrix surface, and ignores the strongly sheared ExB flows in the edge pedestal. The bootstrap current is all about the difference in parallel flow speed between electrons and ions and the nonlocal collisional dynamics between the neoclassical trapped and passing particles, which requires accurate particle guiding-center orbit dynamics in the presence of momentum-conserving collisions, especially in the plasma edge, where the passing-particle fraction becomes smaller, the poloidal variation of the phase-space Jacobian is large, and the collision frequency varies significantly across the pedestal. Besides the magnetic drift motions, the ExB drift motions need to be captured self-consistently. Recent experimental evidence on MAST [IAEA Fusion Energy Conference, October 2010, Korea] showed a significant discrepancy (over 50%) between the experimentally inferred edge bootstrap current and the values predicted by the Sauter formula. A question naturally arises as to how accurate the analytic formulas are in the pedestal of conventional and spherical tokamaks.

XGC0 is a kinetic particle code that evaluates the 3D particle guiding-center motions accurately, according to the Lagrangian equation of motion, in a realistic magnetic field geometry. XGC0 scales efficiently to extreme scale (> 1 PF). The code calculates the electrostatic potential profile self-consistently in the plasma edge. The collision operators used in XGC0 reproduce the H-theorem and the Maxwellian particle distribution function in the appropriate limit, and they conserve particles, momentum, and energy. XGC0 has been verified to reproduce analytic neoclassical theories in the appropriate limits. XGC0 includes Monte Carlo neutral particles with ionization and charge exchange. In this research, XGC0 is being used to perform “numerical experiments” that measure the bootstrap current in steep pedestal profiles in the realistic diverted edge geometry of three major US tokamaks (DIII-D, C-Mod, and NSTX). It is found that the Sauter formula agrees quite well with the XGC0 simulation results in the weakly collisional regime, in which the analytic bootstrap formalism is more faithfully developed, the “homogeneous” collision operator used in the derivation is quite accurate, and the bounce-averaged calculations used to generate many of the curve-fitting data are more forgiving. However, the Sauter formula does not agree well with the numerical results in collisional pedestal plasmas. In conventional tokamaks (DIII-D and C-Mod), a difference of about 25% between the Sauter formula and the numerical results is common in the collisional edge pedestal. In a spherical tokamak, such as NSTX, the difference is found to be as large as 50%. A new bootstrap current formula will be developed using extensive XGC0 simulations on Franklin (and Hopper II) that brings the agreement with the numerical results to within the simulation error bar.

Ab initio simulation of mechanical properties of bulk metallic glass and Mo-based alloys

Wai-Yim Ching, University of Missouri - Kansas City

Associated NERSC Project: Electronic Structures and Properties of Complex Ceramic Crystals and Novel Materials (mp250)

NISE Award: 7,650,000 Hours
Award Date: March and June 2011

This research is aimed at providing a new way to characterize the mechanical properties of materials (simple or complex). While there are several experimental techniques, such as nano-indentation, to characterize the strength of materials, they are not precise and require many parameters that are difficult to obtain. The theoretical construction of the failure envelope based on accurate ab initio simulations will have a large impact in this critical area related to energy science and technology. Petascale supercomputing facilities are the only way to obtain the required data. The method and procedures developed and the knowledge generated in this project can be further expanded and applied to other areas of materials design and development.

We propose to develop and implement a precise method to characterize the strength of a material by constructing a three-dimensional failure envelope in the stress space (σxx, σyy, σzz) using data generated from multi-axial tensile experiments on a supercomputer. The “experiments” are performed on the supercell of a crystal by applying successive tensile strains in small steps until the stress reaches the maximum, which is defined as the failure point (yield point). At each step, the structure is fully relaxed using VASP with high accuracy. The stress data σij corresponding to the applied strain in a given direction at each step are extracted. The “experiment” for each crystal is carried out in as many directions as feasible, and the final set of stress (σxx, σyy, σzz) and strain (εxx, εyy, εzz) data at the failure points are collected to construct a failure envelope in the three-dimensional stress space. A color code is used to represent the value of |σ|, which depicts the specific locations of the weak and strong points in the first quadrant of the stress space. The total area of the envelope, or the volume it encloses, constitutes a single parameter representing the average strength of a crystal under tensile deformation. The shape and the color of the envelope delineate the variations of the strength in different directions. This technique has been demonstrated by us on the complex bioceramic crystal hydroxyapatite.
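The strain-stepping procedure can be sketched abstractly: apply strain in a chosen direction in small increments, record the stress at each step, and take the maximum as the failure (yield) point. Below is a toy illustration in which an analytic stress-strain curve stands in for the relaxed VASP calculations; the curve and all names are made up for illustration:

```python
import numpy as np

def failure_point(stress_fn, d_eps=1e-3, max_eps=1.0):
    """Step the applied strain from 0 in increments of d_eps and return
    the (strain, stress) pair at which the stress is maximal -- the
    failure (yield) point of the tensile 'experiment'."""
    eps = np.arange(0.0, max_eps, d_eps)
    sigma = stress_fn(eps)
    i = int(np.argmax(sigma))
    return eps[i], sigma[i]

# Toy stress-strain law with a single maximum at eps = 0.1:
toy = lambda e: e * np.exp(-e / 0.1)
eps_f, sigma_f = failure_point(toy)
```

In the real workflow this loop is repeated along many strain directions (240 in the earlier Ti3AlC2 and Cr2AlC studies), and the (σxx, σyy, σzz) values at each failure point are collected to trace out the envelope.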

We will extend our study to two Mo-based alloys, Mo5Si3 and Mo5SiB2, which have some very unique mechanical properties, and to a new class of noncrystalline material, the bulk metallic glasses (BMG). The Mo-based alloys within the Mo-Si-B phase diagram are targeted to replace Ni-based super-alloys for applications in fossil energy technology, where the mechanical properties at elevated temperature and in the presence of microstructures play a critical role. BMGs are a class of noncrystalline materials with outstanding properties such as high strength, high elastic limit, and high corrosion resistance, and they have wide applications in industry. However, it is also well known that BMGs are hindered by a lack of plasticity. It is believed that this is intimately related to the formation of the shear transition zone due to the movements of free volumes in the BMG. Understanding the origin of this behavior requires atomistic-level information, which can only be obtained by accurate large-scale simulations on models of BMGs.

In all, we plan to construct the failure envelopes of four supercell models: one each for the crystalline phases of Mo5Si3 and Mo5SiB2, one for a BMG model Cu50Zr50 (50% Cu and 50% Zr), and a supercell model for crystalline CuZr to which the results of the BMG model Cu50Zr50 will be compared. Each supercell is estimated to contain about 256 atoms. Previous simulations on Ti3AlC2 and Cr2AlC indicated that at least 240 data points, corresponding to tensile experiments along 240 directions, are needed. Once these pilot projects are completed, additional modeling on larger multi-component BMGs of different compositions and constituent elements will be planned.

Improving Algorithms and Resolution for the Ocean-Atmosphere Reanalysis for Climate Applications OARCA (1850-2012)

Gil Compo, University of Colorado at Boulder

Associated NERSC Project: Surface Input Reanalysis for Climate Applications (SIRCA) 1850-2012 (m958)

NISE Award: 7,000,000 Hours
Award Date: March and June 2011

Long-term records of the global weather and climate variations from the 19th to 21st century are urgently needed for assessing the strengths and weaknesses of the next generation of coupled climate models being used to project the effects of anthropogenic greenhouse gas emissions for the upcoming Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR5). Importantly, such records must have quantified uncertainties to allow a quantitative assessment. The Ensemble Kalman Filter method to be used as part of the Ocean-Atmosphere Reanalysis for Climate Applications (OARCA) provides the needed uncertainty estimates. In this project, new algorithms for improving these estimates will be tested in the atmosphere and ocean components of the system. Increased resolution in the horizontal and vertical may reduce the uncertainty in the dataset and will also be tested. The expected outcome is progress towards an OARCA system to generate a record of the state of the global atmosphere and ocean back to 1850 at six-hourly resolution in the atmosphere and 5-day resolution in the ocean.

With our previous grants of NERSC and OLCF cpu hours, we have assembled the 20th Century Reanalysis dataset (20CR; Compo et al., Quart. J. Roy. Met. Soc., 2011) spanning 1871-2008. The 20CR is the longest record of global weather and its uncertainty for use in meeting national and international goals for climate assessment and climate model validation. In a companion effort, our colleagues at Texas A&M University have generated version 2 of the Simple Ocean Data Assimilation (SODA) dataset of 5-day averaged ocean states over the same period, using the 20CR as input. While these datasets represent a significant achievement that is already seeing results in important areas such as understanding El Nino (Giese et al., Bull. Amer. Met. Soc., 2009), the 1930’s US Drought (Cook et al., Clim. Dyn., 2010), and Arctic warming (Wood and Overland, Int. J. Climatol., 2010), both could be improved and extended back in time with revisions to their algorithms. Newly recovered observations of the historical atmosphere and ocean (e.g., via citizen-science and professional international initiatives) will also enhance and extend future versions of both datasets. Improvements in the Ensemble Kalman Filter (EnKF) algorithm used in 20CR will allow us to extract more information from the available observations. Additionally, by coupling the 20CR and SODA systems, further improvement may be possible. Under the new DOE/NSF Earth System Modeling project Ocean-Atmosphere Reanalysis for Climate Applications (DOE PI Gil Compo, University of Colorado; NSF PI Ben Giese, Texas A&M), such improvements have been proposed as a way to meet national and international calls for reanalysis datasets extending to the start of the instrumental record. To test the new 20CR EnKF system and the innovative ocean coupling concept, significant computational resources are needed on the reliable HPC systems of NERSC.
This testing and algorithm development will allow us to answer some important scientific and technical questions, and to prepare the software system for the major new revision of the reanalysis dataset in 2012.

Four sets of experiments are planned in this NISE project. 1) We will test the 2011 version of the coupled atmosphere/land/ice Global Forecast System (GFS) model from the National Centers for Environmental Prediction. Since the 2008 GFS used in 20CR, significant improvements have been made to the treatment of radiation and of both shallow and deep convection. Additionally, we will contrast the 20CR EnKF performance using the new GFS at the same low resolution as in 20CR (total spherical wavenumber “T62”, ~2 degrees longitude by ~2 degrees latitude and 28 vertical levels) with the new GFS at quadruple the resolution in the horizontal and more than twice the resolution in the vertical (total spherical wavenumber 254, “T254”, ~0.5 degrees latitude by longitude and 64 vertical levels).

We expect the higher resolution to improve the representation of extratropical storms, tropical cyclones and convection, and the stratospheric circulation. We have found that T254 is the minimum resolution for a realistic simulation of tropical cyclones. One year of reanalysis at T62 and T254 will be generated for testing. 2) New algorithms for representing model and sampling uncertainty will be tested in the 20CR EnKF system. These tests will include the effect of uncertainty in the specified sea surface temperatures (SST), which is an important component of the total uncertainty. An adaptive method for accounting for model uncertainty through “covariance inflation” will be tested in the 20CR EnKF system using only surface pressure observations (one year of assimilation, 2 inflation algorithms, UK Met Office Hadley Centre HadISST and Texas A&M SODAv2 boundary conditions, T254 resolution). 3) Using the best of the algorithms examined in 2), we will test the incorporation of historical ship- and buoy-based surface wind observations, along with pressure observations, in the high-resolution system. In addition to possibly improving the reanalysis fields, successful incorporation of wind observations may make historical reanalysis back to the 18th century possible, as most of the available observations before 1850 are of wind (one year of assimilation, 1 algorithm, HadISST and SODAv2 boundary conditions, T254 resolution). 4) An experimental SODA version 3 dataset will be generated for the one-year period with the most promising of the above experiments. We will then repeat that experiment using the new SODAv3 boundary conditions to further analyze the effect of the uncertainty in the boundary conditions on the atmospheric reanalysis fields. We will also generate an additional SODAv3 dataset using this second version of the improved high-resolution 20CR EnKF fields to determine the likely improvements resulting from a fully coupled system.
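The analysis step of a stochastic EnKF, including the multiplicative covariance inflation tested in experiment 2, can be written compactly. This is a generic textbook sketch with illustrative names, not the 20CR production system:

```python
import numpy as np

def enkf_update(X, y, H, R, inflation=1.0, seed=1):
    """Stochastic EnKF analysis step.  X: (n_state, n_ens) forecast
    ensemble; y: observation vector; H: linear observation operator;
    R: observation-error covariance; inflation: multiplicative
    covariance inflation applied to the forecast perturbations."""
    rng = np.random.default_rng(seed)
    n_ens = X.shape[1]
    Xm = X.mean(axis=1, keepdims=True)
    Xp = inflation * (X - Xm)              # inflated perturbations
    X = Xm + Xp
    S = H @ Xp                             # perturbations in obs space
    C = S @ S.T / (n_ens - 1) + R          # innovation covariance
    K = Xp @ S.T / (n_ens - 1) @ np.linalg.inv(C)   # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(
        np.zeros(len(y)), R, n_ens).T      # perturbed observations
    return X + K @ (Y - H @ X)
```

Inflation greater than 1 widens the forecast spread to compensate for model error and sampling deficiencies; adaptive schemes like those to be tested would tune this factor from the innovation statistics rather than fix it a priori.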

High resolution modeling to identify the location of the meltwater flood responsible for the Younger Dryas cold episode

Alan Condron, Pennsylvania State University

Associated NERSC Project: Abrupt Climate Change and the Atlantic Meridional Overturning Circulation (m1068)
Principal Investigator: Peter Winsor, University of Alaska Fairbanks

NISE Award: 1,000,000 Hours
Award Date: June 2011

As the rate of melt from the Greenland and Antarctic ice sheets increases in the future, the additional meltwater entering the ocean could threaten the continuation of our present-day warm climate by causing us to suddenly plunge into a very cold, glacial period. Our research aims to understand where this stability threshold lies, and how close our modern day climate is to crossing it.

Using the state-of-the-art, high-resolution Massachusetts Institute of Technology general circulation model (MITgcm), our previous research has allowed us to begin developing a more complete understanding of the role glacial meltwater floods played in modulating the earth’s climate in the past. By modeling the discharge of meltwater from a number of different regions of the North American continent at a numerical resolution many times higher than would have been achievable without access to the NERSC supercomputing center, we have been able to show that the delivery of meltwater to the ocean in the past was very different from the conceptual ‘textbook’ idea that meltwater created a large homogeneous fresh layer over the sub-polar North Atlantic. These results have been met with considerable enthusiasm by the scientific community and have now been published in Geophysical Research Letters.

We request additional computer hours to try to resolve the ongoing debate in the scientific community as to the geographic location, magnitude, and duration of the meltwater pulse(s) responsible for triggering a ~1300-year cold episode around 13,000 cal years ago, known as the Younger Dryas. The main contenders are meltwater discharged to the ocean via either the Mackenzie River (i.e., into the western Arctic) or the Gulf of St. Lawrence (western North Atlantic). A number of other sources have also been hypothesized, including the discharge of thick sea ice from the Arctic and meltwater from the Eurasian Ice Sheet.

We will release meltwater into our model at each location, at a range of magnitudes and differing durations (years to decades), in order to quantify which meltwater sources have the largest impact on the climate system, and therefore which had the most potential to trigger a Younger Dryas-like cold episode. We will also undertake a series of integrations examining the impact of freshwater at different states of the climate system, including full glacial and relatively warm (modern) conditions. Computer hours are also requested to extend the length of several existing integrations to more fully understand the impact of freshwater on the climate system.

The research we are undertaking has a direct bearing on understanding our future climate by quantifying the impact of freshwater from accelerated melting of glaciers around Greenland and Antarctica. Due to the computationally demanding nature of these experiments (integrations typically utilize 1,800 processors), access to the supercomputing center at NERSC is required to carry out our proposed experiments. Our integrations have typically been run on Franklin, but we have recently had considerable success integrating our configuration on NERSC's new Hopper II machine.

Numerical Investigations of Symmetry Breaking, Magnetism and the Pseudogap in the Three-Orbital Hubbard Model

Thomas Devereaux, Stanford Linear Accelerator Center

Associated NERSC Project: Simulation of Photon Spectroscopies for Correlated Electron Systems (m772)

NISE Award: 1,000,000 Hours
Award Date: March 2011

Our proposed research addresses two central issues regarding the pseudogap regime and the magnetic phase of copper-based high-temperature superconductors. The proposed simulations would provide clear experimental predictions and help unravel the nature of these phases in the cuprates, which is now considered the key to understanding their high-temperature superconducting mechanism.

We propose simulations using a massively parallel exact diagonalization (ED) algorithm to address two central issues in the field of high temperature superconductivity. These problems concern the nature of the pseudogap regime in the phase diagram of cuprate superconductors and the two-magnon Raman spectra measured on these materials. The proposed simulations with our state-of-the-art ED code are well suited for high-performance parallel computing architectures and benefit directly from the large-memory nodes at NERSC.

While it is widely believed that an understanding of the way in which antiferromagnetism disappears with oxygen doping in the cuprates holds the secret to understanding high temperature superconductivity, the region that lies between the two, the so-called pseudogap, is poorly understood. One theory compatible with recent experiments involves a circulating current looping around the Cu and O atoms. However, this model is highly disputed, as no experiment to date can directly probe such an order parameter. Alternatively, the experimental observations could instead be interpreted as a particular ordering of spin moments on oxygen atoms. Results from numerically unbiased techniques such as ED and DMRG remain controversial because prior work either neglected the multi-orbital character or worked in restricted dimensions. To address this issue, we propose performing large-scale ED calculations on a multi-orbital Cu8O16 cluster. With the ground-state wavefunction obtained from numerical diagonalization, we can directly test the existence and stability of the circulating current phase by studying its correlation function. Results from these state-of-the-art simulations would serve as clear experimental predictions to benchmark the existence of a circulating current, and to place an upper bound on its strength if such a phase exists. These simulations would thus substantially advance our current understanding of high temperature superconductivity.

Magnetic Raman scattering detects local two-magnon excitations and was instrumental in the first accurate determination of the exchange interactions in cuprate superconductors. A correct description of magnetic scattering as a function of doping will tell us how antiferromagnetism becomes short ranged and what the characteristics of short-range magnetism are. After over 20 years of investigation, the Raman lineshape is still poorly understood. In this NISE application, we further propose calculating two-magnon Raman spectra with the ED technique on a 40-site square Heisenberg cluster. Our proposed calculations of two-magnon Raman cross-sections would be the first computation of dynamical quantities performed on such a cluster. Besides simulating the two-magnon spectra for direct comparison to experiments, we will also calculate the first few spectral moments. Magnetic Raman scattering is a robust measure of the underlying spin interactions and can also provide crucial information on the amount of magnetic frustration. Therefore, our proposed simulations concern not only the cuprates but also have broader applicability to other antiferromagnetic systems.

Breaking Magnetic Field Lines with Turbulence

James Drake, University of Maryland

Associated NERSC Project: Turbulence, Transport and Magnetic Reconnection in High Temperature Plasma (mp217)
Principal Investigator: William Dorland

NISE Award: 500,000 Hours
Award Date: March 2011

Magnetic reconnection negatively impacts energy confinement in laboratory fusion experiments and explosively releases energy in space and astrophysical plasmas. Dissipation is required to break field lines and release magnetic energy. It has long been hypothesized that the intense current layers produced during reconnection could drive turbulence. We have identified and will explore a new class of turbulence that dominates the dissipation during reconnection and will produce a new understanding of how magnetic energy is released.

Magnetic reconnection is the dominant process by which magnetic energy in a plasma system is dissipated. In laboratory fusion experiments, the onset of reconnection negatively impacts confinement and in extreme cases can lead to the termination of the plasma discharge. Solar flares and substorms in the magnetosphere lead to explosive releases of magnetic energy that can negatively impact power grids, satellites and communication. A key scientific issue for over 50 years has been to understand the mechanism that provides the dissipation necessary to break magnetic field lines and enable reconnection to proceed. The dissipation produced by classical collisions is too weak to explain the laboratory and space observations. A widely discussed mechanism for producing this dissipation is through the development of turbulence in the intense current layers that form during reconnection. It had been hypothesized but never demonstrated that this turbulence could scatter electrons and therefore produce an "anomalous resistivity" that facilitates reconnection. We have demonstrated, in a series of simulations and in a paper under review at Nature, that "anomalous resistivity" is not the dominant mechanism. Instead, the narrow current layers produced during reconnection break up into a filamentary web as a result of an electron shear-flow instability. This behavior can be represented by a turbulence-induced viscosity rather than a resistivity. Thus, "anomalous viscosity" is the dominant mechanism facilitating reconnection in the high temperature regime of greatest interest in the laboratory and in space.

Over the past year we have carried out a series of reconnection simulations that first uncovered the filamentation process and then tested the range of parameters for which it is important. A surprise was that a system with a modestly strong guide field, which is the most relevant case for fusion and for space and astrophysics, exhibits a greater tendency for filamentation than reconnection with no guide field. Thus, with the simulations and modeling to date we have established the robustness of the instability. However, because the development of turbulence requires the exploration of a 3-D system (the turbulence develops along the symmetry direction of a traditional 2-D reconnection model), the simulations carried out thus far only marginally separate the characteristic scale lengths of the turbulence from that of the macroscale reconnecting system. Further, the small size of the total system limits the time over which reconnection takes place before the available free energy is depleted, so there is insufficient time to be sure that the turbulence-induced breakup of the current layer has come into balance with the tendency for reconnection to strengthen and narrow the current layer.

Thus, we cannot be certain about the impact of turbulence on the rate of reconnection until the turbulence has come into balance with the reconnection drive. The tendency for turbulence to overshoot is a common characteristic of the onset of turbulence in plasma systems.

Thus, the important next step in our exploration of this exciting new area of research is to carry out simulations with a greater separation between the turbulence and system scales. Such simulations will enable us to pin down how turbulence influences the rate of reconnection by following the development of turbulence and the rate of reconnection until both have come to a balance in the full 3-D system and by comparing the results to those obtained in the computationally simpler 2-D system.

The specific simulations to be carried out will have a computational domain of 16 x 8 x 8 in units of ion inertial lengths and an ion-to-electron mass ratio of 100. The box would be 2048 x 1024 x 1024 grid points and contain roughly 10^{11} particles. Our code has shown essentially perfect scaling up to 8,192 processors. Extrapolating its performance to 32,768 processors, we expect such runs to take around 20 wall-clock hours, for a total of about 650,000 processor-hours each. Runs with two different values of the ambient guide field (with strengths equal to and twice that of the reconnecting magnetic field) are presently planned.
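As a sanity check, the quoted cost figures are mutually consistent; a short sketch (the particles-per-cell figure is our inference, not stated in the proposal):

```python
# Back-of-the-envelope check of the simulation cost quoted above.
# Grid size, particle count, processor count, and wall-clock time
# are taken from the text.

grid = 2048 * 1024 * 1024           # grid points in the 16 x 8 x 8 box
particles = 1e11                    # roughly 10^11 particles
per_cell = particles / grid         # implied particles per cell (~47)
procs = 32768                       # extrapolated processor count
wall_hours = 20                     # expected wall-clock time per run
cpu_hours = procs * wall_hours      # processor-hours per run

print(f"{per_cell:.0f} particles/cell, {cpu_hours:,} processor-hours")
```

The product 32,768 processors x 20 hours is 655,360 processor-hours, matching the "around 650,000" figure in the text.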

Development of CALPHAD type approach to thermal conductivity aided by first-principles phonon approach

Huazhi Fang, Pennsylvania State University

Associated NERSC Project: Investigations of phase stability and thickness issues in interconnects for solid oxide fuel cells through first principles calculations and computational thermodynamic modeling (m1215)

NISE Award: 500,000 Hours
Award Date: June 2011

The proposal aims to develop a fundamental database of thermal conductivity, which is essential to the design of materials involved in heat transfer. Materials with high thermal conductivity are in great demand in many fields, for example to facilitate cooling in microelectronics, while materials with low thermal conductivity are required in heat-preservation applications.

Thermal conductivity is a fundamental physical property of materials. Materials with high thermal conductivity are being pursued in various fields, such as microelectronics, thermoelectrics and turbine engines, whereas low thermal conductivity is required for materials used in heat preservation. The purpose of the current proposal is to develop a database of thermal conductivity with the Calphad approach aided by a first-principles phonon approach. The Calphad-type approach is a valuable methodology for predicting phases and thermodynamic properties of the phases present in complex multicomponent alloys. The first-principles approach is used to calculate the input data for the Calphad approach. The quantities to be modeled by the first-principles calculations include the group velocity, the phonon lifetime, the phonon Grüneisen parameter, and the Debye frequency.
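The mode-level quantities listed above enter the standard kinetic-theory (relaxation-time) expression for lattice thermal conductivity, kappa = (1/3) sum over modes of C * v^2 * tau. A minimal sketch, with all numerical values purely illustrative placeholders rather than results of the proposed calculations:

```python
# Relaxation-time estimate of lattice thermal conductivity:
# kappa = (1/3) * sum_lambda C_lambda * v_lambda^2 * tau_lambda,
# where C is the mode heat capacity per volume, v the phonon group
# velocity, and tau the phonon lifetime. Values below are toy numbers.

modes = [
    # (heat capacity J/(m^3 K), group velocity m/s, lifetime s)
    (0.4e6, 4000.0, 5e-12),
    (0.3e6, 2500.0, 8e-12),
    (0.3e6, 1500.0, 2e-11),
]

kappa = sum(c * v**2 * tau for c, v, tau in modes) / 3.0
print(f"kappa = {kappa:.1f} W/(m K)")
```

In the proposed work the mode quantities would come from the first-principles phonon calculations rather than being assumed.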

For simple systems, phonon properties will be calculated using the first-principles phonon approach for both polar materials and metals. For more complicated systems, the CALPHAD-type approach will be employed to model the transferable force constants used in calculating phonon properties. We will focus first on Al-based, Mg-based, Ni-based and Ti-based alloys.

An Efficient Real-Space Approach to Calculating Auger Spectra

Keith Gilmore, Lawrence Berkeley National Lab

Associated NERSC Project: Understanding and controlling semiconductor nanocrystal surfaces (m1141)
Principal Investigator: David Prendergast, Lawrence Berkeley National Lab

NISE Award: 800,000 Hours
Award Date: March 2011

At the Molecular Foundry we are members of an experimentally driven Grand Challenge project that seeks to understand how the surface environment of semiconductor nanoparticles influences their electronic and optical properties. Nanoparticles are routinely proposed for use in a wide variety of applications including photovoltaics, solid-state lighting, and bio-imaging. However, the behavior of these particles depends sensitively on their surface environments and this dependence has been insufficiently investigated. We are in the process of correlating various spectroscopic techniques – both experimental and computational – in order to better understand the influence of surface chemistry on the properties of nanoparticles.

Many varieties of x-ray spectroscopy exist and each type gives information about some aspect of the local electronic and chemical structure of a material. Auger spectroscopy interrogates the occupied valence states while x-ray absorption spectroscopy probes the unoccupied conduction levels. Combining these complementary data sets with structural information obtained from electron microscopy and numerical relaxation calculations will allow us to construct a whole picture of the influence of surface environment on the physical properties of nanoparticles.

We propose to develop a highly parallel code to efficiently evaluate Auger spectra using real-space techniques. While many electronic structure programs are based on plane-wave Fourier-space approaches, electronic excitations and decay processes are typically localized in space. In order to accurately describe a localized disturbance an unreasonably large number of plane waves must be used, making such approaches very inefficient. Real-space approaches solve this problem by allowing for a much smaller basis set.

An efficient real-space code for calculating static dielectric screening has already been developed by Eric Shirley of NIST [1]. As a former postdoc of Dr. Shirley, the PI proxy Keith Gilmore is very familiar with this code and has already extended it to the dynamic screening case, which will be necessary to capture Auger decay processes. Remaining code development includes constructing the electron self-energy from the dynamic screening and evaluating the direct and exchange Coulomb matrix elements between the initial and final states. Both of these steps are routine and can be completed quickly. The resulting code will be folded into the OCEAN / Abinit-NBSE package described in the following section, and made publicly available.

Auger decay cross-sections will be calculated for both core-core-valence and core-valence-valence cases similarly to the method described in [2]. In each case, the initial state contains one hole while the final state contains two holes. Eigenstates and energies of the single-hole initial states will be determined within the GW method while the GW-based Bethe-Salpeter equation will be solved to find the two-hole final states.

Surface Instabilities and Subgrid Scale Models in Computational Fluid Dynamics

James Glimm, Stony Brook University

NISE project m1289

NISE Award: 500,000 Hours
Award Date: March 2011

Turbulent mixing and subgrid scale (SGS) models for large eddy simulation remain a major challenge for computational fluid dynamics. Building on prior success with front tracking and the use of SGS models, we propose to conduct validation and verification simulations for a new Front Tracking incompressible flow solver with dynamic SGS turbulence models. Two flow regimes are proposed for these tests, chosen because of (a) the prior experience of the proposers, (b) the intrinsic importance of the problems to DOE, and (c) the availability of experimental data. The flow regimes are (1) high-speed two-phase Couette flow, as arises in a contactor used for reprocessing of spent nuclear fuel from nuclear power reactors, and (2) Rayleigh-Taylor mixing problems, of importance as a test bed for ICF-capable simulation codes.

Preliminary simulations for the Couette flow problem have been completed, and here we propose to study properties of the mixing regime as a function of the volume fractions of the two fluids, a test for which there is experimental data.

A series of Rayleigh-Taylor simulations have been completed, using a related compressible code. These have given successful comparison to experiment in the overall growth of the mixing layer (the famous "alpha" problem). Uncertainties associated with unmeasured initial conditions have been largely eliminated by use of early time experimental data, and remaining uncertainties have been quantified and the effect on the growth rate for the mixing zone has also been quantified, showing agreement with experimental data. Here we propose to study molecular mixing properties. For this purpose a more efficient incompressible code will be used, to facilitate increased resolution. Good comparison to experiment was achieved for experiments of M. Andrews at early time, but improved resolution appears to be needed to go to later time in this comparison.

We emphasize the fundamental scientific issues at play in this investigation. Validated and verified simulations of chaotic flow regimes are important to a number of DOE problems, and methodologies to achieve this goal will have significance beyond the present investigation.

This proposal will allow improved fidelity in the study of microscopic mixing in a commonly considered test problem of Rayleigh-Taylor instability. We will be able to improve on the mesh resolution due to the incompressible nature of the simulation, and we hope thereby to improve on the match to experiment. The incompressible simulation will also improve on the accuracy of the physical modeling by allowing removal of some modifications to the physical parameters that were needed for a compressible simulation.

It will allow study of the majority vs. minority phase in a turbulent two phase Couette flow, as a function of the volume fraction of the two fluids and as a function of the radius. We will also conduct a separate study of a gas-liquid two phase Couette flow to determine the proper rotation speed at the air gap - liquid boundary for three phase mixing.

The flows studied are important for DOE missions (nuclear fuel reprocessing and ICF). The problems studied occur in a number of DOE problems, and the solutions proposed will be helpful outside of the specific context studied. While turbulent and chaotic flow fields are generally understood to be statistical in nature, an accepted methodology for statistical convergence of random fields does not appear to be a well established part of the numerical analysis culture. The proposed simulations will serve to contribute to an understanding of the proper nature of statistical convergence as it is to be verified and validated in a simulation study.

Self-Healing of Polymer Films

Gary Grest, Sandia National Laboratories

Associated NERSC Project: Drying and Self-Assembly of Nanoparticle Suspensions (m1151)

NISE Award: 3,500,000 Hours
Award Date: June 2011

Polymers play an important role in a wide range of applications from coatings and packaging to lithography and structural materials. Over time, due to either mechanical stress or aging, cracks can develop in a polymer, leading to mechanical failure of the structure. The goal of the proposed project is to examine how cracks in a polymer self-heal when heat is applied to raise the temperature above the glass transition temperature. The proposed simulations will correlate the evolution of the polymer interface with the mechanical strength of the crack. Correlations between interface structure, entanglements, and mechanical properties will then be determined. These will provide fundamental understanding of the factors that control the fracture and shear strength of polymer interfaces and support the development of models for predicting performance.

The strength of polymeric materials has a direct effect on their performance in a wide range of polymer components. Over time, due to either mechanical stress or aging, cracks can develop in the polymer, leading to mechanical failure of the structure. The goal of the proposed project is to examine how cracks in a polymer self-heal when heat is applied to raise the temperature above the glass transition temperature. The proposed simulations will provide direct insight into the connection between the microscopic structure of polymer interfaces and their macroscopic mechanical properties. Complete information about the state of the polymer interface during self-healing, including the entanglements, density profile, and distribution of short chains and chain ends, will be available. Mechanical properties including failure stress, fracture energy, and friction will then be calculated for a range of interdiffusion times and compared to the mechanical properties of the bulk polymer. A parametric study will allow the trends with interdiffusion time and polymer stiffness to be determined. The impact of these factors on failure stress under tensile and shear loading will be evaluated. The results will test existing models, guide the development of new models, and aid in the interpretation of experiments and the improvement of the self-healing of polymeric materials.

The goal of this project is to examine how the shear and tensile strength of a self-healing polymer interface depends on the degree of interdiffusion across the interface. The mechanism of this increase will be determined by combining microscopic analysis of structure with the evaluation of macroscopic mechanical properties. The macroscopic properties of bulk polymers are strongly influenced by the topological entanglements between polymer chains. The role of entanglements near interfaces in determining the increase of the interfacial strength with interdiffusion time will be determined. One question will be how entanglements between polymers from opposite sides of the crack affect strength, particularly in the presence of short chains, which occur due to chain scission at the interface. Is entanglement density alone sufficient to confer strength, or do the entanglements need to be between polymers from the two sides of the joint? The first entanglements are likely to form near chain ends. Do they strengthen the interface or are they easy to pull out? There is a substantial literature on thermal welding and the related process of crack-healing. Experiments exhibit a power law dependence of fracture energy on contact or welding time that can be explained by models based on the rate of interdiffusion. However, there are several microscopic models for how interdiffusion strengthens the interface, based on the number of chains crossing the interface, the interpenetration depth, “effective crossings” by some minimum depth, or an effective entanglement density. The mechanism must involve entanglements in some manner since they are critical to the bulk fracture energy, but the details remain unknown.

In this initial study of the self-healing of a crack, we will follow the interdiffusion of polymers across an interface produced by cutting all bonds along a polymer chain that cross the mid-plane of the polymer. Due to the slow time scales, we will model the polymer using a well-established bead-spring model. By varying the bending stiffness between monomers along the polymer chain, we can vary the entanglement length N_e. In the proposed study we will model long polymer chains of length N=500 beads, so that the polymer is well entangled. We will study a fully flexible chain, which has an entanglement length N_e ~ 85, and a more rigid polymer, which has an entanglement length N_e ~ 28, so that we can compare the effect of polymer stiffness on self-healing. Since the average end-to-end distances of these two systems are approximately 28 and 36, respectively, the system size has to be quite large so that the simulation cell is much larger than the end-to-end distance in all three spatial directions. Thus the two simulations proposed here each contain 9600 chains of length 500, or 4.8 million monomers. Based on the known bulk diffusion times for these two systems, the simulations will need to run for between 300 and 500 million time steps for the crack to heal. At different stages of self-healing, we will quench the temperature to below the glass transition temperature, determine the mechanical strength of the interface, and compare it to that of the bulk. We will also perform a primitive path analysis to determine the degree of entanglement between the two sides of the crack.
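The system sizing above is easy to verify; a small sketch (the factor of 3 used for "much larger than the end-to-end distance" is our illustrative assumption, not a number from the proposal):

```python
# Sanity check of the system sizes quoted above. Chain counts and
# end-to-end distances are taken from the text.

chains = 9600
beads_per_chain = 500
monomers = chains * beads_per_chain      # "4.8 million monomers"

# The simulation cell must be much larger than the chain end-to-end
# distance (about 28 units for the flexible system, 36 for the stiffer
# one) in all three spatial directions; a factor of 3 is assumed here
# purely for illustration.
end_to_end = 36.0
min_box = 3 * end_to_end

print(f"{monomers:,} monomers; box edge well above {end_to_end}, "
      f"e.g. > {min_box:.0f}")
```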

Computational Studies of Methanol Steam Reforming

Hua Guo, University of New Mexico

Associated NERSC Project: Quantum dynamics of elementary reactions in the gas phase and on surfaces (m627)

NISE Award: 300,000 Hours
Award Date: March 2011

The proposed research aims at elucidating the reaction mechanism of methanol steam reforming, a process that generates H2 fuel for fuel cells. A better understanding of the catalytic process can help in designing new and more effective catalysts.

Environmental and geopolitical concerns over fossil fuels have stimulated the exploration of alternative energy sources. One promising approach for clean fuel cell applications is the on-board generation of hydrogen from methanol steam reforming (MSR), which bypasses the difficult problem of hydrogen storage. The current catalyst (Cu/ZnO) has a few undesirable properties, such as sintering at high temperatures. A more thermally stable catalyst (PdZn/ZnO) has recently been discovered. However, the catalytic mechanism of MSR is poorly understood. This research project will explore the reaction mechanisms for the steam reforming of methanol on various metal/oxide surfaces, in collaboration with an experimentalist (A. Datye at UNM). In particular, we will focus on various Cu and PdZn alloy surfaces, and will map out the reaction pathways using plane-wave density functional theory as implemented in VASP.

Laser Acceleration of Ions Using Few Times Critical Density Gas Targets

Michael Helle, Naval Research Lab

Associated NERSC Project: High Energy Laser-Driven Acceleration Based on the Laser Wakefield Accelerator (mp343)
Principal Investigator: Phillip Sprangle, Naval Research Lab

NISE Award: 1,000,000 Hours
Award Date: June 2011

Our research will focus on the generation of high energy ion beams using a high intensity laser striking a gas target. The high energy ion beams produced by this interaction can be applied to such diverse fields as ion beam fusion, the detection of special nuclear materials, and ion beam therapy for various types of cancer. Gas targets are particularly attractive since they can be operated at high repetition rates. The high repeatability of our scheme will allow for real-time scanning, a feature that is currently lacking. Another key advantage of this type of acceleration is that it can dramatically reduce the size and cost of producing the beams necessary for these applications. The compactness and reduced cost of such systems provide a unique opportunity to bring national-laboratory-class relativistic ion beams to private organizations, universities, and medical facilities.

The generation of high energy ions by means of high intensity laser irradiation of solid targets has been a subject of active research for over a decade. More recently, experimental groups at both Brookhaven National Laboratory and UCLA have shown ion acceleration using CO_2 lasers interacting with gas jets that, when ionized, yield plasma densities that are a few times critical density. The advantages of such targets are that they are relatively simple and can be easily operated at high repetition rates.

The physics that drives this type of acceleration is not yet well understood. Of particular interest are the scaling of the acceleration with various laser pulse parameters (including multiple pulses) and the effect of the longitudinal plasma density profile on the acceleration process. Additionally, since the plasma is only a few times critical density, frequency-upshifted radiation is able to propagate deeper into the target, which could itself lead to interesting new physics.

Preliminary simulations using the PIC code TurboWAVE have been performed in 2D. These have already proven to be very insightful. Specifically, the simulations show the effect of the ponderomotive force on acceleration when moving from 1D to 2D, and the generation of unique forward-scattered radiation patterns. The ponderomotive force acts to push electrons away from the laser axis, creating a charge separation between the ions and electrons. The space-charge fields set up by this separation lead to the acceleration of the ions. In 1D, the electrons can be pushed only in the direction of the laser pulse, which tends to overestimate the amount of charge separation. In 2D, the electrons can be pushed in the laser propagation direction as well as transverse to it, due to the finite transverse profile of the laser pulse. Moving to 2D thus yields a more realistic picture of the physical process and hints that it may be advantageous to use a shorter pulse to produce greater longitudinal charge separation before the electrons are able to move transversely off the laser axis.
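For context, the ponderomotive force invoked above has a standard nonrelativistic textbook form (quoted here as background; it is not taken from the proposal):

```latex
\mathbf{F}_p = -\frac{e^2}{4 m_e \omega^2}\,\nabla E_0^2
```

where E_0 is the laser field envelope, omega the laser frequency, e the electron charge, and m_e the electron mass. Because the force scales inversely with particle mass, it expels the light electrons from the high-intensity region while the much heavier ions are initially left behind, producing the charge separation described above.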

We wish to further explore this type of acceleration by moving to 3D, where we will be able to examine the full effects of the ponderomotive force, particle slippage, and other nonlinear effects. To do this we plan to continue using the code TurboWAVE, which is particularly well suited to the task: it has demonstrated strong scaling up to 10^4 processors for similar problems and has been well benchmarked in 3D against laser wakefield acceleration experimental results.

An Integrative Strategy to Model Complex Biological Assemblies

Ivaylo Ivanov, Georgia State University

Associated NERSC Project: Modeling Assemblies and Interactions in Eukaryotic Clamp Loading (m1254)

NISE Award: 960,000 Hours
Award Date: June 2011

Conventional structural biology methods have provided valuable but fragmentary insight into the architecture of the ternary assemblies of DNA with the proteins FEN1 and PCNA. These assemblies play a crucial role in replication - the common mechanism through which all organisms duplicate their genetic material. Our computational work would take advantage of state-of-the-art molecular modeling methods to elucidate the structure and dynamics of these important replication complexes.

DNA replication is accomplished by a dynamic protein assembly termed the replisome. The central goal of our work is to develop novel multiscale computational protocols, which would allow integrative modeling of protein-nucleic acid complexes within the replisome. Understanding the inner workings of this complex molecular machine is undeniably among the great challenges in the biological sciences. Breakthroughs in this field are expected to yield great advances in biomedicine, specifically in the diagnosis, prognosis and treatment of cancer and genetic disorders. Furthermore, the fundamental biological mechanisms and the dynamic protein associations in DNA synthesis and repair are conserved across bacterial, archaeal and eukaryotic organisms and govern many metabolic processes occurring in response to environmental stress as cells strive to regain homeostasis. In this respect, our work also parallels ongoing efforts at Lawrence Berkeley National Laboratory (LBNL) in microbial genomics aimed at unraveling the functions of microorganisms that are environmentally important or related to DOE’s bioenergy initiatives. Microbial genomes and metabolic responses to environmental stresses such as ultraviolet radiation, ionizing radiation and oxidative stress are of direct relevance to the mission of the DOE. Our project is also well aligned with the stated goals of the DOE Biological Systems Science division “to develop the computational capabilities and systems needed to predictively design and model complex biological systems” and “to develop and support DOE national user facilities for use in fundamental structural biology”.

Many individual components of the cell's replication machinery have already been solved by protein crystallography. How these components come together to form a functioning molecular machine has, however, remained elusive. Furthermore, detailed structural characterization of large protein and nucleic acid assemblies is often impossible by any single experimental or computational method. Therefore, advancement of this field would necessarily involve integration of knowledge from a variety of sources – biochemical and biophysical experiments as well as molecular simulation and structural bioinformatics. For instance, electron microscopy (EM), small angle X-ray scattering (SAXS), fluorescence (FRET), footprinting, and chemical cross-linking are among the methods that could collectively be used to overcome the limitations inherent in static crystallographic structures and probe the assembly and conformational dynamics of macromolecular complexes. Each of these techniques comes with its own limits of accuracy, applicability and spatial/temporal resolution. Achieving a consistent molecular view will, therefore, require advanced computational methods in order to integrate experimental data from vastly disparate sources. A central goal of our work is to develop such novel multiscale computational protocols and apply them to replisomal complexes.

Within the replisome, replicative polymerases rapidly and faithfully duplicate the cell’s genetic material prior to cell division. Proliferating Cell Nuclear Antigen (PCNA) serves as an accessory protein whose role is to topologically tether these polymerases to DNA and ensure processive replication. Beyond involvement in replication, PCNA is also a recognized master coordinator of cellular responses to DNA damage and interacts with many DNA repair proteins and cell cycle checkpoint inhibitors. In this capacity, PCNA serves not only as a mobile platform for the attachment of these proteins to DNA but, importantly, plays an active role in the recruitment and release of these crucial participants at the replication fork. These are key processes in PCNA biology, which are incompletely understood from a structural point of view. This proposal puts forward an integrative strategy to model the ternary assemblies of Flap-endonuclease 1 (FEN1) with its double flap DNA substrate and with PCNA (or with the alternative heterotrimeric sliding clamp Rad9-Rad1-Hus1 (9-1-1)). While structural snapshots are available for the individual components of these assemblies and, in some cases, for the binary complexes (PCNA-DNA, PCNA, FEN1-DNA, 9-1-1 complex), the larger assemblies present extreme challenges to molecular crystallography (MX) and will require integration of data from a variety of sources (EM, SAXS, MX) through advanced computational methods. A modular multi-scale approach will be adopted wherein (i) large conformational ensembles for the complexes are generated at first through advanced sampling methods and (ii) these ensembles are systematically reduced to eliminate conformations incompatible with existing experimental constraints. The final refined models will be subject to experimental validation and guide future experimental efforts in this exciting research area. 
The exceptional computing resources available at NERSC will create opportunities for this work on the cutting edge between computation and structural biology that will lead to deeper understanding of the inner workings of biological assemblies engaged in replication.

Unveiling Microbial Carbon Cycling Processes in Key U.S. Soils using "omics"

Janet Jansson

NISE project m1317

NISE Award: 1,000,000 Hours
Award Date: May 2011

Denoising, filtering, assembling and annotating millions of genes from microbial communities associated with diverse environments such as soils are all very computationally demanding processes. The newest sequencing technologies are producing terabases of sequence data. The only solution to analyzing genomic information of this size is through high performance computing, where hundreds of nodes and terabytes of memory are available. Using the state-of-the-art NERSC resources would allow us to characterize the functional potential of these complex soil environments, discover potentially novel genes, and assemble environmental genomes.

Semiclassical Approaches for Clean Energy Resources

Puru Jena, University of California Berkeley

Associated NERSC Project: Cluster and Nanostructure for Energy and Bio Applications (m847)

NISE Award: 100,000 Hours
Award Date: June 2011

Since the discovery of the fullerene C60 by Kroto et al. in 1985, followed by the production of larger fullerenes and of carbon nanotubes by Iijima in 1991, a distinct field of investigation has developed, devoted to the many electronic and structural properties of this class of nano-objects. These experimental findings have also led to an intense effort toward finding analogs of carbon nanotubes and fullerenes involving carbon, boron, and nitrogen. Fullerenes have been extensively investigated as promising building blocks for nanoelectronic devices, which motivates both the search for fullerene-like structures based on elements other than carbon and the study of the interaction of carbon fullerenes with other chemical groups or atoms. Due to graphene’s unique dispersion relation, electrons and holes behave like a two-dimensional (2D) gas of massless Dirac fermions, which allows one to probe quantum electrodynamics phenomena in condensed matter. A superhalogen consists of a central metal atom surrounded by halogen atoms; when the number of these halogen atoms exceeds the maximal valence of the metal atom, the molecule possesses an electron affinity much larger than that of the halogen atoms. A new class of even more electronegative species can be synthesized if the peripheral halogen atoms are themselves replaced by superhalogen moieties. These species are named “hyperhalogens” because their electron affinities can exceed even those of their superhalogen building blocks, and they can thus serve as ingredients in the synthesis of new superoxidizing agents. Our goal is also to investigate the electronic properties of magnetic superhalogens. Our first-principles methodology is based on density functional theory (DFT) as implemented in the VASP program, using the generalized gradient approximation (GGA) for the exchange-correlation functional.

Tall Tower Wind Energy Monitoring and Numerical Model Validation in Southern Nevada

Michael Kaplan, Desert Research Institute

NISE project m965

NISE Award: 750,000 Hours
Award Date: January 2011

In order to further assess the wind energy potential for Nevada, a new wind energy monitoring tower platform will be established in southern Nevada and the accuracy of several computational numerical models used to estimate the wind resource will be evaluated using data collected from the modeling platform. This project will concentrate on establishing the platform and the preliminary evaluations of the numerical models.

The 120 meter tower will provide continuous wind monitoring data at wind turbine hub and blade heights. Data from the platform and the evaluation numerical models' accuracy will improve wind resource maps, and identify the wind shear and turbulence regimes at the wind turbine hub and blade heights. The results of this project will be transferable to other Southwestern areas whose climates are similar to southern Nevada and the surrounding region.

The Operational Multiscale Environment Model with Grid Adaptivity (OMEGA) will be used for static adaptive grid simulations in the geographic region of the wind measurement platform. Preliminary evaluations of the accuracy of OMEGA will be made by comparing model estimates of wind speed and direction to the speeds and directions measured by available meteorological instruments and towers in the region.

The present research evaluates the ability of OMEGA to reproduce point winds as compared to the observational data from the Stone Cabin Tower (near Tonopah, NV) at 40 m, 60 m, and 80 m. Model sensitivity to horizontal grid resolution, initial conditions, and terrain dataset resolution will also be tested. OMEGA will be run at five different horizontal grid resolutions with minimum horizontal edge lengths of 18 km, 6 km, 2 km, 666 m, and 222 m. For each resolution, the model will be initialized using both the Global Forecast System (GFS) and the North American Regional Reanalysis (NARR) at both 00 and 12 GMT to determine model sensitivity to the resolution and time of the initial conditions. Additionally, the 666 m and 222 m minimum grid resolution runs will be performed with both 90 m and 1 km resolution terrain databases to determine the sensitivity to terrain features. Each group of model runs over the 30-day period of interest will then be analyzed using statistical techniques to determine how the model-generated winds compare with the observed winds. The statistical results will then be compared with results from MM5 and WRF simulations to determine the most appropriate model for wind energy potential studies in complex terrain.

Threading EnergyPlus Using OpenMP

Noel Keen, Lawrence Berkeley National Laboratory

NISE project m1116, Principal Investigator: Philip Haves

NISE Award: 100,000 Hours
Award Date: March 2011

EnergyPlus is an industry standard building simulation code. It models the energy performance of any type of building including effects such as HVAC systems, various window materials, accurate weather considerations, solar shading, occupancy rates as a function of time, thermal comfort of occupants, etc. EnergyPlus has a large user base and is often used by architects when designing buildings.

The EnergyPlus code is a large serial Fortran 90 code developed at several institutions, including the EETD division at LBNL. I have been tasked with improving its performance and have been trying to gain some of that improvement using OpenMP. The Carver and Hopper machines could be very valuable for this work, as I need to test many different parameters and versions of loop threading over a wide range of inputs. It is also very helpful to have two modern and different CPUs on which to perform the benchmarks. Currently the focus is on a single shared-memory node.

First-Principles Calculation of Phonon-Assisted Optical Absorption in Indirect-Band-Gap Semiconductors.

Emmanouil Kioupakis, University of California - Santa Barbara

Associated NERSC Project: First-principles modeling of nitride and oxide materials for applications in electronic and optoelectronic devices. (m934)

NISE Award: 1,000,000 Hours
Award Date: March 2011

Electrons in materials can absorb light by taking the energy of the incident radiation and jumping into a higher-energy excited state. In some materials, however, such as silicon, this transition cannot happen in a single step and needs to be assisted by another mechanism that can provide some extra momentum. One mechanism that can give the extra push is the interaction of the electrons with the atoms of the crystal. Silicon solar cells, which dominate the photovoltaic industry, rely on this assisted process to absorb the incident sunlight and convert it to electricity. The goal of our research is to investigate this indirect light absorption process at the microscopic level and understand in detail how silicon solar cells operate.

Optoelectronic materials are at the heart of CD/DVD/Blu-ray players, LED displays, solar cells, and fiber-optic communication systems, and are poised to dominate in the areas of general illumination, projectors, and photovoltaic power generation. Understanding how light interacts with charge carriers in materials is an important fundamental question that has paved the way to this array of technological applications. At present, the methodology for understanding direct optical processes from first principles, including many-body effects and the electron-hole interaction, is well established. Another class of optical processes comprises those mediated by an additional scattering mechanism, such as the coupling of the carrier motion to the lattice vibrations of the crystal. Such indirect optical processes are responsible, among other things, for the optical absorption of visible light in silicon and are hence at the core of present silicon photovoltaic cells. These phonon-assisted optical absorption processes in indirect-band-gap semiconductors have not previously been investigated fully from first principles because, due to the large number of initial and final states involved, the associated computational cost is prohibitively high.
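Schematically, the phonon-assisted absorption coefficient follows from second-order time-dependent perturbation theory; the generic Fermi-golden-rule form below is written for orientation only and is not taken from the proposal:

```latex
\alpha(\omega) \propto \sum_{ij\nu} \left| S_{ij\nu} \right|^2
\Big[\, n_\nu \,\delta\!\left(\varepsilon_j - \varepsilon_i - \hbar\omega - \hbar\omega_\nu\right)
 + \left(n_\nu + 1\right) \delta\!\left(\varepsilon_j - \varepsilon_i - \hbar\omega + \hbar\omega_\nu\right) \Big]
```

where the two terms describe phonon absorption and emission, n_ν is the phonon occupation, and the amplitude S_ijν contains sums over intermediate electronic states of products of optical and electron-phonon matrix elements. The triple sum over initial, final, and intermediate states is precisely what makes these calculations so expensive and motivates the Wannier interpolation strategy.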

In the context of our work on the efficiency issues in nitride lasers we investigated phonon-assisted optical absorption for the special case of absorption by free carriers near the band extrema of GaN. We found that these phonon-assisted processes are responsible for the internal reabsorption of the generated light in nitride lasers and proposed ways to mitigate the impact of this loss mechanism. As a supplemental research initiative related to our existing ERCAP proposal, we plan to generalize the methodology we developed and apply it to study the phonon-assisted absorption of light in technologically important indirect-band-gap semiconductors (such as Si, Ge, and GaP). The computational cost of the calculations can be substantially reduced by employing the interpolation formalism based on maximally-localized Wannier functions, which can yield the necessary band energies and electron-phonon coupling matrix elements for arbitrary pairs of states at a minimal cost. The resulting insights will shed light on the nature of phonon-assisted optical absorption in indirect-gap materials and the operation of silicon-based solar cells.

Deep Defect States and Valence Skipping in Narrow Band Gap Semiconductors

Mal Soon Lee, Michigan State University

Associated NERSC Project: Revolutionary Materials for Solid State Energy Conversion (m968), Principal Investigator: S. D. Mahanti

NISE Award: 340,000 Hours
Award Date: March 2011

This project will help us to understand why a particular structure is energetically stable for a given system and why in some structures two different types of local geometry and/or different valence states appear in narrow band gap semiconductors. In addition, it will help us to understand how defect states depend on this mixed valency.

The origin of valence skipping in low-dimensional semiconductors and the role of mixed valency in the nature of deep defect states in narrow-band-gap semiconductors (of thermoelectric and photovoltaic interest) are of great current experimental and theoretical interest. In this proposal we address two fundamental questions in these systems: (1) the relationship between competing local bonding, local geometry, mixed valency (valence skipping), and the global structure (quasi-one-dimensional and/or two-dimensional) in a class of important binary chalcogenides MX, where M = Ga, In, Tl and X = Se, Te; and (2) the nature of deep defect states in these binary systems.

Material Simulations in Joint Center for Artificial Photosynthesis (JCAP)

Nathan Lewis, California Institute of Technology

NISE project jcap

NISE Award: 5,000,000 Hours
Award Date: March 2011

JCAP is a DOE-funded energy hub at the $24M/year funding level. The goal is to develop systems that absorb light and generate chemical fuels (e.g., by splitting water or converting CO2 to methanol). Materials science research is at the heart of JCAP activities. More specifically, there are the following areas of research: (1) light absorbers: finding appropriate absorbers with sufficient light-absorbing ability that output enough potential for water splitting; (2) catalysis: finding cheap and efficient alternatives to the existing hydrogen-reduction and water-oxidation catalysts; (3) membranes: finding membranes or other related designs that separate the H2 and O2 gases while permitting proton transport; (4) linkers: finding molecular or solid-state linkers that lead the electron and hole from the light absorber to their respective catalysts. In all the above areas, theoretical calculations play an important role. They help to guide experiment, narrow the range of materials to be tested experimentally, understand the processes involved, and design new catalysts.

In JCAP, we have about seven theoretical groups (PIs) involved in materials science simulations. Most of their simulations require large-scale computation. These might include: DFT calculations of atomic structure for thousand-atom systems; GW calculations of band gaps and band alignments for 50-atom systems; transport calculations of electronic transport through the linker molecules; classical and ab initio molecular dynamics simulations of catalytic processes; high-order quantum mechanical calculations of the catalytic intermediate steps; and possibly massive atomic structure searches for new crystal compounds, interface structures, and molecule/surface attachments. All of these might need large computer resources. Each theoretical group might have a few postdocs working on the JCAP project, so in total we could have 20-30 users. Most of the groups have current NERSC accounts, but those are dedicated to other research, not to JCAP activities. It is essential for us to have a dedicated account for JCAP calculations. At the same time, all these calculations are related to the original research in the groups' existing ERCAP accounts in terms of the techniques used; they are an extension to the JCAP scientific activity.

More specifically, we plan to carry out the following calculations: (1) Interface atomic structure between two semiconductors, or between a semiconductor and a metal. This is important especially for amorphous interfaces, where the interface atomic structure is not well known and where that atomic structure affects the electronic structure of the interface; (2) Doping of the oxides. We plan to use the Z-scheme for light absorption. In the Z-scheme, there are two light-absorbing materials, one of which is an oxide. But the band gap of a typical oxide is too large, so we plan to dope the oxide to reduce the band gap. We will thus calculate different doping sites and formation energies, and the related electronic structures; (3) Electron and hole transport across the linker molecules. We will study the transport mechanism and help to design new molecules for better connection; (4) Catalysis. There are multiple steps in studying the catalytic process. It might involve high-level quantum mechanical calculations and ab initio molecular dynamics (MD) simulations, or a combination of classical and ab initio MD simulations; (5) Membranes. We can simulate the membrane and its mixing with the nanowires or rods using classical MD simulations, especially to study proton transport. For example, one can study whether a proton can pass through a graphene sheet, or through graphene with 5-7 ring defects; (6) Electrode (or semiconductor)/liquid interfaces. One important task is to study how an electron might transfer from the electrode to a liquid molecule (e.g., water), which induces the electrochemical reaction. This might involve a time-domain style simulation where the electron follows the time-dependent Schrödinger equation.
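For the doping studies in item (2), the formation energies mentioned would presumably follow the standard supercell expression; this is the conventional textbook definition, not a JCAP-specific formula:

```latex
E_f\!\left[X^q\right] = E_{\mathrm{tot}}\!\left[X^q\right] - E_{\mathrm{tot}}\!\left[\mathrm{host}\right]
 - \sum_i n_i \mu_i + q \left( E_F + E_{\mathrm{VBM}} \right)
```

where n_i atoms of species i with chemical potential μ_i are added (n_i > 0) or removed (n_i < 0) to create the dopant X in charge state q, and the Fermi level E_F is referenced to the valence-band maximum. Comparing E_f across candidate doping sites identifies the thermodynamically preferred configuration.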

While the main purpose of the JCAP work is to find materials and systems that function as artificial photosynthesis, in the course of carrying out some of the simulations we will also develop new methods (e.g., methods for large-scale materials simulation, for studying catalytic processes, and for time-domain simulation). We will also develop methods that take advantage of the architecture of the NERSC machines.

High Fidelity Study of Multiscale Multiphase Lubricant Flow in High-Speed Gear Systems

Xiaoyi Li, United Technologies Corp Research Center

Associated NERSC Project: Multiscale/multiphase research for first-principles fuel atomization calculation (m945), Principal Investigator: Marco Arienti

NISE Award: 550,000 Hours
Award Date: March 2011

High-speed interlocked gear systems have a wide spectrum of applications across many business units of United Technologies Corporation, including Sikorsky, Pratt & Whitney, Hamilton Sundstrand and Clipper. Geared transmission systems are also heavily used by the automotive industry. In all gear systems, sufficient oil must be delivered to the gears and bearings to provide adequate lubrication and cooling throughout the operating range; on the other hand, excessive supply can lead to higher power loss through friction among oil, air and gears. Studying the oil lubrication of gears is therefore crucial for improving their durability as well as their performance. The proposed study can help in fundamentally understanding the complex physical processes in gear systems and could potentially impact the design of such systems in many industrial products.

The lubrication processes in complex gear applications pose a challenging physical problem: multiphase oil-air flow interacting with complicated solid gears moving at high speed. No simple analytical or numerical models can give reasonable predictions of the processes involved, apart from high-cost direct simulation approaches based on first principles. In this work, we will directly solve the fundamental multiphase Navier-Stokes equations, using a coupled level-set and volume-of-fluid method to track the oil flow and an embedded-boundary approach to account for the motion of the solid gears. The typical simulation domain is large, containing large gears, yet resolving the details of oil introduction poses a stringent demand for high grid resolution at the oil-air interface. In addition, the high rotation speed of the gears induces complex turbulent oil-gas flows that also have to be well resolved. We have first applied adaptive mesh refinement (AMR) to keep the grid count and simulation time within a feasible range. A low-resolution simulation at a typical high gear speed has been demonstrated on a small-scale (64-core) cluster at United Technologies Research Center. Here we propose to run a set of full-scale simulations at the NERSC supercomputing facilities to perform a parametric exploration. The results would benefit the design of lubrication devices for many industrial gear applications.
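In a coupled level-set/volume-of-fluid scheme of the kind described, the oil-air interface is carried by two standard advection equations; these are the generic forms, not the specific discretization used in this project:

```latex
\frac{\partial \phi}{\partial t} + \mathbf{u} \cdot \nabla \phi = 0,
\qquad
\frac{\partial F}{\partial t} + \nabla \cdot \left( \mathbf{u} F \right) = 0
```

where φ is the signed distance to the interface (with φ = 0 on the interface) and F is the liquid volume fraction in each cell. The level set provides smooth interface normals and curvature for surface-tension forces, while the volume fraction enforces mass conservation, which is why the two are coupled.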

Semiclassical Approaches for Clean Energy Resources

Jian Liu, University of California Berkeley

Associated NERSC Project: Semiclassical approaches for condensed matter dynamics and for molecular reaction dynamics (mp14)
Principal Investigator: William Miller, University of California Berkeley

NISE Award: 1,350,000 Hours
Award Date: March and June 2011

The research project described in this proposal is mainly aimed at developing new semiclassical methods and models to properly include quantum effects in classical molecular dynamics simulations of dense plasmas (e.g., fusion plasmas) and complex molecular systems. The research will also focus on applications of these semiclassical approaches to specific problems of clean-energy interest and water-related technologies. The accomplishment of this project will greatly advance our understanding of the fundamental mechanisms of quantum effects in fusion plasmas and of photochemical dynamics related to solar energy conversion, and will thereby support the mission of the Department of Energy.

We have made a significant advance in the development of more efficient semiclassical methodology capable of describing true quantum coherence and decoherence in the dynamics of large molecular systems. Semiclassical initial value representation (SC-IVR) methods make use of classical trajectories to describe quantum dynamics. A simple model is presented for treating local imaginary frequencies, which are important in the study of quantum effects in chemical reactions and various dynamical processes in molecular liquids. It significantly extends the range of accuracy of the conventional local harmonic approximations used in the linearized semiclassical initial value representation (LSC-IVR)/classical Wigner approximation for real-time correlation functions. The key idea is the realization that a local Gaussian approximation (LGA) for the momentum distribution (from the Wigner function involving the Boltzmann operator) can be a good approximation even when a local harmonic approximation (LHA) for the potential energy surface fails. Along these lines, we have recently proposed three novel approaches for generating trajectory-based dynamics that conserve the canonical distribution in the phase-space formulation of quantum mechanics. They provide a framework for going beyond the traditional quantum statistical potential approach to capture quantum effects in real time.
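For reference, the LSC-IVR/classical Wigner approximation mentioned above writes a thermal correlation function in the standard form:

```latex
C_{AB}(t) \approx \frac{1}{(2\pi\hbar)^{N}} \int d\mathbf{p}_0\, d\mathbf{q}_0\;
A^{\beta}_{W}\!\left(\mathbf{p}_0, \mathbf{q}_0\right)\,
B_{W}\!\left(\mathbf{p}_t, \mathbf{q}_t\right)
```

where A_W^β is the Wigner transform of e^{-βĤ}Â, B_W that of the operator B̂, and (p_t, q_t) evolves from (p_0, q_0) along a classical trajectory. The LGA enters in how the momentum distribution inside A_W^β is approximated, which is where the treatment of local imaginary frequencies matters.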

Applications of semiclassical methodologies to fusion plasmas will provide deeper insight into their dynamical properties. Currently we are collaborating with Lawrence Livermore National Laboratory to apply the semiclassical approaches to the study of energy fluxes in fusion plasmas, to understand how energy is transferred between different modes and transported on macroscopic scales. We are also developing algorithms to improve the efficiency and scalability of the methodologies. For instance, when dealing with systems with a large number of particles on massively parallel computers, the scalability of the short- and long-range force terms is very different. Typically, 1,000-10,000 particles will be the standard system size in our studies, and we will divide the processes into two subsets accordingly.

The important role of liquid water in many areas of science, from chemistry, physics, biology, and geology to climate research, has motivated numerous theoretical studies of its structure and dynamics. The significance of quantum effects on the properties of water, however, has not yet been fully resolved. In this work we focus on quantum dynamical effects in liquid water based on the linearized semiclassical initial value representation (LSC-IVR) with a quantum version of the simple point charge/flexible (q-SPC/fw) model for the potential energy function. The infrared (IR) absorption spectrum and the translational diffusion constants have been obtained from the corresponding thermal correlation functions, and the effects of inter-molecular and intra-molecular correlations have been studied. Currently we are applying the LSC-IVR to study liquid water using an ab initio based, flexible, and polarizable force field. The system consists of hundreds of molecules (thousands of atoms), and the computation requires massively parallel computers.
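The IR absorption spectrum referred to here is obtained, up to constants and the choice of quantum-correction convention, from the Fourier transform of the dipole autocorrelation function:

```latex
I(\omega) \propto \omega \left( 1 - e^{-\beta\hbar\omega} \right)
\int_{-\infty}^{\infty} dt\; e^{-i\omega t}
\left\langle \boldsymbol{\mu}(0) \cdot \boldsymbol{\mu}(t) \right\rangle
```

where μ(t) is the total dipole moment of the simulation cell and β = 1/k_BT. The LSC-IVR supplies an approximation to this quantum time correlation function from an ensemble of classical trajectories sampled from the Wigner-transformed Boltzmann distribution.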

First Principles Characterization of van der Waals Dispersion Interactions

Deyu Lu, Brookhaven National Lab

NISE project m1288

NISE Award: 350,000 Hours
Award Date: March 2011

The proposed NISE project will deliver a theoretical tool (a modified version of Quantum Espresso) based on first principles, which can be used to understand and characterize material structures at the atomic level. Since the underlying methodology is very general, this tool can be applied to systems of different dimensionalities and different levels of complexity, such as novel nanomaterials for renewable-energy applications.

The van der Waals dispersion interaction plays an important role in determining the atomic structure of various materials, for example molecular crystals and organic/inorganic interfaces with promising potential applications in molecular detection and renewable energy generation. Theoretically, it remains a challenging research frontier to develop efficient first-principles methods to treat vdW dispersion interactions, since the key element, the non-local correlation effect, is typically not incorporated in many standard approaches, such as density functional theory (DFT) under the local and semi-local approximations.

In this project, we will develop new theoretical models and computer programs based on the adiabatic-connection fluctuation-dissipation theorem, which treats the non-local correlation effect rigorously by evaluating the dielectric response function. Currently, the dielectric response function is normally obtained under the random phase approximation (RPA), i.e., at the level of time-dependent Hartree theory. Although RPA leads to significant improvement over DFT, the absolute correlation energies in RPA are substantially overestimated, and the cohesive energies are slightly underestimated. Our goal is to address these issues by introducing a non-local exchange-correlation kernel, which is neglected in RPA.
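The adiabatic-connection fluctuation-dissipation correlation energy underlying this approach has the standard form:

```latex
E_c = -\frac{1}{2\pi} \int_0^1 d\lambda \int_0^\infty d\omega\;
\mathrm{Tr}\!\left[ v \left( \chi_\lambda(i\omega) - \chi_0(i\omega) \right) \right]
```

where v is the Coulomb interaction, χ_0 the independent-particle response, and χ_λ the interacting response at coupling strength λ. The RPA corresponds to the Dyson equation χ_λ = χ_0 + χ_0 λv χ_λ; the exchange-correlation kernel f_xc pursued in this project enters by replacing λv with λv + f_xc^λ in that equation, which is how it corrects the RPA correlation energies.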

Previous model exchange-correlation kernels were mostly derived in the context of the homogeneous electron gas (HEG). A few extensions have been made to treat inhomogeneous systems; however, these methods either are computationally too demanding [1, 2] or seriously violate some physical constraints. Our goal will be achieved in two phases. First, we will generalize previous exchange-correlation models to inhomogeneous systems in a form that can be computed efficiently and also obeys the proper physical constraints. Special attention will be paid to designing an Ansatz in which real-space and reciprocal-space variables are fully separated, in order to achieve an efficient numerical implementation. Second, benchmark calculations will be carried out on bulk silicon and rare-gas dimers to evaluate the performance of the various model exchange-correlation kernels.

The success of the proposed NISE project is strongly tied to the PI's expertise in studying vdW interactions. The PI is a main developer of a first-principles RPA program with excellent scaling behavior, suitable for large-scale parallel computing [4, 5]. This program is based on a modified version of Quantum Espresso and has been used to study vdW interactions in a wide range of materials, including bulk silicon, rare gas dimers, molecular crystals, and self-assembled organic monolayers [5-7]. Our new model exchange-correlation kernel has been implemented in the RPA code, and preliminary results are promising. Work is in progress to refine the analytical form of our model to better reproduce the known results for the HEG and to perform benchmark calculations for prototypical inhomogeneous systems.

Simulating ETG Plasma Turbulence with Impurities

David Mikkelsen, Princeton Plasma Physics Lab

Associated NERSC Project: Experimental Tests of Gyrokinetic Simulations of Microturbulence (m64)

NISE Award: 3,000,000 Hours
Award Date: June 2011

Understanding plasma turbulence, particularly electron plasma turbulence, is a key step in the development of efficient nuclear fusion reactors. Nuclear fusion promises to be an environmentally friendly and plentiful energy source, but reactor performance is diminished as the plasma becomes turbulent. One possible way to control plasma turbulence is through the magnetic configuration: reversing the magnetic shear is known experimentally to reduce turbulence and trigger high plasma performance. If we can gain a theoretical understanding of this mechanism, we are one step closer to a new energy source.

Recent simulations of plasma turbulence in the National Spherical Torus Experiment demonstrate that magnetic shear can raise the nonlinear threshold for electron temperature gradient (ETG) driven turbulence. This effect may explain the observation of electron internal transport barriers in these plasmas, a regime of enhanced thermal confinement known to be triggered experimentally by reversing the magnetic shear in the plasma.

However, these simulations, while performed with realistic majority-ion plasma densities, do not take into account the lower density of impurity ions. These could be important because the major impurity in NSTX is carbon, which, thanks to its high charge, significantly raises the effective charge (Zeff) of the plasma. In the ‘adiabatic ion’ approximation, raising Zeff is known to linearly reduce the drive of ETG turbulence, so including carbon should alter the linear threshold for turbulent plasma transport. The present nonlinear simulations include a kinetic treatment of ions because this is known to be important for achieving realistic turbulence saturation levels; the net impact of adding an impurity ion to the dynamics is unknown.

To date there have been no nonlinear simulations of ETG turbulence that include multiple kinetic ion species, as simulations that accurately account for even a single kinetic ion species are already expensive (~100,000 CPU hours). We need to add impurities in order to obtain realistic simulations of the effects of magnetic shear on nonlinear ETG turbulence in NSTX. Doing so requires scanning electron temperature gradients with strongly reversed shear using electrons, deuterium, and carbon at realistic density levels.

Coupled Electronic and Nuclear Dynamics in Enzymes and Photocatalysts

Thomas Miller, California Institute of Technology

Associated NERSC Project: Sampling diffusive dynamics on long timescales, and simulating the coupled dynamics of electrons and nuclei (m822)

NISE Award: 750,000 Hours
Award Date: March 2011

We aim to design and employ novel simulation techniques that provide crucial insights and new methodologies for understanding fundamental chemical processes relevant to the chemistries of energy and health.

Charge transfer processes, including electron transfer, hydrogen/proton/hydride transfer, and proton-coupled electron transfer (PCET) reactions, are central to the chemistry of energy conversion, respiration, and enzyme kinetics. But key aspects of such reactions remain poorly understood due to the coupling of intrinsically quantum motions to the slower, classical motions of the surrounding environment. We propose to employ the large-scale NERSC computational resources to perform direct simulations of these processes, revealing the detailed mechanisms and the nature of the dynamic coupling between the environment and the transferred quantum particle(s). These simulations will yield breakthroughs in our fundamental understanding of the reaction dynamics of energy-related chemical processes, a key NERSC research goal. In particular, this work will open the door to future simulation studies of PCET reactions in energy-related materials and enzymes.

The specific research objectives of this proposal are (1) to extend path-integral simulation techniques to study dynamical coupling between the reaction center and enzyme environment in the hydride transfer reaction catalyzed by dihydrofolate reductase, and (2) to discover the mechanism and to explain the kinetics of a prototypical PCET reaction in biomimetic inorganic chemistry, namely the transfer of a proton and electron in the tyrosine reduction step of photosystem II (PSII). These systems pose outstanding mechanistic questions that demand detailed simulation studies.

The Miller group has recently extended the ring polymer molecular dynamics (RPMD) method to directly simulate the coupled dynamics of electrons and nuclei in complex systems. The RPMD method uses the Feynman path integral formulation of statistical mechanics to map quantum mechanical particles onto an isomorphic classical mechanical system. In a series of proof-of-principle studies, we have demonstrated that the model provides an accurate description of excess electron diffusion in liquids, the dynamics of electron localization and trapping following high-energy injection, and the dynamics and kinetics of electron transfer between explicitly solvated metal ions. By combining RPMD with conventional classical molecular dynamics technology, as well as well-established rare-event sampling methods such as transition path sampling and transition interface sampling, we have demonstrated that massively parallel computational resources can be efficiently utilized to extract the mechanism and kinetics of reactions involving coupled electronic and nuclear degrees of freedom.
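In the standard RPMD formulation (sketched here in generic notation; the group's implementation may differ in detail), each quantum particle of mass $m$ at inverse temperature $\beta$ is mapped onto a ring polymer of $n$ classical beads governed by the Hamiltonian

$$H_n(\mathbf{p},\mathbf{q}) = \sum_{j=1}^{n}\left[\frac{p_j^2}{2m} + \frac{1}{2} m \omega_n^2 \left(q_j - q_{j+1}\right)^2 + V(q_j)\right], \qquad \omega_n = \frac{n}{\beta\hbar}, \quad q_{n+1} \equiv q_1,$$

whose classical dynamics can then be propagated with conventional molecular dynamics machinery, which is what makes the method compatible with massively parallel MD codes.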

RPMD is a method for describing the dynamics of a quantum mechanical system, given an underlying potential energy surface. In the proposed simulation studies, we will employ conventional atomistic force-fields (such as SPC/E and CHARMM) for the interactions among solvent molecules and for interactions within the enzyme or inorganic compounds. We will employ previously developed one-electron pseudopotentials for the interactions of the transferring electron with the solvent environment and the metal centers on the inorganic compounds, and we will employ the empirical valence bond method or analytical potentials to describe the interactions of the transferring proton with its donor and acceptor atoms.

Given adequate computational resources such as those available on the NERSC systems, we are ready to begin using our new methodology to investigate breakthrough applications, such as those described above. The NERSC Initiative for Scientific Exploration award is thus instrumental to the elucidation of biological and synthetic pathways for solar energy conversion and catalysis.

Proton-coupled electron transfer in photocatalysis and protein translocation in biosynthesis: Bridging lengthscales and timescales in molecular simulation

Thomas Miller, California Institute of Technology

Associated NERSC Project: Sampling diffusive dynamics on long timescales, and simulating the coupled dynamics of electrons and nuclei (m822)

NISE Award: 10,000,000 Hours
Award Date: June 2011

We aim to develop and employ novel simulation techniques to advance the understanding of photosynthetic water splitting and protein targeting in cells. Key applications of this research include the chemistries of energy and health.

Research Aim 1. Proton-coupled electron transfer in enzyme and biomimetic catalysis

Proton-coupled electron transfer (PCET) reactions are central to the chemistry of energy conversion, respiration, and enzyme kinetics. But key aspects of the kinetics and mechanism of PCET remain poorly understood due to the coupling of intrinsically quantum motions to the slower, classical motions of the surrounding environment. We propose to employ the large-scale NERSC computational resources to perform direct simulations of these processes to reveal the detailed mechanisms and the nature of the dynamic coupling between the environment and the transferred quantum particle(s). In particular, we propose path-integral molecular dynamics simulation studies of PCET dynamics in solvated iron bi-imidazoline complexes, which are a key prototype for bioinorganic catalysis and an important model system for understanding solar photocatalytic water splitting in photosystem II.

The Miller group has recently extended the ring polymer molecular dynamics (RPMD) method to directly simulate the coupled dynamics of electrons and nuclei in complex systems. The RPMD method uses the Feynman path integral formulation of statistical mechanics to map quantum mechanical particles onto an isomorphic classical mechanical system. We have demonstrated that the model provides an accurate description of excess electron diffusion in liquids, the dynamics of electron localization and trapping following high-energy injection, the dynamics and kinetics of electron transfer between explicitly solvated metal ions, and the dynamics of hydride transfer in the dihydrofolate reductase enzyme with over 14,000 atoms. By combining RPMD with conventional classical molecular dynamics technology, including rare-event sampling methods such as transition path sampling and transition interface sampling, we have demonstrated that massively parallel computational resources can be efficiently utilized to extract the mechanism and kinetics of reactions involving coupled electronic and nuclear degrees of freedom. The proposed RPMD studies of iron bi-imidazoline complexes will yield breakthroughs in our understanding of the PCET mechanism by providing the first direct simulations of PCET in a molecular system.

Research Aim 2. Sec-facilitated protein translocation and membrane integration

The Sec translocon is the central molecular component in protein targeting pathways that encompass all kingdoms of life. It is a hetero-trimeric complex of membrane-bound proteins that forms a passive channel for post-translational and co-translational protein translocation, as well as the co-translational integration of proteins into the phospholipid bilayer. The translocon operates in collaboration with molecular motors that provide the driving force for insertion of peptides into the channel. In the co-translational pathway, the peptide substrate is driven into the translocon by the bound ribosome; in the post-translational pathway for bacteria, this role is performed by the SecA motor. Structural, biochemical, genetic, and computational studies indicate that the translocon undergoes large-scale conformational changes during both peptide secretion and membrane integration, and growing evidence suggests that SecA similarly undergoes major conformational changes to couple the hydrolysis of ATP to the peptide insertion driving force. However, basic mechanistic features of these most fundamental cellular processes remain poorly understood. The mechanism and regulation of the translocon are governed by protein-protein interactions and slow conformational changes. Critical aspects of the translocon regulation mechanism cannot be addressed without simulations that span the slowest relaxation timescales of the channel/motor/substrate complex. We propose the use of large-scale NERSC simulations to address two such aspects. First, we have developed a new coarse-grained simulation approach that enables us to span the timescales for direct simulation of protein translocation and membrane integration via the Sec translocon; we propose to use this coarse-grained approach to discover the molecular origin for the dependence of translocon regulation on substrate sequence and charge distribution. 
Secondly, we propose atomistic simulations using enhanced sampling techniques to accurately characterize the free energy landscape for the translocon and to predict translocon and substrate mutations that will inhibit and re-direct the cellular pathways for protein secretion and membrane integration.

With adequate computational resources, such as those available on the NERSC systems, we are ready to begin the critical applications described above. The NERSC Initiative for Scientific Exploration award will be instrumental in our efforts to elucidate biological and synthetic pathways for transport and catalysis.

Three Dimensional Models for Martensitic Phase Transformations

Benson Muite, University of Michigan

NISE project m1116

NISE Award: 200,000 Hours
Award Date: March 2011

We propose to build and test a model for the nonlinear solid continua used in shape memory alloys (smart materials). Smart materials are used in, among other applications, actuators, eyeglass frames, disk-drive arms, and other electromechanical devices. At present there are no good models for the dynamic behavior of these materials; we would like to change this.

We would like to build and test a three-dimensional pseudospectral code for martensitic phase transformations. In two dimensions, we have already found and explained interesting features not captured by other models. Three-dimensional simulations require grids in excess of 6144^3 points, and the resulting memory requirements necessitate the use of a supercomputer, along with data storage and visualization facilities such as those provided by NERSC.
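To see why a supercomputer is required, a back-of-envelope estimate (assuming one double-precision value per grid point per field; the actual code presumably stores several such fields plus FFT work arrays):

```python
n = 6144                        # grid points per dimension
points = n ** 3                 # total grid points (~2.3e11)
bytes_per_field = 8 * points    # one double-precision scalar field
tib_per_field = bytes_per_field / 1024 ** 4  # ~1.69 TiB per field
```

Even a single scalar field at this resolution exceeds the memory of any workstation, so the data must be distributed across many nodes.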

Molecular Simulations of the Release of Phosphate and ADP from F1-ATPase to Aid in the Understanding of the Function of F1-ATPase

Victor Ovchinnikov, Massachusetts Institute of Technology

Associated NERSC Project: Understanding the function of protein and nucleic acid components of cellular machines necessary for optimizing biofuel production (gc8), Principal Investigator: Martin Karplus

NISE Award: 1,000,000 Hours
Award Date: March 2011

The focus of this proposal is to increase our understanding of the mechanism of phosphate and ADP release, and of the steps at which they are released, during the function of F1-ATPase. Understanding this key step will ultimately allow us to understand the entire mechanism of chemomechanical coupling in F0F1-ATPase. In addition, the results will aid in modifying the systems we study to make them more useful in the biological production of energy.

The objective of the proposed study is to provide a complete model of F1-ATPase (the catalytic moiety of F0F1-ATPase) action, which is essential to understand the motor mechanism of F0F1-ATPase. Recent experiments suggest that the release of phosphate (Pi), one of the products of ATP hydrolysis, from one of the active sites generates a significant portion of the torque to drive the γ-stalk rotation of F1-ATPase. However, it is not known which of the three nonequivalent subunits of F1-ATPase is the source of ADP and Pi.

We propose to investigate the mechanisms and pathways of the release of Pi and ADP from the β-subunit in different conformations of F1-ATPase. To overcome the long time scale of the actual Pi (and ADP) release, multicopy enhanced sampling (MCES) (1) simulations will be used to accelerate the release process. The simulations will be repeated multiple times to obtain sufficient statistics, with each simulation starting from a different distribution of initial velocities. The multiple simulations will be carried out using the multiple MPI communicators that we recently implemented in the CHARMM program. Since the individual simulations require little communication with one another, we expect near-perfect parallel efficiency. After finding the Pi (and ADP) release pathways, the string method (2) will be applied to compute the free energy along each pathway.
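The communicator layout can be sketched as follows (a schematic in plain Python; `copy_color` and the group size are illustrative, not the actual CHARMM implementation). Each rank computes a "color", and ranks sharing a color are split into an independent sub-communicator, e.g. `comm.Split(color, key=rank)` in MPI:

```python
def copy_color(rank, ranks_per_copy):
    """Color used to split the world communicator into independent
    per-copy communicators: ranks with the same color form one copy."""
    return rank // ranks_per_copy

# e.g. 8 ranks with 2 ranks per copy -> 4 independent simulation copies
colors = [copy_color(r, 2) for r in range(8)]
```

Because each copy communicates only within its own sub-communicator, the copies run essentially independently, which is why near-perfect parallel efficiency is expected.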

Turbulence over Complex Terrain: A Wind Energy Perspective

Edward Patton, National Center for Atmospheric Research

NISE project m917

NISE Award: 5,500,000 Hours
Award Date: January and June 2011

This project involves the investigation of turbulence interacting with orography, both with and without vegetation. In the recommendations from a recent DOE-sponsored workshop on wind resource characterization (Schreck et al., 2008), canopy turbulence and its interaction with orography were noted as critical aspects hindering the wind energy industry in its efforts to meet the government-mandated goal of 20% of US energy coming from renewables by 2020. Vegetation interacts with turbulence over orography in complex ways that are only now beginning to be understood. Recently, it has become known that the pressure drag imposed on the flow by the trees can interact with the orographically induced pressure drag in ways that shift the pressure minima downstream of the hill crest, compared to flow over a hill of equivalent roughness but with no resolved canopy. This shift in the pressure minima essentially makes the hill appear steeper to the flow than it otherwise would, and can induce separation on the lee side of terrain that would not be expected from current theories. This turbulence generation mechanism could be extremely damaging for wind turbines on subsequent ridges. Proper characterization of this process is essential for wind turbine deployment strategy and for turbine designs capable of withstanding these environments. Further investigation into the relationship between hill length, canopy height, and canopy density variations is required before we can develop parameterizations that account for these essential processes in codes generating inflow conditions used in turbine load testing, and in predicting the impact of wind farms on downstream climate.

Wind turbines are frequently deployed in regions of undulating topography to take advantage of the expected speed-up of wind as the atmosphere is forced up over a hill. While this reasoning is quite simple for an idealized isolated hill in a non-stratified, non-vegetated environment, vegetation and stratification interact with turbulence over orography in complex ways that are not yet clearly established. We intend to evaluate turbulence length scales and how they vary across this parameter space, which will guide our understanding of the mechanisms generating the turbulence and our ability to represent the influence of vegetation on turbulence over terrain typical of turbine deployment.

Performance Enhancements for Three-Dimensional Fast Fourier Transforms Library P3DFFT

Dmitry Pekurovsky, University of California San Diego

NISE project m1243.

NISE Award: 350,000 Hours
Award Date: March 2011

Three-dimensional Fast Fourier Transforms (FFTs) are used in a wide range of scientific computing applications, such as direct numerical simulation of turbulence, astrophysics, materials science, oceanography, and 3D tomography. They are often the performance bottleneck for such codes on petascale systems. Any performance improvement in P3DFFT will translate directly into benefits for many users in these fields, since this open-source library is well known and used as a building block in cutting-edge research codes.

P3DFFT is an open-source library for ultra-scalable FFTs in three dimensions. It is used in many applications designed for large-scale simulations on petascale systems. The library has demonstrated excellent scalability on up to O(10^5) cores on systems such as Jaguar and IBM Blue Gene.

This project will be focused on further improving performance and scalability of P3DFFT, putting it in a position to take advantage of the latest technological advances in supercomputer design. This work will follow two main directions.

  1. Currently P3DFFT relies on MPI_Alltoall(v) for a crucial global transpose operation. This is a blocking operation and does not allow any overlap of communication with computation. Such overlap will be explored through the use of one-sided communication (via MPI-2 or SHMEM).
  2. The multicore design of Hopper, as well as most other petascale platforms, necessitates adaptation of the standard flat-MPI programming model. In particular, a hybrid MPI/OpenMP implementation will be developed and tested, in order to make sure the nodes and interconnect are used in an optimal way.
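The role of the global transposes can be seen in a serial numpy sketch of the transpose-based (pencil) scheme that libraries like P3DFFT parallelize: the 3D FFT is computed as three rounds of 1D FFTs along the contiguous axis, separated by array transposes. In the distributed code, each transpose becomes a global MPI_Alltoallv exchange between pencil decompositions, which is exactly the blocking step the project aims to overlap with computation. (This is an illustrative reconstruction, not P3DFFT source code.)

```python
import numpy as np

def fft3d_pencil(a):
    """3D FFT via 1D FFTs along the contiguous axis plus transposes.
    In a distributed implementation, each transpose is an all-to-all
    exchange between pencil layouts."""
    a = np.fft.fft(a, axis=-1)   # FFT along x: (z, y, x) -> (z, y, kx)
    a = a.transpose(0, 2, 1)     # "global transpose": (z, y, kx) -> (z, kx, y)
    a = np.fft.fft(a, axis=-1)   # FFT along y: (z, kx, y) -> (z, kx, ky)
    a = a.transpose(2, 1, 0)     # second transpose: (z, kx, ky) -> (ky, kx, z)
    a = np.fft.fft(a, axis=-1)   # FFT along z: (ky, kx, z) -> (ky, kx, kz)
    return a.transpose(2, 0, 1)  # restore (kz, ky, kx) ordering
```

The result matches `np.fft.fftn`; the point of the sketch is that the two transposes are the communication-heavy steps whose cost the proposed one-sided and hybrid MPI/OpenMP work targets.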

Accelerated Materials Design Towards the Materials Genome

Kristin Persson, Lawrence Berkeley National Laboratory

Associated NERSC Project: The Materials Genome (matgen)

NISE Award: 2,500,000 Hours
Award Date: March 2011

One of the foremost reasons materials discovery takes so long is the lack of comprehensive knowledge about materials, organized for easy analysis and rational design. The goal of the proposed work is to accelerate materials discovery by creating a high-throughput computing environment together with a searchable, interactive database of computed materials properties. The idea is driven by our urgent need for new materials to realize an economy based on renewable energy. Today it takes 18 years, on average, from materials conception to commercialization. Unless we implement new approaches to the materials discovery and development process, we cannot expect to have a significant impact on our energy economy in the 20-30 year time frame needed to address the climate issue.

Materials innovation today is largely done by intuition and based on the experience of single investigators. Google has demonstrated the value of organizing data and making it available and searchable to large communities. We propose to leverage the information age for materials using the only tool that can efficiently scan multiple properties for all known materials in a reasonable amount of time: computations. First-principles calculations have reached the point of accuracy where many materials properties relevant for photovoltaics, batteries, and thermoelectrics can be reliably predicted. Most importantly, computational approaches are, by the nature of their scalability and objectivity, the most efficient and consistent way of scanning Nature in search of optimal material solutions. Once computed, we will store and organize the data so that it can be readily accessed by data mining and statistical learning algorithms. This extensive knowledge of materials’ properties, linked to their structure and chemistry, will enable us to invert the design problem: specifying the needed arrangement of atoms, and how it can be made, to target desired properties and constraints.

Triple Photoionization of the Li atom

Mitch Pindzola, Auburn University

Associated NERSC Project: Computational Atomic Physics for Fusion Energy (m41)

NISE Award: 3,000,000 Hours
Award Date: June 2011

We propose to calculate the energy and angle differential cross sections for the triple photoionization of the Li atom using a modified version of the Time-Dependent Close-Coupling (TDCC) 3D code, which has previously been applied successfully to calculate total and energy differential cross sections. We will modify the TDCC-3D code to perform domain decomposition of a 384^3 numerical lattice over 24^3 to 48^3 cores. Construction of the final coordinate-space wavefunction is dominated by message passing between nearest-neighbor cores. Construction of the final momentum-space wavefunctions is dominated by a reduction over all cores. We plan to try new numerical methods to help speed up the reduction step. Besides changing the numerical method used in the projection step, we also plan to investigate using MPI between processors and OpenMP between cores.
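The decomposition arithmetic is straightforward (a sketch of the stated configuration; the actual halo widths and layout are determined by the TDCC-3D stencil):

```python
def subdomain_edge(lattice_n, cores_per_dim):
    """Edge length of each core's cubic subdomain when an n^3 lattice
    is split evenly over p^3 cores (requires n divisible by p).
    Nearest-neighbor message passing then exchanges the faces of
    these subdomains."""
    assert lattice_n % cores_per_dim == 0
    return lattice_n // cores_per_dim
```

For the proposed runs, 384^3 points over 24^3 cores gives 16^3 points per core, and over 48^3 cores only 8^3 points per core, which is why the all-core reduction, rather than local work, dominates at the high end.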

Implementation of C-MAD Parallel Code to Include New Physics and Capabilities for Future Particle Colliders

Mauro Pivi, Stanford Linear Accelerator Center

Associated NERSC Project: Beam dynamics issues in the new high-intensity particle colliders and storage rings (mp93)
Principal Investigator: Miguel Furman, Stanford Linear Accelerator Center

NISE Award: 300,000 Hours
Award Date: March 2011

A novel single-bunch broadband feedback system is considered for the LHC complex at CERN to suppress the electron cloud instability. Upon successful testing of a prototype system, a full scale feedback system will be installed at CERN. Simulations are crucial to assess the feasibility and to optimize the design of the system.

Furthermore, a particular concern for meeting the emittance specifications of the ILC and CLIC linear collider damping rings is the possibility of emittance growth (i.e., growth in beam size) occurring in the presence of even very low electron cloud densities. Recent simulations and measurements suggest that this effect may be a significant limitation to reaching the colliders' performance targets. While considerable work remains to quantify this issue precisely, initial results suggest that the acceptable cloud densities may need to be lowered by a factor of several.

We are in the process of extending the existing parallel code C-MAD (M. Pivi, SLAC), used for simulations of beam instabilities in particle accelerators, by incorporating new physics such as radiation damping and quantum excitation, a single-bunch feedback system, and intra-beam scattering routines. This will allow us to understand the physics of particle instabilities at the LHC and at future colliders such as the Linear Colliders and Super-B.

We would also like to make use of parallel FFT routines to solve the Poisson equation for many-particle systems. In doing so, we could increase the number of processors used for the computation by a large factor. We will benchmark the simulations against data from the CesrTA test accelerator in operation at Cornell University, NY.
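As an illustration of the FFT-based approach (a serial 2D numpy sketch under periodic boundary conditions; the production solver would be 3D, distributed, and use the beam pipe's actual boundary conditions), the Poisson equation is solved by dividing each Fourier mode of the charge density by k^2:

```python
import numpy as np

def poisson_fft(rho, L=2.0 * np.pi):
    """Solve  laplacian(phi) = -rho  on a periodic square of side L
    via FFT: phi_hat = rho_hat / k^2, with the k = 0 mode set to
    zero (fixing the mean of phi)."""
    n = rho.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx ** 2 + ky ** 2
    rho_hat = np.fft.fftn(rho)
    phi_hat = np.zeros_like(rho_hat)
    mask = k2 > 0
    phi_hat[mask] = rho_hat[mask] / k2[mask]
    return np.fft.ifftn(phi_hat).real
```

Because the FFTs dominate the cost, parallelizing them (as proposed) lets the space-charge field calculation scale to many more processors.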

Morphology of Young Core-Collapse Supernova Remnants from Multi-Physics Three-Dimensional Simulations

Tomasz Plewa, Florida State University

Associated NERSC Project: Supernova Explosions in Three Dimensions (m461)

NISE Award: 1,000,000 Hours
Award Date: March 2011

The project will provide data for astrophysics and highlight current problems in computational hydrodynamics, multiphysics modeling, and high-performance computing. It will offer exploratory data for visualization and analytics. Our application results, and the results of our verification and benchmark studies, will be fully documented and made publicly available, laying a foundation for future collaborations and stimulating the research of our competitors. The results of this project are highly relevant to our high-energy-density-physics supernova experiment to be executed at the National Ignition Facility.

We propose to expand the research topics of our group into the field of core-collapse supernovae. We have extensive experience in modeling core-collapse supernova explosions, both in modeling the explosion itself (about 1 second after the core bounce) and the early interaction between the supernova shock and the progenitor envelope (a few hours after the core bounce). Until now such studies were done using at least two different codes, one for the explosion and another for the later evolution. Furthermore, the explosion phase is almost exclusively modeled in spherical geometry in 3D or by assuming axisymmetry in 2D. This inherently poses a problem at the symmetry axis, resulting in significant and hard-to-quantify discretization errors, and means the region near the symmetry axis must be treated in a special way, in particular by excluding it from computations. The role of such modifications is very difficult to assess, and it is conceivable that they affect the explosion process. Finally, the central region of the mesh is excluded from models due to time-step restrictions imposed in spherical geometry near the origin (where the neutron star sits). This prevents obtaining self-consistent models that would account for the evolution of the neutron star and its interaction with the post-shock region.

In this research study, we will address some of the above issues. We propose to study the supernova explosion within a single simulation using a single code, starting with the onset of neutrino driven convection until the expansion of stellar ejecta is homologous (several weeks after explosion).  One element required to validate the supernova explosion models is to obtain model observables. For supernovae, observations almost exclusively cover epochs much later (weeks, young supernova remnant phase) than the explosion itself (seconds). This motivates our goal of adapting the hybrid characteristic ray tracing algorithm (HCRT) for photons to neutrino transport in core-collapse supernovae (ccSNe). Such a method also provides a much more realistic treatment of energy deposition, as current non-radiation hydrodynamics (non-RHD) models approximate the effects by assuming an energy deposition distribution. This approach also provides a computationally feasible alternative to Boltzmann neutrino transport models which are on the order of 100 times more expensive. The HCRT method uses parameterized neutrino fluxes as the boundary conditions for the neutrino transport which is expected to yield accurate results as long as the description of neutrino physics does not severely change the ratio of neutrino heating timescale to the timescale of participating hydrodynamic instabilities (e.g. standing accretion shock instability, neutrino driven convection). These simplifications enable parameter studies of supernova explosions in 3D while maintaining a realistic treatment of the physics involved. In addition, implementing HCRT on adaptive Cartesian meshes will allow us to mitigate domain selection choices that have potentially biased previous simulations. By following this path we will also be enabling future studies with complex geometric configurations and neutrino source distributions.

By limiting the assumptions about the progenitor star (in contrast to non-RHD) we will be able to explore possible links between progenitor structure, the explosion mechanism, and young supernova remnant morphology, therefore enabling validation of our models. In particular, we will investigate the SN remnant (SNR) morphology dependence on the rate of stellar core rotation and its structure. This study will supplement the group’s current SN-motivated experimental project at the National Ignition Facility. This project will be the sole focus of one doctoral student with extensive practical experience with our supernova code and the NERSC facility (Timothy Handy).

This implementation will be designed for high performance, distributed memory systems. The HCRT algorithm scales up to 512 cores, and we expect our initial implementation will achieve this level of performance. We will conduct a detailed analysis of parallel performance, identify bottlenecks, improve scalability, and enable computing models with several thousand processors. We are also interested in exploring the use of accelerators (GPUs, FPGAs, etc.) to speed up calculations, in an effort to prepare for exascale platforms.

Large High-Resolution Galaxy Simulation Program

Joel Primack, University of California - Santa Cruz

Associated NERSC Project: Galaxy Formation Simulations (mp363)

NISE Award: 1,500,000 Hours
Award Date: June 2011

A combination of observations with the Hubble Space Telescope and simulations on NERSC's largest system, Hopper-II, will allow us to resolve many mysteries of galaxy formation. One of the key issues is the formation of bulges of spiral galaxies and the formation of elliptical galaxies. Current theory (mostly hydro simulations) has two competing scenarios. Bulges can be formed either in large mergers of galaxies or through violent instabilities developing in disk galaxies in the early Universe. With few high-quality simulations available so far (about a dozen per year by different groups) it is difficult to interpret observations and resolve this critical issue. The situation will be different once we simulate hundreds of galaxies with different masses, merging histories, and environments.

The Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS) is a very large (over 900 orbits) new Hubble Space Telescope (HST) observational program to study the first third of galactic evolution from redshift $z = 8$ to 1.5 (0.6 to 4.5 billion years after the Big Bang) via deep imaging of more than 250,000 galaxies exploiting the new WFC3 camera on HST. Five premier multi-wavelength sky regions are targeted; each has multi-wavelength data and also extensive spectroscopy of the brighter galaxies. We are participants in CANDELS, and this NISE proposal intends to complement these new observations with a large suite of hydro plus N-body cosmological simulations of galaxies that include the numerous physical processes involved: hydrodynamics, radiative atomic and molecular cooling, stellar feedback, metal enrichment, radiation pressure, and the fueling of and feedback from supermassive black holes. Development of this simulation program was started on the NERSC Bassi system, as described and illustrated in the article ``Baby Brutes'' in the NERSC Annual Report 2008-2009, pp. 64-68. We have continued developing our codes on the NASA Ames Schirra and Pleiades systems. Several improvements have now been added, and the simulations are now even more realistic.

The many 24-core nodes on Hopper-II will allow us to do mass production of extremely high-resolution galaxy simulations, which was unthinkable even a few years ago. We propose to simulate a statistical sample of 300 galaxies, a factor of 20 more than we have at present. This will be by far the best set of cosmological simulations of galaxy formation. With hundreds of galaxies simulated we can also explore numerous issues that were difficult to address before, such as the effects of environment, the frequency and duration of major mergers, and the formation of spiral galaxy bulges and of elliptical galaxies. We now have tools -- including our premier radiative transfer code SUNRISE -- to produce the same quantities that we get from observations: realistic images in many wavebands including the effects of dust, spectral energy distributions, stellar masses and ages, and kinematics. This allows a direct comparison of the numerical simulations with astronomical observations.

Towards High Performance Urban Canopy Flows

Joe Prusa, Teraflux Corporation

Associated NERSC Project: Continuous Dynamic Grid Adaptation in a Global Atmospheric Model (m612)
Principal Investigator: William Gutowski, Iowa State University

NISE Award: 400,000 Hours
Award Date: March and June 2011

Advanced and flexible modeling capabilities that make simulation of pollution transport and dispersion processes possible in complex terrain and urban environments are of interest because the potential effect of a chemical or biological agent release on city populations remains a serious concern.

The aim of this proposal is to advance an efficient and scalable approach for pollution transport and dispersion (T&D) within complex urban environments. The urban canopy introduces strong multiscale inhomogeneities spanning spatial scales from sub-meter to kilometers and beyond.

The T&D problem requires both relatively high resolution, to resolve small-scale turbulence near the buildings, and long integration times, to capture transport of the pollutants across the full extent of real cities. An Immersed Boundary (IMB) method is used to represent building effects in the model grid. Transport of pollutants is done with either a passive tracer or a heavy-particle approach, which in practice uses a computationally demanding bin model to represent a distribution of particle properties.

In this project we plan to use the multi-scale multi-physics model EULAG, which has already been demonstrated to run efficiently on upwards of thousands of processors on a variety of high-end architectures. Because urban canopy problems require significant spatial resolution and large domain scales, they are computationally demanding and require extremely efficient use of the largest available resources. Therefore we intend to advance the performance of the current model algorithms in the range of 20K to 100K processor cores, paying particular attention to increasing peak performance while maintaining high model scalability.

Evaluating PGAS scientific graph analysis codes on the Gemini interconnect

Jason Riedy, Georgia Institute of Technology

Associated NERSC Project: Linear Algebra Algorithms on High Performance Computers (mp156)
Principal Investigator: James Demmel, University of California Berkeley

NISE Award: 250,000 Hours
Award Date: June 2011

Processing graph-structured data is central in many applications ranging from disease tracking through social networks to financial regulatory monitoring. Graph-structured analysis is an increasingly important performance bottleneck even in traditional scientific simulation codes. Current common programming systems for high-performance computers place building and optimizing graph analysis codes in the hands of the few, dedicated experts who still have a high tolerance for mental pain and anguish. We are expanding existing practice and evaluating new approaches for emerging programming systems and computing systems more amenable to high-performance graph analysis.

We propose porting our MPI-based, memory scalable weighted bipartite matching code to UPC and evaluating its performance on Hopper's Gemini interconnect. This memory-scalable code computes a static pivoting order used in SuperLU's scalable unsymmetric LU factorization.

The current MPI implementation is saddled with extra communication for coordinating semi-asynchronous tasks and also uses extra memory for explicitly maintaining local views of global information. Our matching algorithm is based on an iterative auction algorithm. The extra MPI communication calls needed as the problem reduces in size and moves across the graph (sparse matrix) defeat performance scalability, leaving our matcher only memory scalable in general.

UPC's more flexible memory model permits easier coordination as the problem size and active location changes. We expect at least a moderate constant factor improvement in performance on long-tailed iterations (e.g. on the sparse matrix av41092). UPC's relaxed memory consistency over the fine-grained Gemini network may reduce the communication and synchronization throughout our semi-asynchronous algorithm. Auction prices rise monotonically and are tolerant to different views during computation so long as only the best price is accepted. Using more fine-grained memory access will permit working with less memory overhead per process, enhancing our current memory scalability.
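The price-monotonicity argument above can be made concrete with a minimal serial sketch of a forward auction for dense bipartite assignment. This is our illustration of the algorithm family, not the group's distributed matcher; all names (`auction_matching`, `benefit`, `eps`) are ours.

```python
def auction_matching(benefit, eps=0.01):
    """Forward auction for maximum-benefit bipartite assignment.

    benefit[i][j] is the benefit of assigning person i to object j.
    Prices rise monotonically, which is why stale price views (as in a
    relaxed-consistency PGAS setting) only delay, never corrupt, the
    computation: a bid accepted at the best current price stays valid.
    """
    n = len(benefit)
    prices = [0.0] * n          # one price per object
    owner = [None] * n          # owner[j] = person currently holding object j
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # Net value of each object for person i at current prices.
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best_j = max(range(n), key=lambda j: values[j])
        best = values[best_j]
        second = max(v for j, v in enumerate(values) if j != best_j)
        # Bid: raise the price by the value margin plus eps.
        prices[best_j] += best - second + eps
        if owner[best_j] is not None:
            unassigned.append(owner[best_j])  # evict the previous owner
        owner[best_j] = i
    return owner, prices
```

Because prices only increase, a bidder working from an out-of-date price simply submits a bid that loses or triggers a re-bid; that is the tolerance to differing views that makes the fine-grained UPC port attractive.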

Naturally, these are ideals that may not be achieved by all UPC implementations. Investigating them on an architecture like the XE6, which supports distributed PGAS operations at a low level, helps evaluate what needs to be improved in UPC implementations.

For base performance modeling evaluation, we also propose to port our existing microbenchmarks and implement a simple Graph500 graph traversal suitable for use as a reference code. The bipartite matching algorithm, an auction algorithm, resists straightforward performance characterization. Having some predictable baseline results helps evaluate our matching code's performance optimizations.

Representing Convective Clouds in Global Circulation Models

David Romps, Lawrence Berkeley National Lab

Associated NERSC Project: Interactions of Clouds, Convection, and Climate (m1196)
Principal Investigator: William Collins, Lawrence Berkeley National Lab

NISE Award: 1,000,000 Hours
Award Date: March 2011

Convective clouds are very difficult to represent in GCMs in part because of their tendency to cluster together. This tendency -- called convective organization -- is influenced by the way in which clouds modify their local environment by changing the local radiation budget, surface fluxes, and atmospheric humidity. This last factor has not been quantified, and it is the goal of this project to do so.

This project will use the cloud-resolving model called Das Atmosphaerische Modell (DAM) to study the two-way interactions between small-scale clouds and the atmosphere's large-scale circulation and climate. DAM has been used extensively for the past three years on a variety of platforms, including Harvard's 10k-core Odyssey cluster and NASA's Columbia and Pleiades supercomputers.

Simulation of Inelastic Decoherence in Electronic Transport Through Nanoscale Structures

Sayeef Salahuddin, University of California - Berkeley

Associated NERSC Project: Massively parallel quantum transport simulation of nano scale electronic devices for ultra low power computing (m946)

NISE Award: 750,000 Hours
Award Date: March 2011

Inelastic scattering, where an electron can gain or lose energy, significantly affects the way electrons move through a nanostructure. Many experiments show that inelastic scattering presents a serious bottleneck in the efficiency of solar cells and thermoelectric devices. However, modeling inelastic scattering properly in a nanostructure while also capturing the quantum effects has been intractable due to the significant numerical burden. In this research, we shall attempt to use the physical symmetry of the problem and numerical algorithms to massively parallelize the simulation methodology so that simulation of decoherence in real structures can eventually be performed.

Inelastic scattering significantly affects the way electrons move through a nanostructure. Many experiments show that inelastic scattering presents a serious bottleneck in the efficiency of solar cells and thermoelectric devices. However, modeling inelastic scattering properly in a nanostructure presents a significant challenge. This is because, within a full quantum transport description, which is necessary to account for quantum effects in a nanostructure, inelastic scattering couples all the energy levels in the system. In addition, the broadening of states that ensues from such decoherence needs to be solved self-consistently with transport. This is why simulation of atomistic inelastic decoherence in electronic transport for realistic devices is often considered to be impossible.

We propose to massively parallelize the decoherence calculation within this proposal so that realistic structures can eventually be simulated. The key idea stems from the fact that only a few phonon/photon frequencies are coupled through inelastic decoherence. Therefore, we shall try to use OpenMP to take advantage of the multi-core nodes, where each node will handle the neighboring energy states in the calculation. A second way the parallelization will work is to take advantage of the need for self-consistency: all the relevant energies will be distributed to individual nodes, where each node will parallelize the coupling between neighboring frequencies. Additionally, each iteration will use the previous iteration as an input.
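The structure proposed above, an outer self-consistent loop with a parallel sweep over energy points that each touch only their E ± ħω neighbors, can be sketched with a toy model. This is a hypothetical scalar stand-in for the actual transport equations: all names and the update formula are ours, and Python threads stand in for OpenMP threads; it only illustrates the loop and coupling structure.

```python
from concurrent.futures import ThreadPoolExecutor

def scf_inelastic(energies, hbar_omega, g0, coupling, tol=1e-8, max_iter=200):
    """Toy self-consistent loop over an energy grid.

    Inelastic coupling links grid point k only to k +/- shift, where
    shift corresponds to one phonon quantum hbar*omega. The sweep over
    energy points is the natural parallel (OpenMP-like) dimension, and
    each outer iteration uses the previous one as input.
    """
    de = energies[1] - energies[0]
    shift = round(hbar_omega / de)     # neighbor offset in grid points
    n = len(energies)
    g = list(g0)

    def update(k):
        # "Self-energy" from emission/absorption neighbors (zero off-grid).
        lo = g[k - shift] if k - shift >= 0 else 0.0
        hi = g[k + shift] if k + shift < n else 0.0
        sigma = coupling * (lo + hi)
        return g0[k] + sigma * g0[k]   # lowest-order toy update

    for _ in range(max_iter):
        with ThreadPoolExecutor() as pool:       # parallel energy sweep
            g_new = list(pool.map(update, range(n)))
        if max(abs(a - b) for a, b in zip(g_new, g)) < tol:
            return g_new
        g = g_new
    return g
```

The point of the sketch is the data-access pattern: each energy point reads only two neighbors from the previous iterate, so the sweep is embarrassingly parallel while self-consistency lives entirely in the outer loop.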

In summary, this effort, if successful, will enable simulation of electronic transport in the presence of inelastic decoherence in realistic structures beyond the state of the art. Since such decoherence mechanisms are critical to the physics of all sorts of solar energy conversion and thermoelectric devices, the practical implications could be significant. From an implementation point of view, this work will (i) explore new physics and numerical algorithms, (ii) investigate whether OpenMP could enhance the performance of such simulations at the node level, and (iii) enhance the performance of an existing code developed with the support of a previous NISE award. Notably, this code currently scales to over 8,192 cores. We have already used this code to perform state-of-the-art simulations that generated multiple publications in 2010, including an issue cover story in Applied Physics Letters. If the OpenMP implementation proposed in this request is successful, we expect the scaling to improve by at least one order of magnitude.

Improving Scalability of an Eigensolver for the Kohn-Sham Electronic Structure Problem

Grady Schofield, University of Texas at Austin

Associated NERSC Project: Computational Study of Electronic and Structural Properties and Functionalization of Nanostructures (m433)
Principal Investigator: James Chelikowsky, University of Texas at Austin

NISE Award: 1,000,000 Hours
Award Date: March 2011

In the Kohn-Sham electronic structure problem, the electron density of a system of atoms is computed from the eigenvectors of the Kohn-Sham equation, where one eigenvector must be computed for each valence electron in the system. For large systems, this makes it necessary to form an approximation subspace composed of many vectors. The algorithm in our code, called PARSEC, that forms the approximation space creates what is essentially a type of Krylov subspace. We have noticed that even for moderately sized systems, beginning the iterative process with a group of vectors, as opposed to only a single vector, does not degrade the performance of the algorithm, in the sense that the final approximating subspace does not need to be substantially larger than the space created from a single starting vector. In fact, when the potentials in the Kohn-Sham equation induce nearly degenerate states, i.e., states whose eigenvalues agree to ten or so digits, as occur in systems with a high degree of symmetry (crystals, for instance), beginning the subspace creation process with multiple starting vectors improves the robustness of the algorithm in resolving the nearly degenerate subspaces to high accuracy.

In the solution of the eigenproblem, we are interested in a sufficiently large number of the eigenpairs with the smallest eigenvalues. To create the approximation subspace for these eigenvectors, we apply an operator that enhances the eigencomponents of a vector in the low end of the spectrum. This operator is a high degree polynomial in the Hamiltonian matrix, whose application to a vector is computationally intensive. Trying to decrease the time to solution by adding more processors to a given job is eventually hindered by the communication overhead associated with the matrix-vector multiplication routine. As long as system size is scaled up with the number of processor cores, PARSEC scales well, and decreasing the time to solution by increasing the number of processors works well. However, there are certain types of applications where the code must run for a very long period of time on a relatively small system, which leads to a small Hamiltonian. This is the case in molecular dynamics simulations, where many thousands of time steps must be taken sequentially, solving the eigenproblem at each step. Trying to decrease the time to solution by parallelizing the matrix-vector multiplication on such small Hamiltonian matrices is like trying to split up a single floating point operation over multiple processor cores, and communication overhead quickly dominates.

We propose to further develop the block subspace creation algorithm by putting the application of each filter operator on a separate set of processors, so that filtering multiple starting vectors takes the wall-clock time of filtering only one. This development would increase the scalability of the code in any chemical scenario with anywhere from one hundred to many thousands of valence electrons. Moreover, the applicability of this parallelization scheme in the smaller-system regime would provide a means to decrease time to solution in important computationally intensive scenarios that involve smaller Hamiltonians, such as long molecular dynamics runs and the computations used to bootstrap GW calculations for excited states, where a large number of eigenvectors associated with non-valence electrons need to be calculated but the systems are usually small.
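The filtering scheme described above, a high-degree polynomial in the Hamiltonian applied to a block of starting vectors, can be sketched with a Chebyshev filter, a standard choice for this kind of low-end-enhancing operator. This is a toy dense illustration, not PARSEC's implementation: for simplicity the filter bounds are taken from the exact spectrum (a production code estimates them cheaply), and all function names are ours.

```python
import numpy as np

def chebyshev_filter(H, X, degree, lb, ub):
    """Apply a degree-m Chebyshev polynomial of H to a block of vectors X.

    Eigencomponents with eigenvalues in [lb, ub] (the unwanted upper
    spectrum) are damped to |T_m| <= 1, while components below lb are
    amplified exponentially in the degree -- the "enhance the low end"
    operator described in the text. Filtering a block has the wall-clock
    depth of filtering one vector when columns run on separate groups.
    """
    e = (ub - lb) / 2.0
    c = (ub + lb) / 2.0
    Y = (H @ X - c * X) / e                    # T_1 of the shifted/scaled H
    for _ in range(2, degree + 1):
        Ynew = 2.0 * (H @ Y - c * Y) / e - X   # three-term recurrence
        X, Y = Y, Ynew
    return Y

def filtered_subspace_iteration(H, nev, degree=8, iters=20, seed=0):
    """Toy block filtered subspace iteration for the lowest nev eigenpairs."""
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    X = rng.standard_normal((n, nev))          # block of starting vectors
    evals = np.linalg.eigvalsh(H)              # toy: exact bounds for clarity
    lb, ub = evals[nev], evals[-1]             # damp everything in [lb, ub]
    for _ in range(iters):
        X = chebyshev_filter(H, X, degree, lb, ub)
        X, _ = np.linalg.qr(X)                 # re-orthonormalize the block
    Hs = X.T @ H @ X                           # Rayleigh-Ritz projection
    w, V = np.linalg.eigh(Hs)
    return w, X @ V
```

Starting from a block rather than a single vector changes nothing in the filter itself; it just replaces a sequence of filtered vectors with one filtered block, which is what makes the per-filter processor groups proposed above possible.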

PARSEC is a well developed, heavily parallelized electronic structure code with three layers of parallelization built into it. It does all of the computations for spin up/down, multiple k-points, and multiple mirror images of symmetric systems simultaneously across independent groups of processors. It parallelizes matrix-vector multiplication by distributing vectors across processors. It also breaks the spectrum into slices and computes each slice across independent groups of processors with windowing filter operators. Adding another layer of parallelization is a substantial development effort whose testing and debugging will require substantial computing resources given the nature of the feature we are proposing to add. PARSEC is freely available to the scientific community, and investment in development of the code will meet the NISE program's objectives by improving its scalability and making it easier for scientists to carry out computationally intensive calculations. This research would allow our electronic structure code to scale to a number of processors at least an order of magnitude larger than is currently possible, over a wide range of system sizes. The goal is to improve the scaling of PARSEC for MD simulations, where many thousands of time steps must currently be taken sequentially.

Phonon, thermodynamic and diffusion properties of lithium-ion batteries from first-principles calculations

ShunLi Shang, Pennsylvania State University

Associated NERSC Project: First-principles investigations of fundamental properties for Ni-base alloys (m679)

NISE Award: 500,000 Hours
Award Date: June 2011

The fundamental properties of thermodynamics and diffusion are crucial for the development of Li-ion batteries: e.g., (i) the voltage of a battery relates to the chemical potentials of the Li-ion in the cathode and anode; (ii) the electrode discharging-charging process is enabled by Li and electron diffusion into (and out of) the active cathode particles; and (iii) these properties are the basic inputs for other simulations, such as phase-field simulations. However, most of these fundamental properties are still lacking for Li-ion battery materials due to the daunting amount of experimental work required. In the present proposal, an integrated approach combining first-principles calculations and computational thermodynamics (i.e., CALPHAD modeling) will be used to predict the critically needed thermodynamic and diffusion properties of Li-ion battery materials in the Li-Mn-Fe-Co-Ni-O system, with the main focus being the high-energy layered LixNiyMn1-2yCoyO2.
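As context for point (i), the commonly used first-principles relation (our notation, not a formula from this proposal) expresses the average voltage between two lithium contents $x_1 < x_2$ through the Li chemical potentials, approximated by total energies of the kind computed here:

```latex
\bar V \;=\; -\frac{\mu_{\mathrm{Li}}^{\mathrm{cathode}} - \mu_{\mathrm{Li}}^{\mathrm{anode}}}{e}
\;\approx\; -\frac{E(\mathrm{Li}_{x_2}\mathrm{Host}) - E(\mathrm{Li}_{x_1}\mathrm{Host}) - (x_2 - x_1)\,E(\mathrm{Li})}{(x_2 - x_1)\,e}
```

Here $E(\cdot)$ are total energies per formula unit and $e$ is the elementary charge; finite-temperature corrections to these energies are exactly what the phonon calculations below supply.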

To obtain the thermodynamic and diffusion properties as a function of composition and temperature for Li-ion batteries by CALPHAD modeling, key fundamental inputs are needed; here they will come from first-principles phonon calculations, given the dearth of experimental data. The complex structures, the dipole-dipole interactions, and the partially disordered nature of the Li-ion battery materials pose a number of challenges for first-principles phonon calculations, but they are tractable, as shown in the next paragraph.

In the present proposal, first-principles phonon calculations will be used to predict the temperature-dependent thermodynamic and diffusion properties. Using our newly developed Yphon approach for phonons [J. Phys. Cond. Mat. 22, 2010, 202201], the dipole-dipole interactions can be treated accurately for polarized materials (e.g., the Li-ion oxides). Using first-principles transition state theory with the migration energy and phonon properties as inputs, diffusion coefficients, including both the diffusion barrier and the pre-factor, can be predicted, as first demonstrated by us for fcc Al [Phys. Rev. Lett. 100, 2008, 215901]. For the partially disordered phases (e.g., LixNiyMn1-2yCoyO2), the degree of randomness needs to be considered. Here, we propose to use the special quasirandom structures (SQS) approach as well as the partition function method developed recently by us [Phys. Rev. B 82, 2010, 014425].
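The transition-state-theory ingredients named above combine into an Arrhenius form, with the pre-factor built from the phonon frequencies. A minimal sketch, with all numbers purely illustrative (not results for any material):

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def diffusivity(prefactor_d0, barrier_ev, temp_k):
    """Arrhenius diffusion coefficient D = D0 * exp(-Em / kT).

    D0 (cm^2/s) bundles the jump distance and the effective attempt
    frequency obtained from phonon calculations; barrier_ev is the
    migration energy from the transition state search.
    """
    return prefactor_d0 * math.exp(-barrier_ev / (K_B * temp_k))

def vineyard_prefactor(freqs_initial, freqs_saddle):
    """Vineyard effective attempt frequency nu* = prod(nu_i^IS) / prod(nu_j^TS).

    freqs_initial has one more entry than freqs_saddle because the
    unstable mode at the saddle point is excluded.
    """
    return math.prod(freqs_initial) / math.prod(freqs_saddle)
```

This is the sense in which the phonon spectra of the initial and saddle-point configurations determine the pre-factor while the migration energy sets the barrier.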

When the project is completed successfully, both the computational approach (first-principles phonons plus CALPHAD) for complex Li-ion battery materials and the thermodynamic and diffusion databases for the Li-Mn-Fe-Co-Ni-O system will be delivered. These databases can be used, e.g., (i) to predict the phase stability of Li-ion battery materials under different process conditions; (ii) to find optimized routes for synthesizing Li-ion battery materials with good performance; and (iii) as input for phase-field simulations.

CACHE novel architecture research

Brian Van Straalen, Lawrence Berkeley National Lab

Associated NERSC Project: Algorithms and Software for Communication Avoidance and Communication Hiding at the Extreme Scale (m1270)
Principal Investigator: Erich Strohmaier, Lawrence Berkeley National Lab

NISE Award: 470,000 Hours
Award Date: June 2011

We plan to conduct experiments on scalability, parallelization strategies (such as OpenMP, MPI, GPU, and their hybrids), and general and architecture-specific code optimization techniques. Another benefit NERSC brings to this research is the ability to test the portability of autotuning techniques among HPC architectures such as Intel-based architectures and Cray Opteron-based architectures. We estimate 120,000 core hours for our experimental study: evaluation of the strengths and limitations of autotuning = 15,000 core hours; experimental analysis on multiple problem classes on different architectures = 30,000 core hours; empirical analysis for new approaches = 30,000 core hours; tests on parallel autotuning approaches = 25,000 core hours; studies on autotuning GPU codes = 20,000 hours.

Edgar Solomonik, Jim Demmel, and Brian Van Straalen are looking to explore a novel approach to topology-aware multicasts and multicore threaded reduction kernels to make dramatic reductions in communication contention for dense linear algebra and 2.5D algorithms on Cray systems. We've done similar work already on BG/P systems. This will take at least 10,000 node hours, but since usage is charged by the core hour, we need about 200,000 additional hours. Some of the preliminary work will be done on Franklin with the Portals interface, but the XE6 DMAPP interface will also be worked on. This addresses all three elements of the NISE call.

Sam Williams, Noel Keen, and Brian Van Straalen are working on fine-grained threading of stencil kernels, automatic code transformations, and communication hiding schemes on multicore architectures. All three were proposed separately, but it looks like we have to put all three together to make any of them work at the finest point-wise parallelism scales. This work will probably take a similar allocation to what was used under the ultrascale development and the APDEC SciDAC work, roughly 150,000 core hours. It involves hybrid programming, threading, code transformation, and novel approaches to multicore development, and is part of moving Chombo to exascale production computing.

High performance simulation of reactive transport processes at the pore scale

Brian Van Straalen, Lawrence Berkeley National Lab

Associated NERSC Project: APDEC: Applied Differential Equations Center (m1055)
Principal Investigator: Phillip Colella, Lawrence Berkeley National Lab

NISE Award: 1,200,000 Hours
Award Date: June 2011

We have combined high performance CFD capabilities in Chombo--incompressible flow plus advection-diffusion in complex geometry--with the complex geochemistry package in CrunchFlow to model reactive transport processes in resolved pore space. This work is being funded by a recent ARRA supplement to the SciDAC project, APDEC, upon which our current allocation, MP111, is based. In this new work we are partnering with the Energy Frontier Research Center for Nanoscale Control of Geologic Carbon at LBL to model emergent processes in geologic subsurface porous media where the pore space is resolved. The goal is to achieve mesh spacing resolved down to 1 micron, which is the resolution of the image data obtained from validating experiments. For a 1 cm long capillary tube tightly packed with spheres and an aspect ratio of approximately 18.5:1 we have been able to achieve 1024x64x64, which yields just under 10 micron resolution. These runs have been performed on a 96 core cluster. Each further doubling of resolution increases our problem size 8-fold, so attaining 1 micron resolution requires a roughly 500-fold increase over the current runs. As intermediate targets we will attempt 2048x128x128 with 1024 processors on Franklin, and 4096x256x256 with 8192 processors on Franklin. Ultimately, we will attempt to run the finest resolution of 8192x512x512 with over 65,000 processors on Hopper. In each of these runs we are seeking steady-state solutions, first by running the CFD velocity code to steady-state and then advancing the transport to steady-state while holding the velocity fixed. This is permissible since we are not yet changing the geometry of the pore space as a result of dissolution or precipitation.
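The stated targets can be sanity-checked with a little arithmetic, assuming a uniform mesh along the 1 cm tube axis:

```python
def resolution_microns(length_cm, cells):
    """Axial grid spacing for a uniform mesh (1 cm = 10^4 microns)."""
    return length_cm * 1e4 / cells

# Axial cell counts for the runs mentioned in the text.
# 1024 cells give just under 10 microns; 8192 give about 1.2 microns.
# Each doubling of resolution multiplies the 3D cell count by 2^3 = 8.
targets = {cells: resolution_microns(1.0, cells)
           for cells in (1024, 2048, 4096, 8192)}
```

Three doublings separate the current 1024x64x64 runs from the 8192x512x512 goal, hence the roughly 500-fold (8^3 = 512) growth in cell count.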

Also, we have found that the transport code completes 3x more timesteps per minute than the velocity code. The velocity can take on the order of 1000 timesteps to achieve steady-state; the transport usually takes another 1000-2000 steps depending on the process--whether dissolution or precipitation--and the rate constants. Furthermore, there are algorithmic issues that will result from higher resolution of these tightly packed complex geometries that we will have to work out during the scaling process for our elliptic solvers.

This work is important to the modeling needs of the Center for Nanoscale Control of Geologic Carbon, one of the DOE Energy Frontier Research Centers, which is focused on modeling the fate of supercritical CO2 injected into the earth as it pertains to carbon sequestration. The ability to model reactive transport processes at the pore scale will allow for better parameterizations of field-scale models, which can in turn predict the fate of CO2 in the subsurface over the long periods of time relevant to carbon sequestration.

First-Principles Calculations of Multiferroic Oxides

Yi Wang, Pennsylvania State University

Associated NERSC Project: Spin-lattice Coupling in Magnetic Phase Transition (m891)

NISE Award: 1,250,000 Hours
Award Date: March 2011

As both the magnetization and the polarization can be manipulated by electric fields, magnetic fields, or elastic deformations, multiferroics are of importance for many applications, such as data storage, spintronics, and microelectronic devices. However, despite many published experimental and theoretical works, a complete theoretical framework for multiferroic materials is still lacking.

The main objective of the proposed research is to explore the fundamental nature of multiferroic materials using a first-principles approach. Multiferroics are materials that simultaneously exhibit more than one primary ferroic order parameter in a single phase: typically ferromagnetism, ferroelectricity, and ferroelasticity. The factors that we plan to consider in our formulation of the free energy are:

  1. The continuous phase transition among multiple dipolar states. In this connection, the magnetocaloric effect is an analogous, but better-known and understood, phenomenon. We can extend our recent works on the thermodynamics of magnetic systems.
  2. The effects of the long range dipole–dipole interactions on the vibrational contribution to the free energy. As we will use the direct approach to calculate the phonon properties, we need to accurately handle the long range dipole–dipole interactions for the general purpose of calculating phonon density-of-states.
  3. The effects of an external field on the free energy.
  4. The possible imaginary phonon issue for certain dipolar states. We plan to solve the problem using our recently developed procedure which can calculate the phonon frequencies of a state with high symmetry, using the force constants calculated from a structure with relatively lower symmetry.
  5. Thin film materials. The strain effects will be considered by replacing the volume with the lattice vectors in the expression of the free energy formulated in the recent work.

High resolution integration of CAM5 for extreme weather research

Michael Wehner, Lawrence Berkeley National Lab

Associated NERSC Project: Program for Climate Model Diagnosis and Intercomparison (mp193)
Principal Investigator: James Boyle, Lawrence Livermore National Lab

NISE Award: 5,600,000 Hours
Award Date: June 2011

We propose to perform an AMIP simulation of the latest version of the NCAR Community Atmosphere Model (CAM5.1) at very high resolution. We have recently completed a 1-year simulation of this model on Hopper at a horizontal resolution of approximately 25 km. Our plans for this version of the code are extensive, and later proposals for computer resources will be much larger. With this request, we want to perform a simulation of the recent past (1979-2009). Following the AMIP protocol, the sea surface temperature and sea ice concentration are specified from observations as lower boundary conditions to the atmosphere. The purpose of this simulation will be to characterize the model's performance in simulating the recent past prior to making projections about future climate change. We will diagnose the model against available atmospheric observations. These will include but are not limited to surface air temperature, precipitation, mean sea level pressure, cloudiness and radiative balance. Our principal interest in high resolution models is the simulation of extreme weather events, especially precipitation, and how they may change as the climate warms. Our previous work has focused on heat waves, droughts, floods and, most immediate to this project, tropical cyclones and hurricanes. This simulation will be benchmarked against observational measures of all these events. In addition, we have new capabilities in CRD to analyze atmospheric rivers. Sometimes called the "Pineapple Express", these huge excursions of tropical moisture into the extratropics are the source of major warm winter storms on the US west coast. Our preliminary analysis reveals a surprising amount of realism in the simulation of these events, and the model will enable much greater understanding of how these patterns form than do the limited observations.

There is great internal and external interest in this simulation. Although our results are only one week old, we already have requests for data from LBNL's Earth Systems Division, NCAR, LLNL, and Princeton. We view the model output data as a public resource and will share it freely. We anticipate distributing the data to the public as part of CMIP5 and the Earth System Grid (hosted at NERSC). It is hard to estimate the number of papers that will come out of this simulation (and the later future-projection simulations), but there will be many. We are confident that these simulations will be of high impact and policy relevant.

This simulation is a necessary step in increasing confidence in the projection of future changes in extreme weather. We will gain insight into hurricanes, severe storms and heat waves. The high fidelity of the simulation will aid policymakers in decisions on a much more local scale than is possible from previous climate model simulations.

Modeling and Simulation of High Dimensional Stochastic Multiscale PDE Systems at the Exascale

Nicholas Zabaras, Cornell University

Associated NERSC Project: Modeling and Simulation of High Dimensional Stochastic Multiscale PDE Systems at the Exascale (m1182)

NISE Award: 1,500,000 Hours
Award Date: June 2011

Predictive modeling of multiscale and multiphysics systems requires accurate data-driven characterization of the input uncertainties and an understanding of how they propagate across scales and alter the final solution. We address three major limitations in modeling stochastic systems: (1) integrating multiscale models with stochastic analysis poses unique mathematical and computational challenges, (2) the dimensionality of the stochastic space is very high, and (3) the stochastic inputs are rarely derived from experimental data and the underlying physics.

The integration of the multiscale and stochastic nature of the problems of interest is addressed through the development of stochastic coarse-graining methods. The curse of dimensionality in the solution of the high-dimensional stochastic multiscale PDE systems under consideration is addressed through the construction of low-complexity surrogate models. Input uncertainties are modeled by data-driven, multiscale, adaptive nonlinear low-order models. We are at the dawn of a new stochastic simulation era, and computationally tractable low-complexity surrogate models will play a significant role in harnessing and controlling complex stochastic systems. Our integrated methodology draws on manifold learning, high-dimensional model representations (HDMR) in stochastic space, sparse-grid stochastic collocation, spectral and hybrid approaches, stochastic multiscale mathematics (stochastic homogenization theory), nonlinear model reduction of SPDEs, scalable multigrid methods, Bayesian multiscale estimation, and scalable parallel algorithms. The methodology will be based on a scalable multilevel parallelism approach (incorporating both task and data parallelism) and will be implemented using the Global Arrays (GA) programming model.
We will exploit the GA hierarchical processor-group concept to extend scalability to larger numbers of processors on current petascale and emerging exascale architectures, which makes collocation-based approaches tractable. The proposed computational framework can be applied to many areas, as uncertainty quantification and model reduction cut across all disciplines in the physical and biological sciences, from climate modeling to systems biology.
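To illustrate the collocation idea underlying the methods named above, the sketch below estimates the mean of a function of a Gaussian random variable by evaluating it only at Gauss-Hermite quadrature nodes, instead of by random sampling. This is a minimal one-dimensional illustration only; the proposal's sparse-grid and HDMR machinery exists precisely to extend this idea to high-dimensional stochastic spaces, and the function name `collocation_mean` is our own.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def collocation_mean(f, deg=8):
    """Estimate E[f(X)] for X ~ N(0, 1) by stochastic collocation.

    The probabilists' Gauss-Hermite rule integrates against the weight
    exp(-x**2 / 2); dividing by sqrt(2*pi) normalizes that weight to
    the standard normal density, so the weighted sum approximates the
    expectation using only deg deterministic model evaluations.
    """
    nodes, weights = hermegauss(deg)
    return np.dot(weights, f(nodes)) / np.sqrt(2.0 * np.pi)

# E[X^2] = 1 for a standard normal; the rule recovers this exactly
# because x**2 is a low-degree polynomial.
mean = collocation_mean(lambda x: x**2)
```

The appeal for expensive PDE models is that each node is an independent deterministic solve, which is what makes the approach embarrassingly parallel across processor groups.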

We propose to demonstrate the new approach in two important classes of problems: transport processes in random geological media and predicting the response of materials with random microstructures. In particular, two critical applications of importance to the mission of the DOE will be considered: (a) CO2 geological sequestration, with emphasis on mineral dissolution/precipitation and pore clogging under CO2 injection, and (b) predictive modeling of void/gas-bubble evolution, volume swelling, and performance of irradiated materials of interest for next-generation nuclear reactor design.

Theoretical Investigation on the Properties of Energy Materials

Qianfan Zhang, Stanford University

Associated NERSC Project: Searching and designing new materials for topological insulators (m1131), Principal Investigator: Shoucheng Zhang

NISE Award: 400,000 Hours
Award Date: March 2011

The search for novel energy materials has become an important goal in materials science. Recently, silicon nanowires (SiNWs), as well as other Si nanostructures (including crystalline-amorphous core−shell SiNWs, carbon-amorphous Si core−shell NWs, Si nanotubes, porous Si particles, and an ordered macroporous carbon-Si composite), have been demonstrated as ultrahigh-capacity negative electrodes for lithium-ion batteries; this opens up exciting opportunities for energy storage devices. However, experimental investigations of these Si nanostructures have mainly focused on electrochemical cycling of Si−Li compounds and phase transitions during Li insertion. Although the Si−Li compound can be accurately analyzed, the fundamental interaction between Li and Si atoms and the microscopic dynamics of Li insertion remain unknown. Building on our earlier work, we will continue the theoretical study of Li insertion in different kinds of Si nanostructures, including SiNWs, Si nano-shell… Our study will be important in revealing the nature of quantum confinement effects and the fundamental mechanisms of Li insertion and diffusion, and at the same time will suggest promising routes for future electrochemical applications.

We propose a program to theoretically investigate the properties of energy materials, such as Si nanowires, Si nano-particles, and Si nano-shells, using density functional theory (DFT) calculations, by designing new theoretical models, and by working closely with experimental colleagues to measure their novel properties.

Study Li insertion and diffusion in different kinds of Si nano-structures

We have already carried out a systematic study, within the DFT framework, of single Li insertion in SiNWs with different diameters and axis orientations. This is the first step toward further computational simulation of Li insertion in SiNWs and other Si nanostructures. We will continue this work and investigate the fundamental mechanism of the Li-Si interaction in SiNWs.

Experimental investigations have paid little attention to Li insertion behavior in SiNWs of different orientations and sizes, especially when the Li doping ratio is very low and the SiNW is ultrathin. Although some theoretical work has addressed the Li-SiNW interaction, the effect of anisotropy on the interaction of Li with SiNWs of different orientations remains unclear, and our previous work also lacked a detailed discussion of this aspect. We will systematically study the relationship between Li insertion behavior and SiNW orientation, performing first-principles simulations of [110], [100], [111], and [112] SiNWs with a single Li impurity and calculating the electronic structure in order to understand the impact of quantum confinement and the reasons Li insertion behaves differently in differently oriented SiNWs.

Other Si nanostructures, such as Si nanotubes and Si nano-shells, are also good candidates for Li-battery anodes. We will study Li insertion and diffusion in various Si nanostructures using a method similar to that of the SiNW study, calculating the Li binding energy and diffusion barrier in each material to reveal how Li insertion behavior differs between nanostructures. Another important aspect is the size effect of the nanostructure: as the size decreases, surface Si atoms make up a larger fraction of the structure and quantum confinement becomes significant. We therefore expect Li insertion and diffusion behavior to be quite different in small structures.
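The two quantities named above reduce to simple differences of DFT total energies. The sketch below shows that bookkeeping under one common convention; the function names and all energy values are hypothetical placeholders for illustration, not results from this project.

```python
def li_binding_energy(e_host_li, e_host, e_li_atom):
    """Binding energy (eV) of one inserted Li atom.

    Convention assumed here: E_b = E(host + Li) - E(host) - E(Li atom),
    so a more negative value means more favorable insertion.
    """
    return e_host_li - e_host - e_li_atom

def diffusion_barrier(path_energies):
    """Migration barrier (eV) along a diffusion path: the highest
    energy along the path measured from the initial configuration."""
    return max(path_energies) - path_energies[0]

# Hypothetical DFT total energies (eV), for illustration only.
e_b = li_binding_energy(-1234.70, -1232.50, -0.20)           # -2.00 eV
barrier = diffusion_barrier([0.00, 0.35, 0.58, 0.30, 0.05])  # 0.58 eV
```

In practice the path energies would come from a transition-state search (for example, a nudged-elastic-band calculation) between neighboring Li insertion sites, and sign conventions for the binding energy vary between papers, so the convention must be stated alongside any reported number.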

Strain effect and doping effect in various Si nano-structures

One of the most important anisotropic effects in a confined system is its response to external strain. Imposing strain is a common way to change the properties of a material and improve device performance. Strain studies of SiNWs, both experimental and theoretical, have shown remarkable changes in SiNW properties, such as greatly enhanced carrier mobility and significantly modified band structure, especially when the diameter is small. Recently, a theoretical study indicated that strained SiNWs could open a new avenue for solar-cell applications. It can therefore be expected that strain will also affect Li insertion in SiNWs for energy-storage applications. Furthermore, strain exists naturally in real SiNWs and is almost unavoidable during the growth process, so theoretical simulation of Li insertion in strained SiNWs will also aid experimental understanding. Our ab initio simulations will accordingly focus on Li insertion and diffusion in SiNWs under strain. Strain can be applied along either the axial or the radial direction, and its influence is expected to differ between the two; moreover, differently oriented SiNWs respond differently to strain. We will study these effects in detail.

Strain effects in other Si nanostructures are also interesting. The impact of strain is expected to depend on the size and configuration of these low-dimensional systems. We will perform ab initio simulations of nano-shells, nanotubes, and other nano-particles to determine which structure is best suited to strain-based applications.

N or P doping is another important way to modify the properties of semiconductor materials. Some previous theoretical work has studied N or P doping in SiNWs and its effect on the electronic band structure. We propose to study Li insertion in N- or P-doped SiNWs to determine whether such doping affects Li behavior. Our plan is to substitute one Si atom in the SiNW with a dopant atom (B, N, or P) and examine the Li insertion and diffusion behavior near the dopant.
