
Publications & Presentations - Archive

Prabhat

2016

Evan Racah, Seyoon Ko, Peter Sadowski, Wahid Bhimji, Craig Tull, Sang-Yun Oh, Pierre Baldi, Prabhat, "Revealing Fundamental Physics from the Daya Bay Neutrino Experiment using Deep Neural Networks", ICMLA, 2016,

Debbie Bard, Wahid Bhimji, David Paul, Glenn K Lockwood, Nicholas J Wright, Katie Antypas, Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, et al., "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016, LBNL-1007120,

Michael Ringenburg, Shuxia Zhang, Kristyn Maschhoff, Bill Sparks, Evan Racah, Prabhat, "Characterizing the Performance of Analytics Workloads on the Cray XC40", Cray User Group, May 13, 2016,

Jialin Liu, Evan Racah, Quincey Koziol, Richard Shane Canon, Alex Gittens, Lisa Gerhardt, Suren Byna, Mike F. Ringenburg, Prabhat, "H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems", Cray User Group, May 13, 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, et al., "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

Mostofa Patwary, Nadathur Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey, "PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures", IPDPS 2016, April 5, 2016,

Jesse Livezey, Gopala Anumanchipalli, Brian Cheung, Prabhat, Michael DeWeese, Edward Chang, Kristofer Bouchard, "Deep networks reveal the structure of motor control in sensorimotor cortex during speech production", CoSyne 2016, March 1, 2016,

Alex Gittens, Jey Kottalam, Jiyan Yang, Michael F Ringenburg, Jatin Chhugani, Evan Racah, Mohitdeep Singh, Yushu Yao, Curt Fischer, Oliver Ruebel, Benjamin Bowen, Norman Lewis, Michael W Mahoney, Venkat Krishnamurthy, Prabhat, "A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark", The 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, IPDPS, February 1, 2016,

Alex Gittens, Nick Cavanaugh, Karthik Kashinath, Travis O’Brien, Prabhat, Michael Mahoney, "Large-scale Parallelized EOF Computation on the CFSR Ocean Temperature Field", American Meteorological Society 2015, January 12, 2016,

2015

Andy Miller, Albert Wu, Jeff Regier, Ryan Adams, Jon McAuliffe, Dustin Lang, David Schlegel, Prabhat, "A stochastic process model of quasar spectral energy distribution", NIPS 2015, December 15, 2015,

Jesse Livezey, Gopala Anumanchipalli, Brian Cheung, Prabhat, Fritz Sommer, Michael DeWeese, Kris Bouchard, Edward Chang, "Classifying spoken syllables from human sensorimotor cortex with deep networks", NIPS 2015 Workshop, December 15, 2015,

Jeff Regier, Jon McAuliffe, Prabhat, "A deep generative model for astronomical images of galaxies", NIPS 2015 Workshop, December 15, 2015,

Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md. Mostofa Ali Patwary, Prabhat, Ryan Adams, "Scalable Bayesian Optimization using Deep Neural Networks", Deep Learning Symposium at NIPS 2015, December 15, 2015,

Michael Wehner, Kevin Reed, Prabhat, Daithi Stone, "Inconsistencies in future changes in tropical cyclone statistics between CMIP5-class models and state of the art high resolution atmospheric models", American Geophysical Union Meeting 2015, December 9, 2015,

Prabhat, Daithi Stone, Xiaolan Wang, Michael Wehner, William D. Collins, "Detection and Attribution of Extra-Tropical Cyclone activity in CMIP-5", American Geophysical Union Meeting 2015, December 8, 2015,

Prabhat, Yunjie Liu, Evan Racah, Joaquin Correa, Amir Khosrowshahi, David Lavers, Kenneth Kunkel, Michael Wehner, William D. Collins, "Deep Learning for Climate Pattern Detection", American Geophysical Union Meeting 2015, December 8, 2015,

Yunjie Liu, Karthik Kashinath, Prabhat, Travis O’Brien, "Systematic Characterization of Cyclogenesis in High Resolution Climate Model Simulations", American Geophysical Union Meeting 2015, December 8, 2015,

"Deep Learning for Science", Prabhat, Kris Bouchard, Wahid Bhimji, Evan Racah, NERSC Science Highlight, December 8, 2015,

Prabhat, BD-CATS: Big Data Clustering at Trillion Particle Scale, Intel HPC Developers Conference, November 15, 2015,

"Big Science Problems, Big Data Solutions", Prabhat, O'Reilly Media article, November 10, 2015,

Prabhat, Top 10 Problems in Scientific Big Data Analytics, Invited talk at AMPLab, UC Berkeley, November 10, 2015,

"Tackling a Trillion", Prabhat, Suren Byna, Mostofa Patwary, ASCR Discovery Research Highlight, November 10, 2015,

Mostofa Patwary, Suren Byna, Nadathur Satish, Narayanan Sundaram, Zarija Lukic, Vadim Roytershteyn, Yushu Yao, Prabhat, Pradeep Dubey, "BD-CATS: Big Data Clustering at Trillion Particle Scale", SC 2015, November 3, 2015,

Shane Snyder, Phil Carns, Rob Latham, Rob Ross, Misbah Mubarak, Christopher Carothers, Babak Behzad, Huong Luu, Suren Byna, Prabhat, "Techniques for Modeling Large Scale HPC I/O workloads", SC PMBS workshop 2015, November 3, 2015,

Babak Behzad, Suren Byna, Prabhat, Marc Snir, "Pattern Driven Parallel I/O Tuning", SC 2015 PDSW Workshop, November 3, 2015,

Sean Mackesey, Prabhat, Gyorgy Buzsaki, Amir Khosrowshahi, Fritz Sommer, "A high performance computing web service for local field potential analysis", SfN Neuroscience 2015, November 3, 2015,

Soyoung Jeon, Prabhat, Suren Byna, Bill Collins, Michael Wehner, "Characterization of Extreme Precipitation within Atmospheric River Events over California", Advances in Statistical Climatology, Meteorology and Oceanography, November 1, 2015,

Prabhat, Scientific Big Data: Challenges and Opportunities, Invited Talk at IIT Mumbai, October 20, 2015,

Prabhat, Scientific Big Data: Challenges and Opportunities, Invited Talk at UC Berkeley, Department of Statistics, October 15, 2015,

Prabhat, Scientific Big Data: Challenges and Opportunities, Invited Talk at Oxford University, October 10, 2015,

Kris Bouchard, Jesse Livezey, Alex Bujan, Prabhat, Sharmodeep Bhattacharyya, Fritz Sommer, Peter Denes, Eddie Chang, "Scalable Analytic Methods for Data Driven Discovery in Neuroscience", MSRI workshop on Neural Computation, October 8, 2015,

Prabhat, Kris Bouchard, Annette Greiner, Oliver Ruebel, Peter Denes, Alex Bujan, Sean Mackesey, Jesse Livezey, Jeff Teeters, Fritz Sommer, Eddie Chang, "Supporting Experimental Neuroscience @ NERSC", MSRI workshop on Neural Computation, October 7, 2015,

Richard Grotjahn, Robert Black, Ruby Leung, Michael F. Wehner, Mathew Barlow, Mike Bosilovich, Sasha Gershunov, William Gutowski, John Gyakum, Richard W. Katz, Arun Kumar, Yun-Young Lee, Young-Kwon Lim, Christopher J. Paciorek, Prabhat, "North American Extreme Temperature Events and Related Large Scale Meteorological Patterns: Statistical Tools, Dynamics, Modeling, and Trends", Journal of Climate 2015, October 1, 2015,

Jiyan Yang, Jey Kottalam, Mohit Singh, Oliver Ruebel, Curt Fischer, Ben Bowen, Michael Mahoney, Prabhat, "Implementing Randomized Matrix Algorithms on Spark", XLDB 2015, October 1, 2015,

Babak Behzad, Suren Byna, Stefan Wild, Prabhat, Marc Snir, "Dynamic Model-driven Parallel I/O Performance Tuning", IEEE CLUSTER 2015, October 1, 2015,

"Celeste: A New Model for Cataloging the Universe", Prabhat, Jeff Regier, Jon McAuliffe, Ryan Adams, David Schlegel, NERSC Science Highlight, September 9, 2015,

Prabhat, Surendra Byna, Venkatram Vishwanath, Eli Dart, Michael Wehner, William D. Collins, "TECA: Petascale Pattern Recognition for Climate Science", CAIP 2015, August 25, 2015,

Babak Behzad, Suren Byna, Stefan Wild, Prabhat and Marc Snir, "Dynamic Model-driven Parallel I/O Performance Tuning", IEEE Cluster 2015, August 1, 2015,

"‘Data Deluge’ Pushes Mass Spec Imaging to New Heights", Prabhat, Ben Bowen, Oliver Ruebel, Michael Mahoney, NERSC Science Highlight, July 15, 2015,

Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md. Mostofa Ali Patwary, Prabhat, Ryan Adams, "Scalable Bayesian Optimization using Deep Neural Networks", ICML 2015, July 7, 2015,

Huong Luu, Marianne Winslett, William Gropp, Kevin Harms, Phil Carns, Robert Ross, Yushu Yao, Suren Byna, Prabhat, "A Multi-platform Study of I/O Behavior on Petascale Supercomputers", HPDC 2015, June 9, 2015,

Michael Wehner, Prabhat, Kevin A. Reed, Daithi Stone, William D. Collins, Julio Bacmeister, "Resolution dependence of future tropical cyclone projections of CAM5.1 in the US CLIVAR Hurricane Working Group idealized configurations", J. Climate special issue on CLIVAR 2015, May 5, 2015,

Prabhat, Scientific Big Data: Challenges and Opportunities, Invited Talk at Brown University, April 28, 2015,

Jiyan Yang, Oliver Ruebel, Prabhat, Michael Mahoney, Ben Bowen, "Identifying important ions and positions in mass spectroscopy imaging data using CUR matrix decompositions", Analytical Chemistry, March 31, 2015,

"BigNeuron", Prabhat, Kris Bouchard, Shreyas Cholia, Annette Greiner, NERSC Science Highlight, March 31, 2015,

Prabhat, Data and Analytics Strategy, February 23, 2015,

Chris Paciorek, Benjamin Lipschitz, Tina Zhuo, Cari Kaufman, Prabhat, Rollin Thomas, Parallelized Gaussian Processes in R, Joint Statistical Meeting 2014, January 21, 2015,

Prabhat, Deep Learning for Big Data, DOE ASCR Workshop on Machine Learning, January 9, 2015,

Michael Berry, Tom Potok, Prasanna Balaprakash, Hank Hoffman, Raju Vatsavi, Prabhat, "Machine Learning and Understanding for Intelligent Extreme Scale Scientific Computing and Discovery", DOE ASCR Workshop Report, January 7, 2015,

2014

"A Standard for Neuroscience Data", Oliver Ruebel, Kris Bouchard, Prabhat, NERSC Science Highlight, December 16, 2014,

Jeffrey Regier, Brenton Partridge, Jon McAuliffe, Ryan Adams, Matt Hoffman, Dustin Lang, David Schlegel, Prabhat, "Celeste: Scalable variational inference for a generative model of astronomical images", NIPS 2014 Workshop on Advances in Variational Inference, December 9, 2014,

Soyoung Jeon, Chris Paciorek, Prabhat, Suren Byna, William Collins, Michael Wehner, "Uncertainty Quantification for characterizing spatial tail dependence under statistical framework", American Geophysical Union Meeting 2014, December 9, 2014,

Michael Wehner, Prabhat, Fuyu Li, William Collins, "The effect of horizontal resolution on the simulation of precipitation extremes in the Community Atmospheric Model version 5.1", American Geophysical Union Meeting 2014, December 9, 2014,

Michael F. Wehner, Kevin Reed, Fuyu Li, Prabhat, Julio Bacmeister, Cheng-Ta Chen, Surendra Byna, Chris Paciorek, Peter Gleckler, Ken Sperber, William D. Collins, Andrew Gettelman, Kesheng Wu, Christiane Jablonowski, Chris Algieri, "The effect of horizontal resolution on AMIP simulation quality in the Community Atmospheric Model, CAM5.1", JAMES, December 9, 2014,

Quincey Koziol, Ruth Aydt, Russ Rew, Mark Howison, Mark Miller, Prabhat, "HDF5", Book Chapter in Prabhat, Quincey Koziol, editors, High Performance Parallel I/O, CRC Press, ( October 23, 2014)

Suren Byna, Prabhat, Homa Karimabadi, Bill Daughton, "Parallel I/O for a Trillion Particle Plasma Physics Simulation", Book Chapter in Prabhat, Quincey Koziol, editors, High Performance Parallel I/O, CRC Press/Taylor & Francis Group, 2014, ( October 23, 2014)

Mark Howison, Suren Byna, Prabhat, "Iota", Book Chapter in Prabhat, Quincey Koziol, editors, High Performance Parallel I/O, CRC Press/Taylor & Francis Group, 2014, ( October 23, 2014)

M. Scot Breitenfeld, Kalyana Chadalavada, Robert Sisneros, Suren Byna, Quincey Koziol, Neil Fortner, Prabhat, Venkat Vishwanath, "Tuning Performance of Large scale I/O with Parallel HDF5", SC’14 PDSW Workshop, October 15, 2014,

Rusen Oktem, Prabhat, James Lee, Aaron Thomas, Paquita Zuidema, David Romps, "Stereo Photogrammetry of oceanic clouds", Journal of Atmospheric and Oceanic Technology, July 24, 2014,

2013

Babak Behzad, Huong Luu, Joey Huchette, Suren Byna, Prabhat, Ruth Aydt, Quincey Koziol, Marc Snir, "Taming Parallel I/O Complexity with Auto-Tuning", SuperComputing 2013, October 9, 2013,

Chris Paciorek, Benjamin Lipschitz, Tina Zhuo, Cari Kaufman, Prabhat, Rollin Thomas, "Parallelized Gaussian Processes in R", Journal of Statistical Software, May 23, 2013,

E. Wes Bethel, Prabhat, Suren Byna, Oliver Rübel, K. John Wu, Michael Wehner, "Why High Performance Visual Data Analytics is Both Relevant and Difficult", Visualization and Data Analysis, IS&T/SPIE Electronic Imaging 2013, San Francisco, CA, USA, 2013,

Babak Behzad, Joseph Huchette, Huong Luu, Suren Byna, Yushu Yao, Prabhat, "A Framework for Auto-tuning HDF5 Applications", HPDC, 2013,

Suren Byna, Andrew Uselton, Prabhat, David Knaak, Helen He, "Trillion Particles, 120,000 cores, and 350 TBs: Lessons Learned from a Hero I/O Run on Hopper", Cray User Group Meeting, Best Paper Award., 2013,

Wei-Chen Chen, George Ostrouchov, Dave Pugmire, Prabhat, Michael Wehner, "Exploring multivariate relationships in Large Spatial Data with Parallel Model-Based Clustering and Scalable Graphics", Technometrics, 2013,

Daithi Stone, Chris Paciorek, Prabhat, Pardeep Pall, Michael Wehner, "Inferring the anthropogenic contribution to local temperature extremes", PNAS, 2013, 110 (7),

Dean N. Williams, Timo Bremer, Charles Doutriaux, John Patchett, Galen Shipman, Blake Haugen, Ross Miller, Brian Smith, Chad Steed, E. Wes Bethel, Hank Childs, Harinarayan Krishnan, Prabhat, Michael Wehner, Claudio T. Silva, Emanuele Santos, David Koop, Tommy Ellqvist, Huy T. Vo, Jorge Poco, Berk Geveci, Aashish Chaudhary, Andrew Bauer, Alexander Pletzer, Dave Kindig, Gerald L. Potter, Thomas P. Maxwell, "The Ultra-scale Visualization Climate Data Analysis Tools: Data Analysis and Visualization for Geoscience Data", IEEE Special Issue: Cutting-Edge Research in Visualization, 2013,

Prabhat, William D. Collins, Michael Wehner, "Extreme-Scale Climate Analytics", Google Regional PhD Summit, 2013,

Prabhat, William D. Collins, Michael Wehner, Suren Byna, Chris Paciorek, "Big Data Challenges in Climate Science", Berkeley Atmospheric Science Symposium, 2013,

Prabhat, Pattern Detection for Large Climate Datasets, Climate 2013, 2013,

Prabhat, Data Formats, Data Models and Parallel I/O, CRD Division Review, 2013,

2012

E. Wes Bethel, David Camp, Hank Childs, Mark Howison, Hari Krishnan, Burlen Loring, Joerg Meyer, Prabhat, Oliver Ruebel, Daniela Ushizima, Gunther Weber, "Towards Exascale: High Performance Visualization and Analytics – Project Status Report. Technical Report", DOE Exascale Research Conference, April 1, 2012, LBNL 5767E,

Erin LeDell, Prabhat, Dmitry Yu Zubarev, Brian Austin, William A. Lester, Jr., "Classification of Nodal Pockets in Many-Electron Wave Functions via Machine Learning", Journal of Mathematical Chemistry, January 1, 2012, 50:2043,

Allen R. Sanderson, Brad Whitlock, Oliver Rübel, Hank Childs, Gunther H. Weber, Prabhat, Kesheng Wu, "A System for Query Based Analysis and Visualization", Third International EuroVis Workshop on Visual Analytics EuroVA 2012, Vienna, Austria, 2012, 25-29, LBNL 5507E, doi: 10.2312/PE/EuroVAST/EuroVA12/025-029

Hank Childs, David Pugmire, Sean Ahern, Brad Whitlock, Mark Howison, Prabhat, Gunther H. Weber, E. Wes Bethel, "Visualization at Extreme-Scale Concurrency", High Performance Visualization: Enabling Extreme-Scale Scientific Insight, (CRC Press: 2012) Pages: 291-306

Surendra Byna, Jerry Chou, Oliver Rübel, Prabhat, Homa Karimabadi, William S. Daughton, Vadim Roytershteyn, E. Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, Arie Shoshani, Andrew Uselton, Kesheng Wu, "Parallel I/O, analysis, and visualization of a trillion particle simulation", SC 12, Los Alamitos, CA, USA, IEEE Computer Society Press, 2012, 59:1-59:1,

E. Wes Bethel, Surendra Byna, Jerry Chou, Estelle Cormier-Michel, Cameron G. R. Geddes, Mark Howison, Fuyu Li, Prabhat, Ji Qiang, Oliver Rübel, Robert D. Ryne, Michael Wehner, Kesheng Wu, "Big Data Analysis and Visualization: What Do LINACS and Tropical Storms Have In Common?", 11th International Computational Accelerator Physics Conference, ICAP 2012, Rostock-Warnemünde, Germany, 2012,

Babak Behzad, Joseph Huchette, Huong Luu, Suren Byna, Yushu Yao, Prabhat, "Auto-Tuning of Parallel I/O Parameters for HDF5 Applications", SC, 2012,

Surendra Byna, Jerry Chou, Oliver Rübel, Prabhat, Homa Karimabadi, William S. Daughton, Vadim Roytershteyn, E. Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, Arie Shoshani, Andrew Uselton, Kesheng Wu, "Parallel Data, Analysis, and Visualization of a Trillion Particles", XLDB, 2012,

E. Wes Bethel, Rob Ross, Wei-Keng Liao, Prabhat, Karen Schuchardt, Peer-Timo Bremer, Oliver Rübel, Surendra Byna, Kesheng Wu, Fuyu Li, Michael Wehner, John Patchett, Han-Wei Shen, David Pugmire, Dean Williams, "Recent Advances in Visual Data Exploration and Analysis of Climate Data", SciDAC 3 Principal Investigator Meeting, 2012,

Oliver Ruebel, Cameron Geddes, Min Chen, Estelle Cormier, Ji Qiang, Rob Ryne, Jean-Luc Vay, David Grote, Jerry Chou, Kesheng Wu, Mark Howison, Prabhat, Brian Austin, Arie Shoshani, E. Wes Bethel, "Scalable Data Management, Analysis and Visualization of Particle Accelerator Simulation Data", SciDAC 3 Principal Investigator Meeting, 2012,

Prabhat, Suren Byna, Kesheng Wu, Jerry Chou, Mark Howison, Joey Huchette, Wes Bethel, Quincey Koziol, Mohammad Chaarawi, Ruth Aydt, Babak Behzad, Huong Luu, Karen Schuchardt, Bruce Palmer, "Updates from the ExaHDF5 project: Trillion particle run, Auto-Tuning and the Virtual Object Layer", DOE Exascale Research Conference, 2012,

E. Wes Bethel, David Camp, Hank Childs, Mark Howison, Hari Krishnan, Burlen Loring, Jörg Meyer, Prabhat, Oliver Rübel, Daniela Ushizima, Gunther Weber, "Towards Exascale: High Performance Visualization and Analytics – Project Status Report", 2012,

Oliver Rubel, Wes Bethel, Prabhat, Kesheng Wu, "Query Driven Visualization and Analysis", High Performance Visualization: Enabling Extreme-Scale Scientific Insight, (CRC Press: 2012)

Prabhat, 13TB, 80,000 cores and TECA: The search for extreme events in climate datasets, American Geophysical Union Meeting, 2012,

Michael Wehner, Surendra Byna, Prabhat, Thomas Yopes, John Wu, "Atmospheric Rivers in the CMIP3/5 Historical and Projection Simulations", World Climate Research Programme (WCRP) Workshop on CMIP5 Model Analysis, 2012,

Mehmet Balman, Eric Pouyoul, Yushu Yao, Burlen Loring, E. Wes Bethel, Prabhat, John Shalf, Alex Sim, Brian L. Tierney, "Experiences with 100G Network Applications", Proceedings of the 5th International Workshop on Data Intensive and Distributed Computing (DIDC 2012), Delft, Netherlands, 2012,

Prabhat, Oliver Rübel, Surendra Byna, Kesheng Wu, Fuyu Li, Michael Wehner, E. Wes Bethel, "TECA: A Parallel Toolkit for Extreme Climate Analysis", Third Workshop on Data Mining in Earth System Science (DMESS 2012) at the International Conference on Computational Science (ICCS 2012), Omaha, Nebraska, 2012,

Michael F. Wehner, Prabhat, Surendra Byna, Fuyu Li, Erin LeDell, Thomas Yopes, Gunther Weber, Wes Bethel, and William D. Collins, TECA, 13TB, 80,000 Processors: Or: Characterizing Extreme Weather in a Changing Climate, Second Workshop on Understanding Climate Change from Data, 2012,

Michael Wehner, Kevin Reed, Prabhat, Surendra Byna, William D. Collins, Fuyu Li, Travis O'Brien, Julio Bacmeister, Andrew Gettelman, High Resolution CAM5.1 Simulations, 14th International Specialist Meeting on the Next Generation Models of Climate and Sustainability for Advanced High Performance Computing Facilities, 2012,

Wei-Chen Chen, George Ostrouchov, David Pugmire, Prabhat, Michael Wehner, Model Based Clustering Analysis of Large Climate Simulation Datasets, Joint Statistical Meeting, 2012,

2011

R. Ryne, B. Austin, J. Byrd, J. Corlett, E. Esarey, C. G. R. Geddes, W. Leemans, Prabhat, X. Li, J. Qiang, O. Rübel, J.-L. Vay, M. Venturini, K. Wu, B. Carlsten, D. Higdon, N. Yampolsky, "High Performance Computing in Accelerator Science: Past Successes, Future Challenges", ASCR/BES Workshop on Data and Communications in Basic Energy Sciences: Creating a Pathway for Scientific Discovery, 2011,

Michael Philpott, Prabhat, Yoshiyuki Kawazoe, "Magnetism and Bonding in Graphene Nanodots with H modified Edge and Apex", J. Chem. Phys., 2011, 135,

Richard Martin, Prabhat, David Donofrio, Maciek Haranczyk, "Accelerating Analysis of void spaces in porous materials on multicore and GPU platforms", International Journal of High Performance Computing Applications, 2011, doi: 10.1177/1094342011431591

Surendra Byna, Prabhat, Michael F. Wehner, Kesheng John Wu, "Detecting atmospheric rivers in large climate datasets", PDAC 11, New York, NY, USA, ACM, 2011, 7--14, doi: 10.1145/2110205.2110208

Wei Zhuo, Prabhat, Chris Paciorek, Cari Kaufman, "Distributed Kriging Analysis for Large Spatial Data", ICDM 11, 2011,

Prabhat, Suren Byna, Chris Paciorek, Gunther Weber, Kesheng Wu, Thomas Yopes, Michael Wehner, William Collins, George Ostrouchov, Richard Strelitz, E. Wes Bethel, "Pattern Detection and Extreme Value Analysis on Large Climate Data", DOE/BER Climate and Earth System Modeling PI Meeting, 2011,

Chris Paciorek, Michael Wehner, Prabhat, "Computationally-efficient Spatial Analysis of Precipitation Extremes Using Local Likelihood", Statistical and Applied Mathematical Sciences Uncertainty Quantification program, Climate Modeling Opening Workshop, 2011,

Prabhat, "ExaHDF5: An I/O Platform for Exascale Data Models, Analysis and Performance", DOE/ASCR Exascale kickoff meeting, 2011,

Jerry Chou, Kesheng Wu, Prabhat, "FastQuery: A Parallel Indexing System for Scientific Data", CLUSTER 11, Washington, DC, USA, IEEE Computer Society, 2011, 455--464, doi: 10.1109/CLUSTER.2011.86

Jerry Chou, Kesheng Wu, Prabhat, "FastQuery: A general Index and Query system for scientific data", Scientific and Statistical Database Management Conference, 2011,

Richard Martin, Prabhat, Maciek Haranczyk, James Sethian, "PDE-based analysis of void space of porous materials on multicore CPUs", Manycore and Accelerator-based High-performance Scientific Computing, 2011,

Richard Martin, Thomas Willems, Chris Rycroft, Prabhat, Michael Kazi, Maciek Haranczyk, "High Throughput structure analysis and descriptor generation for crystalline porous materials", International Conference on Chemical Structures, 2011,

Prabhat, Scientific Visualization 101, LBL Open House, 2011,

Prabhat, Pattern Detection and Extreme Value Analysis for Climate Data, American Geophysical Union Meeting, 2011,

Prabhat, Visualization and Analysis of Global Cloud Resolving Models, Geophysical Fluid Dynamics Lab, 2011,

2010

Oliver Rübel, Sean Ahern, E. Wes Bethel, Mark D. Biggin, Hank Childs, Estelle Cormier-Michel, Angela DePace, Michael B. Eisen, Charless C. Fowlkes, Cameron G. R. Geddes, Hans Hagen, Bernd Hamann, Min-Yu Huang, Soile V. E. Keränen, David W. Knowles, Cris L. Luengo Hendriks, Jitendra Malik, Jeremy Meredith, Peter Messmer, Prabhat, Daniela Ushizima, Gunther H. Weber, Kesheng Wu, "Coupling Visualization and Data Analysis for Knowledge Discovery from Multi-Dimensional Scientific Data", Proceedings of International Conference on Computational Science, ICCS 2010, May 2010,

H. Childs, D. Pugmire, S. Ahern, B. Whitlock, M. Howison, Prabhat, G. Weber, E. W. Bethel, "Extreme Scaling of Production Visualization Software on Diverse Architectures", Computer Graphics and Applications, May 2010, 30 (3):22 - 31,

Daniela Ushizima, Cameron Geddes, Estelle Cormier-Michel, E. Wes Bethel, Janet Jacobsen, Prabhat, Oliver Rübel, Gunther Weber, Bernd Hamann, Peter Messmer, and Hans Hagen, "Automated Detection and Analysis of Particle Beams in Laser-Plasma Accelerator Simulations", In-Tech, ( 2010) Pages: 367 - 389

Mark Howison, Andreas Adelmann, E. Wes Bethel, Achim Gsell, Benedikt Oswald, Prabhat, "H5hut: A High-Performance I/O Library for Particle-Based Simulations", Proceedings of 2010 Workshop on Interfaces and Abstractions for Scientific Data Storage (IASDS10), Heraklion, Crete, Greece, January 1, 2010,

G. H. Weber, S. Ahern, E. W. Bethel, S. Borovikov, H. R. Childs, E. Deines, C. Garth, H. Hagen, B. Hamann, K. I. Joy, D. Martin, J. Meredith, Prabhat, D. Pugmire, O. Rübel, B. Van Straalen, and K. Wu, "Recent Advances in VisIt: AMR Streamlines and Query-Driven Visualization", Numerical Modeling of Space Plasma Flows: Astronum-2009 (Astronomical Society of the Pacific Conference Series), 2010, 429:329–334,

Tina Zhuo, Prabhat, Chris Paciorek, Cari Kaufman, "Distributed Likelihoods Computation for Large Spatial Data", SuperComputing, 2010,

Prabhat, Scientific Visualization, LBL Open House, 2010,

Prabhat, Visualization and Analysis of Climate data, LBL Earth Sciences Division Seminar, 2010,

2009

Oliver Rübel, Cameron G. R. Geddes, Estelle Cormier-Michel, Kesheng Wu, Prabhat, Gunther H. Weber, Daniela M. Ushizima, Peter Messmer, Hans Hagen, Bernd Hamann, Wes Bethel, "Automatic Beam Path Analysis of Laser Wakefield Particle Acceleration Data", IOP Computational Science & Discovery, ( 2009)

E. W. Bethel, C. Johnson, S. Ahern, J. Bell, P.-T. Bremer, H. Childs, E. Cormier-Michel, M. Day, E. Deines, T. Fogal, C. Garth, C. G. R. Geddes, H. Hagen, B. Hamann, C. Hansen, J. Jacobsen, K. Joy, J. Krüger, J. Meredith, P. Messmer, G. Ostrouchov, V. Pascucci, K. Potter, Prabhat, D. Pugmire, O. Rübel, A. Sanderson, C. Silva, D. Ushizima, G. Weber, B. Whitlock, K. Wu, "Occam's Razor and Petascale Visual Data Analysis", Journal of Physics Conference Series, Proceedings of SciDAC 2009, 2009, 180:012084,

2008

Oliver Rübel, Prabhat, Kesheng Wu, Hank Childs, Jeremy Meredith, Cameron G. R. Geddes, Estelle Cormier-Michel, Sean Ahern, Gunther H. Weber, Peter Messmer, Hans Hagen, Bernd Hamann, E. Wes Bethel, "High Performance Multivariate Visual Data Exploration for Extremely Large Data", SuperComputing 2008 (SC08), Austin, Texas, USA, 2008,

Matthew Andrews

2014

Clayton Bagwell, Allison Andrews, Automated Provisioning and Management of NGF via NIM, September 10, 2014,

2011

J. Hick, M. Andrews, Leveraging the Business Value of Tape, FujiFilm Executive IT Summit 2011, June 9, 2011,

Describes how tape is used in the HPSS Archive and HPSS Backup systems at NERSC. Includes some examples of our organization's tape policies, our roadmap to Exascale and an example of tape in the Exascale Era, our observed tape reliability, and an overview of our locally developed Parallel Incremental Backup System (PIBS), which performs backups of our NGF file system.

2010

Neal Master, Matthew Andrews, Jason Hick, Shane Canon, Nicholas J. Wright, "Performance Analysis of Commodity and Enterprise Class Flash Devices", Petascale Data Storage Workshop (PDSW), November 2010,

Katie Antypas

2016

Debbie Bard, Wahid Bhimji, David Paul, Glenn K Lockwood, Nicholas J Wright, Katie Antypas, Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, et al., "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016, LBNL-1007120,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, et al., "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

Salman Habib, Robert Roser (HEP Leads), Richard Gerber, Katie Antypas, Katherine Riley, Tim Williams, Jack Wells, Tjerk Straatsma (ASCR Leads), A. Almgren, J. Amundson, S. Bailey, D. Bard, K. Bloom, B. Bockelman, A. Borgland, J. Borrill, R. Boughezal, R. Brower, B. Cowan, H. Finkel, N. Frontiere, S. Fuess, L. Ge, N. Gnedin, S. Gottlieb, O. Gutsche, T. Han, K. Heitmann, S. Hoeche, K. Ko, O. Kononenko, T. LeCompte, Z. Li, Z. Lukic, W. Mori, P. Nugent, C.-K. Ng, G. Oleynik, B. O'Shea, N. Padmanabhan, D. Petravick, F.J. Petriello, J. Power, J. Qiang, L. Reina, T.J. Rizzo, R. Ryne, M. Schram, P. Spentzouris, D. Toussaint, J.-L. Vay, B. Viren, F. Wurthwein, L. Xiao, "ASCR/HEP Exascale Requirements Review Report", arXiv:1603.09303 [physics.comp-ph], March 31, 2016,

2015

Richard A. Gerber, Katie Antypas, Sudip Dosanjh, Jack Deslippe, Nick Wright, Jay Srinivasan, Systems Roadmap and Plans for Supporting Extreme Data Science, December 10, 2015,

Yun (Helen) He, Alice Koniges, Richard Gerber, Katie Antypas, Using OpenMP at NERSC, OpenMPCon 2015, invited talk, September 30, 2015,

N.J. Wright, S. S. Dosanjh, A. K. Andrews, K. Antypas, B. Draney, R.S. Canon, S. Cholia, C.S. Daley, K. M. Fagnan, R.A. Gerber, L. Gerhardt, L. Pezzaglia, Prabhat, K.H. Schafer, J. Srinivasan, "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82, ( June 2015) doi: 10.3233/978-1-61499-583-8-82

Extreme data science is becoming increasingly important at the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC). Many petabytes of data are transferred from experimental facilities to NERSC each year. Applications of importance include high-energy physics, materials science, genomics, and climate modeling, with an increasing emphasis on large-scale simulations and data analysis. In response to the emerging data-intensive workloads of its users, NERSC made a number of critical design choices to enhance the usability of its pre-exascale supercomputer, Cori, which is scheduled to be delivered in 2016. These data enhancements include a data partition, a layer of NVRAM for accelerating I/O, user defined images and a customizable gateway for accelerating connections to remote experimental facilities.

Katie Antypas, The Cori System, February 24, 2015,

2014

K. Antypas, B.A. Austin, T.L. Butler, R.A. Gerber, C.L. Whitney, N.J. Wright, W. Yang, Z. Zhao, "NERSC Workload Analysis on Hopper", Report, October 17, 2014, LBNL 6804E,

A Dubey, K Antypas, AC Calder, C Daley, B Fryxell, JB Gallagher, DQ Lamb, D Lee, K Olson, LB Reid, P Rich, PM Ricker, KM Riley, R Rosner, A Siegel, NT Taylor, K Weide, FX Timmes, N Vladimirova, J ZuHone, "Evolution of FLASH, a multi-physics scientific simulation code for high-performance computing", The International Journal of High Performance Computing Applications, May 2014, 28:225--237, doi: 10.1177/1094342013505656

2013

Zhengji Zhao, Katie Antypas, Nicholas J Wright, "Effects of Hyper-Threading on the NERSC workload on Edison", 2013 Cray User Group Meeting, May 9, 2013,

Katie Antypas, Best Practices for Reading and Writing Data on HPC Systems, NUG Meeting 2013, February 14, 2013,

Katie Antypas, NERSC-8 Project, NUG Meeting, February 12, 2013,

NERSC-8 Project Overview

A Dubey, K Antypas, A Calder, B Fryxell, D Lamb, P Ricker, L Reid, K Riley, R Rosner, A Siegel, F Timmes, N Vladimirova, K Weide, "The Software Development Process of FLASH, a Multiphysics Simulation Code", 2013, 1--8,

2012

Zhengji Zhao, Mike Davis, Katie Antypas, Yushu Yao, Rei Lee and Tina Butler, "Shared Library Performance on Hopper", A paper presented at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

Zhengji Zhao, Mike Davis, Katie Antypas, Yushu Yao, Rei Lee and Tina Butler, Shared Library Performance on Hopper, A talk at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

Zhengji Zhao, Yun (Helen) He and Katie Antypas, "Cray Cluster Compatibility Mode on Hopper", A paper presented at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 1, 2012,

Zhengji Zhao, Yun (Helen) He and Katie Antypas, Cray Cluster Compatibility Mode on Hopper, A talk at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 1, 2012,

Yun (Helen) He and Katie Antypas, "Running Large Jobs on a Cray XE6 System", Cray User Group 2012 Meeting, Stuttgart, Germany, April 30, 2012,

A.C. Uselton, K.B. Antypas, D. Ushizima, J. Sukharev, "File System Monitoring as a Window into User I/O Requirements", CUG Proceedings, Edinburgh, Scotland, March 1, 2012,

2011

K. Antypas, Parallel I/O From a User's Perspective, HPC Advisory Council, December 6, 2011,

Zhengji Zhao, Mike Davis, Katie Antypas, Rei Lee and Tina Butler, Shared Library Performance on Hopper, Cray Quarterly Meeting, St. Paul, MN, October 26, 2011,

Yun (Helen) He and Katie Antypas, Mysterious Error Messages on Hopper, NERSC/Cray Quarterly Meeting, July 25, 2011,

Katie Antypas, Tina Butler, Jonathan Carter, "The Hopper System: How the Largest XE6 in the World went from Requirements to Reality", Cray User Group Proceedings, May 31, 2011,

K. Antypas, Y. He, "Transitioning Users from the Franklin XT4 System to the Hopper XE6 System", Cray User Group 2011 Proceedings, Fairbanks, Alaska, May 2011,

The Hopper XE6 system, NERSC’s first petaflop system with over 153,000 cores, has increased the computing hours available to the Department of Energy’s Office of Science users by more than a factor of 4. As NERSC users transition from the Franklin XT4 system with 4 cores per node to the Hopper XE6 system with 24 cores per node, they have had to adapt to a lower amount of memory per core and on-node I/O performance which does not scale up linearly with the number of cores per node. This paper will discuss Hopper’s usage during the “early user period” and examine the practical implications of running on a system with 24 cores per node, exploring advanced aprun and memory affinity options for typical NERSC applications as well as strategies to improve I/O performance.
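
A quick back-of-the-envelope makes the memory-per-core drop concrete (the node memory sizes here are the commonly quoted configurations for these systems, stated as assumptions rather than taken from the paper): Franklin's 4-core XT4 nodes with 8 GB provided 8 GB / 4 cores = 2 GB per core, while Hopper's 24-core XE6 nodes with 32 GB provide 32 GB / 24 cores ≈ 1.33 GB per core, roughly a third less memory per core even before OS and buffer overheads.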

Katie Antypas, Yun (Helen) He, Transitioning Users from the Franklin XT4 System to the Hopper XE6 System, Cray User Group 2011, Fairbanks, AK, May 2011,

K. Antypas, The Hopper XE6 System: Delivering High End Computing to the Nation’s Science and Research Community, Cray Quarterly Review, April 1, 2011,

2010

K. Antypas, Introduction to Parallel I/O, ASTROSIM 2010 Workshop, July 19, 2010,

A. Uselton, K. Antypas, D. M. Ushizima, J. Sukharev, "File System Monitoring as a Window into User I/O Requirements", Proceedings of the 2010 Cray User Group Meeting, Edinburgh, Scotland, May 24, 2010,

2009

K. Antypas, NERSC: Delivering High End Scientific Computing to the Nation's Research Community, November 5, 2009,

A. Dubey, K. Antypas, M.K. Ganapathy, L.B. Reid, K.M. Riley, D. Sheeler, A. Siegel, K. Weide, "Extensible Component Based Architecture for FLASH: A Massively Parallel, Multiphysics Simulation Code", Parallel Computing, July 1, 2009, 35 (10-11):512-522,

K. Antypas and A. Uselton, "MPI-I/O on Franklin XT4 System at NERSC", CUG Proceedings, Atlanta, GA, May 28, 2009,

A Dubey, K Antypas, MK Ganapathy, LB Reid, K Riley, D Sheeler, A Siegel, K Weide, "Extensible component-based architecture for FLASH, a massively parallel, multiphysics simulation code", Parallel Computing, 2009, 35:512--522, doi: 10.1016/j.parco.2009.08.001

2008

H. Shan, K. Antypas, J. Shalf, "Characterizing and Predicting the I/O Performance of HPC Applications Using a Parameterized Synthetic Benchmark", Supercomputing, Reno, NV, November 17, 2008,

Antypas, K., Shalf, J., Wasserman, H., "NERSC-6 Workload Analysis and Benchmark Selection Process", LBNL Technical Report, August 13, 2008, LBNL 1014E,

Science drivers for NERSC-6

R. Fisher, S. Abarzhi, K. Antypas, S. M. Asida, A. C. Calder, F. Cattaneo, P. Constantin, A. Dubey, I. Foster, J. B. Gallagher, M. K. Ganapathy, C.C. Glendenin, L. Kadanoff, D.Q. Lamb, S. Needham, M. Papka, T. Plewa, L.B. Reid, P. Rich, K. Riley, and D. Sheeler, "Tera-scale Turbulence Computation on BG/L Using the FLASH3 Code", IBM Journal of Research and Development, March 1, 2008, Vol 52:127-136,

K. Antypas, J. Shalf, H. Wasserman, "NERSC-6 Workload Analysis and Benchmark Selection Process", January 1, 2008,

John Shalf, Hongzhang Shan, Katie Antypas, I/O Requirements for HPC Applications, talk, January 1, 2008,

J. Shalf, K. Antypas, H.J. Wasserman, Recent Workload Characterization Activities at NERSC, Santa Fe Workshop, January 1, 2008,

2006

A. C. Calder, N. T. Taylor, K. Antypas, and D. Sheeler, "A Case Study of Verifying and Validating an Astrophysical Simulation Code", Astronomical Society of the Pacific, March 26, 2006, 119,

K.B. Antypas, A. C. Calder, A. Dubey, J. B. Gallagher, J. Joshi, D. Q. Lamb, T. Linde, E. Lusk, O. E. B. Messer, A. Mignone, H. Pan, M. Papka, F. Peng, T. Plewa, P. M. Ricker, K. Riley, D. Sheeler, A. Siegel, N. Taylor, J. W. Truran, N. Vladimirova, G. Weirs, D. Yu, Z. Zhang, "FLASH: Applications and Future", Parallel Computational Fluid Dynamics 2005: Theory and Applications, edited by A. Deane, G. Brenner, A. Ecer, D. R. Emerson, J. McDonough, J. Periaux, N. Satofuka, D. Tromeur-Dervout, January 1, 2006, 325,

2005

KB Antypas, AC Calder, A Dubey, JB Gallagher, J Joshi, DQ Lamb, T Linde, EL Lusk, OEB Messer, A Mignone, H Pan, M Papka, F Peng, T Plewa, KM Riley, PM Ricker, D Sheeler, A Siegel, N Taylor, JW Truran, N Vladimirova, G Weirs, D Yu, J Zhang, "Parallel Computational Fluid Dynamics 2005", Parallel Computational Fluid Dynamics 2005, ( 2005) Pages: 325--331

Brian Austin

2015

Brian Austin, Hardware Trends and Challenges for Computational Chemistry, Pacifichem, December 18, 2015,

Brian Austin, Eric Roman, Xiaoye Sherry Li, "Resilient Matrix Multiplication of Hierarchical Semi-Separable Matrices", Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, Portland, OR, June 15, 2015,

Suren Byna, Brian Austin, "Evaluation of Parallel I/O Performance and Energy with Frequency Scaling on Cray XC30", Cray User Group Meeting, April 2015,

Jack Deslippe, Brian Austin, Chris Daley, Woo-Sun Yang, "Lessons learned from optimizing science kernels for Intel's 'Knights-Corner' architecture", CISE, April 1, 2015,

Brian Austin, Alex Druinsky, Xiaoye Sherry Li, Osni A. Marques, Eric Roman, Incorporating Error Detection and Recovery Into Hierarchically Semi-Separable Matrix Operations, SIAM CSE 15, March 17, 2015,

2014

Alex Druinsky, Brian Austin, Xiaoye Sherry Li, Osni Marques, Eric Roman, Samuel Williams, "A Roofline Performance Analysis of an Algebraic Multigrid PDE Solver", SC14, November 2014,

Brian Austin, Nicholas Wright, "Measurement and interpretation of microbenchmark and application energy use on the Cray XC30", Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing, November 2014,

K. Antypas, B.A. Austin, T.L. Butler, R.A. Gerber, C.L. Whitney, N.J. Wright, W. Yang, Z. Zhao, "NERSC Workload Analysis on Hopper", Report, October 17, 2014, LBNL 6804E,

2013

Hongzhang Shan, Brian Austin, Wibe De Jong, Leonid Oliker, Nicholas Wright, Edoardo Apra, "Performance Tuning of Fock Matrix and Two-Electron Integral Calculations for NWChem on Leading HPC Platforms", SC'13, November 11, 2013,

Brian Austin, Matthew Cordery, Harvey Wasserman, Nicholas J. Wright, "Performance Measurements of the NERSC Cray Cascade System", 2013 Cray User Group Meeting, May 9, 2013,

Brian Austin, NERSC, "Characterization of the Cray Aries Network", May 6, 2013,

2012

Hongzhang Shan, Brian Austin, Nicholas Wright, Erich Strohmaier, John Shalf, Katherine Yelick, "Accelerating Applications at Scale Using One-Sided Communication", The 6th Conference on Partitioned Global Address Space Programming Models, Santa Barbara, CA, October 10, 2012,

M. Reinsch, B. Austin, J. Corlett, L. Doolittle, P. Emma, G. Penn, D. Prosnitz, J. Qiang, A. Sessler, M. Venturini, J. Wurtele, "Machine Parameter Studies for an FEL Facility using STAFF", Proceedings of IPAC2012, New Orleans, Louisiana, USA, May 20, 2012, 1768,

D.Y. Zubarev, B.M. Austin, W.A. Lester Jr, "Practical Aspects of Quantum Monte Carlo for the Electronic Structure of Molecules", Practical Aspects of Computational Chemistry I: An Overview of the Last Two Decades and Current Trends, January 1, 2012, 255,

D.Y. Zubarev, B.M. Austin, W.A. Lester Jr, "Quantum Monte Carlo for the x-ray absorption spectrum of pyrrole at the nitrogen K-edge", The Journal of chemical physics, January 1, 2012, 136:144301,

Erin LeDell, Prabhat, Dmitry Yu Zubarev, Brian Austin, William A. Lester, Jr., "Classification of Nodal Pockets in Many-Electron Wave Functions via Machine Learning", Journal of Mathematical Chemistry, January 1, 2012, 50:2043,

Oliver Ruebel, Cameron Geddes, Min Chen, Estelle Cormier, Ji Qiang, Rob Ryne, Jean-Luc Vay, David Grote, Jerry Chou, Kesheng Wu, Mark Howison, Prabhat, Brian Austin, Arie Shoshani, E. Wes Bethel, "Scalable Data Management, Analysis and Visualization of Particle Accelerator Simulation Data", SciDAC 3 Principal Investigator Meeting, 2012,

2011

J.N. Corlett, B. Austin, K.M. Baptiste, J.M. Byrd, P. Denes, R. Donahue, L. Doolittle, R.W. Falcone, D. Filippetto, S. Fournier, D. Li, H.A. Padmore, C. Papadopoulos, C. Pappas, G. Penn, M. Placidi, S. Prestemon, D. Prosnitz, J. Qiang, A. Ratti, M. Reinsch, F. Sannibale, R. Schlueter, R.W. Schoenlein, J.W. Staples, T. Vecchione, M. Venturini, R. Wells, R. Wilcox, J. Wurtele, A. Charman, E. Kur, A.A. Zholents, "A Next Generation Light Source Facility at LBNL", PAC 11 Conference Proceedings, January 1, 2011,

Brian Austin, Ji Qiang, Jonathan Wurtele, Alice Koniges, "Influences of architecture and threading on the MPI communication strategies in an accelerator simulation code.", SciDAC 2011, Denver, CO, 2011,

Jerry Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E. Wes Bethel, Arie Shoshani, Oliver Rübel, Prabhat, Rob D. Ryne, "Parallel Index and Query for Large Scale Data Analysis", SC 11, Seattle, WA, USA, January 1, 2011, 30:1-30:1, doi: 10.1145/2063384.2063424

B.M. Austin, D.Y. Zubarev, W.A. Lester, "Quantum Monte Carlo and Related Approaches", Chemical Reviews, January 1, 2011,

Matthias Reinsch, Brian Austin, John Corlett, Lawrence Doolittle, Gregory Penn, Donald Prosnitz, Ji Qiang, Andrew Sessler, Marco Venturini, Jonathan Wurtele, "System Trade Analysis for an FEL Facility", Free Electron Laser Conference FEL 2011, Shanghai, China, January 1, 2011,

2010

Jinhua Wang, Dominik Domin, Brian Austin, Dmitry Yu Zubarev, Jarrod McClean, Michael Frenklach, Tian Cui, William A. Lester, Jr., "A Diffusion Monte Carlo Study of the O-H Bond Dissociation of Phenol", J. Phys. Chem. A, January 1, 2010, 114:9832,

William A. Lester, Brian Austin, "Fixed-Node Correlation Function Diffusion Monte Carlo: an approach to Fermi excited states", Bulletin of the American Physical Society, January 1, 2010,

Naoto Umezawa, Brian Austin, "Self-interaction-free nonlocal correlation energy functional associated with a Jastrow function", Bulletin of the American Physical Society, January 1, 2010, 55,

2009

Naoto Umezawa, Brian Austin, William A. Lester, Jr., Effective one-body potential fitted for many-body interactions associated with a Jastrow function: application to the quantum Monte Carlo calculations, Bulletin of the American Physical Society, January 1, 2009,

2006

B. Austin, A. Aspuru-Guzik, R. Salomon-Ferrer, W.A. Lester, Jr., "Linear-Scaling Evaluation of the Local Energy in Quantum Monte Carlo", Advances in Quantum Monte Carlo, American Chemical Society, January 1, 2006,

Alex Sodt, Greg J. O. Beran, Yousung Jung, Brian Austin, Martin Head-Gordon, "A Fast Implementation of Perfect Pairing and Imperfect Pairing Using the Resolution of the Identity Approximation", Journal of Chemical Theory and Computation, January 1, 2006, 2:300-305,

2005

A. Aspuru-Guzik, R. Salomón-Ferrer, B. Austin, R. Perusquía-Flores, M.A. Griffin, R.A. Oliva, D. Skinner, D. Domin, W.A. Lester, Jr., "Zori 1.0: A parallel quantum Monte Carlo electronic structure package", Journal of Computational Chemistry, January 1, 2005, 26:856-862,

A. Aspuru-Guzik, R. Salomon-Ferrer, B. Austin, W.A. Lester, Jr., "A sparse algorithm for the evaluation of the local energy in quantum Monte Carlo", J. Comp. Chem., January 1, 2005, 26:708,

Gregory J. O. Beran, Brian Austin, Alex Sodt, Martin Head-Gordon, "Unrestricted Perfect Pairing: The Simplest Wave-Function-Based Model Chemistry beyond Mean Field", The Journal of Physical Chemistry A, January 1, 2005, 109:9183,

Clayton L. Bagwell, Jr.

2016

Clayton Bagwell, Richard Gerber, NUG 2016 Business Meeting: Allocations, NUG Business Meeting presentation, March 24, 2016,

NUG (NERSC Users Group) Business meeting: Allocations

Clayton Bagwell, NUG 2016 New User Training - Accounts & Allocations, NUG New User Training presentation, March 21, 2016,

NUG (NERSC Users Group) New User Training; Accounts and Allocations

Clayton Bagwell, Richard Gerber, NERSC Brown Bag: Allocations, NERSC Brown Bag presentation, March 17, 2016,

Brown Bag presentation to NERSC staff on how Allocations work and the new scavenger queues.

2015

Clayton L. Bagwell, Jr., How to Submit a 2016 ERCAP Request, August 21, 2015,

Clayton Bagwell, Allocating Fixed Hours to Users Instead of Repo Percentage in NIM, 2015 NUG (NERSC Users Group) Presentation, May 14, 2015,

2014

Clayton Bagwell, Allison Andrews, Automated Provisioning and Management of NGF via NIM, September 10, 2014,

2013

Clayton Bagwell, How to Submit a 2014 ERCAP Request, September 16, 2013,

Video presentation and accompanying PowerPoint slides on "How to Submit a 2014 ERCAP Request".

2012

Clayton Bagwell, How to Submit an ERCAP Request, September 6, 2012,

Jan Balewski

2014

Nicholas Balthaser

2016

Lisa Gerhardt, Jeff Porter, Nick Balthaser, Lessons Learned from Running an HPSS Globus Endpoint, 2016 HPSS User Forum, September 1, 2016,

The NERSC division of LBNL has been running HPSS in production since 1998. The archive is quite popular, with roughly 100 TB of I/O every day from the ~6000 scientists that use the NERSC facility. We maintain a Globus-HPSS endpoint that transfers over 1 PB/month of data into and out of HPSS. Getting Globus and HPSS to mesh well can be challenging. This talk gives an overview of some of the lessons learned.
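
For readers unfamiliar with the moving parts, the sketch below shows roughly what a scripted transfer to a Globus endpoint looks like using the globus-sdk Python package. It is a minimal illustration under assumptions, not NERSC's production tooling: the access token, endpoint UUIDs, and file paths are hypothetical placeholders.

    # Minimal sketch: submit a Globus transfer to an HPSS-backed endpoint.
    # The token, endpoint UUIDs, and paths below are hypothetical placeholders.
    import globus_sdk

    TOKEN = "..."  # a Globus transfer access token, obtained out of band
    SRC = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"  # e.g. a scratch filesystem endpoint
    DST = "ffffffff-0000-1111-2222-333333333333"  # e.g. an HPSS archive endpoint

    tc = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(TOKEN))

    # Checksum verification is worth the extra cost when the destination is an archive.
    tdata = globus_sdk.TransferData(tc, SRC, DST,
                                    label="archive run output",
                                    sync_level="checksum")
    tdata.add_item("/scratch/run042/output.tar", "/home/u/user/run042/output.tar")

    task = tc.submit_transfer(tdata)
    print("submitted Globus task", task["task_id"])

One general point such a setup surfaces (a property of HPSS rather than a claim from this talk): tape-backed endpoints behave far better with a few large files than with many small ones, so bundling output into archives before transfer pays off.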

2014

Nicholas Balthaser, Lisa Gerhardt, NERSC Archival Storage: Best Practices, Joint Facilities User Forum on Data-Intensive Computing, June 18, 2014,

Nick Balthaser, NERSC; Lisa Gerhardt, NERSC, Introduction to NERSC Archival Storage: HPSS, February 3, 2014,

2013

N. Balthaser, GlobusOnline/HPSS Live Demo, HUF 2013, November 5, 2013,

Live demonstration using the GlobusOnline data transfer software to store files to the NERSC archive, for the 2013 HPSS Users Forum meeting.

N. Balthaser, LBNL/NERSC Site Report: HPSS in Production, HUF 2013, November 5, 2013,

Overview of HPSS infrastructure and practices at LBNL/NERSC, for the 2013 HPSS Users Forum meeting.

N. Balthaser, W. Hurlbert, T10KC Technology in Production, May 9, 2013,

Report to the 2012 Large Tape User Group meeting regarding our production statistics and experiences using the Oracle T10000C tape drive.

2012

N. Balthaser, J. Hick, W. Hurlbert, StorageTek Tape Analytics: Pre-Release Evaluation at LBNL, LTUG 2012, April 25, 2012,

A report to the Large Tape Users Group (LTUG) annual conference on a pre-release evaluation of the new software product, StorageTek Tape Analytics (STA).  We provide a user's perspective on what we found useful, some suggestions for improvement, and some key new features that would enhance the product.

2011

N. Balthaser, D. Hazen, "HSI Best Practices for NERSC Users", May 2, 2011, LBNL 4745E,

In this paper we explain how to obtain and install HSI, create a NERSC authentication token, and transfer data to and from the system. Additionally we describe methods to optimize data transfers and avoid common pitfalls that can degrade data transfers and storage system performance.
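
As a concrete illustration of the transfer step (the paper itself is the authority on installation, authentication, and tuning), here is a minimal sketch of driving HSI from Python. It assumes hsi is installed and a NERSC authentication token is already configured, and the file and archive paths are hypothetical placeholders.

    # Minimal sketch: store and retrieve a file with HSI one-line commands.
    # Assumes hsi is installed and authentication is already configured;
    # the local and HPSS paths are hypothetical placeholders.
    import subprocess

    def hsi(command: str) -> None:
        """Run a single HSI command, e.g. 'put local_path : hpss_path'."""
        subprocess.run(["hsi", command], check=True)

    # Store a local file into the archive (syntax: local path : HPSS path).
    hsi("put run042.tar : /home/u/user/run042.tar")

    # Retrieve it later; get uses the same 'local : remote' ordering.
    hsi("get run042_copy.tar : /home/u/user/run042.tar")

In the spirit of the paper's advice on avoiding degraded transfers, bundling many small files into a single archive (for example with htar) before storing is generally much friendlier to the tape system than storing the files individually.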

Debbie Bard

2016

Debbie Bard, Wahid Bhimji, David Paul, Glenn K Lockwood, Nicholas J Wright, Katie Antypas, Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, et al., "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016, LBNL-1007120,

Deborah Bard, Burst Buffers: Early Experiences and Outlook, Supercomputing 2016, November 14, 2016,

The long-awaited Burst Buffer technology is now being deployed on major supercomputing systems, including new machines at NERSC, LANL, ANL, and KAUST. In this BOF, we discuss early experience with Burst Buffers from both a systems and a user’s perspective, including challenges faced and perspectives for future development. Short presentations from early adopters will be followed by general discussion with the audience. We hope that this BOF will attract interest and participation from end-users and software/hardware developers. 

See www.burstbuffer.org for presentations. 

Debbie Bard, Using Containers and HPC to Solve the Mysteries of the Universe, DockerCon 2016, June 27, 2016,

Container technology is being used to answer some of the biggest questions in science today: what is the Universe made of? How has it evolved over time? Scientists use vast quantities of data to study these questions, and analyzing this data requires Big Data solutions on high performance computing resources. In this talk we discuss why containers are being deployed on the Cori supercomputer at NERSC (the National Energy Research Scientific Computing Center) to answer fundamental scientific questions. We will give examples of the use of Docker in simulating complex physical processes and analyzing experimental data in fields as diverse as particle physics, cosmology, astronomy, genomics and material science. We will demonstrate how container technology is being used to facilitate access to scientific computing resources by scientists from around the globe. Finally, we will discuss how container technology has the potential to revolutionize scientific publishing, and could solve the problem of scientific reproducibility.

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, et al., "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

Debbie Bard, Accelerating Science with the NERSC Burst Buffer Early User Program, Salishan Conference on High-Speed Computing, April 28, 2016,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 750 different projects spanning a wide variety of scientific applications, including climate modeling, combustion, fusion, astrophysics, computational biology, and many more. The potential applications of the Burst Buffer at NERSC are therefore also considerable and diverse.

I will discuss the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its different capabilities to enable new scientific advancements. I will present details of the program, in-depth performance results, and lessons learned from highlighted projects.

Salman Habib, Robert Roser (HEP Leads), Richard Gerber, Katie Antypas, Katherine Riley, Tim Williams, Jack Wells, Tjerk Straatsma (ASCR Leads), A. Almgren, J. Amundson, S. Bailey, D. Bard, K. Bloom, B. Bockelman, A. Borgland, J. Borrill, R. Boughezal, R. Brower, B. Cowan, H. Finkel, N. Frontiere, S. Fuess, L. Ge, N. Gnedin, S. Gottlieb, O. Gutsche, T. Han, K. Heitmann, S. Hoeche, K. Ko, O. Kononenko, T. LeCompte, Z. Li, Z. Lukic, W. Mori, P. Nugent, C.-K. Ng, G. Oleynik, B. O'Shea, N. Padmanabhan, D. Petravick, F.J. Petriello, J. Power, J. Qiang, L. Reina, T.J. Rizzo, R. Ryne, M. Schram, P. Spentzouris, D. Toussaint, J.-L. Vay, B. Viren, F. Wurthwein, L. Xiao, "ASCR/HEP Exascale Requirements Review Report", arXiv:1603.09303 [physics.comp-ph], March 31, 2016,

Ankit Bhagatwala

2015

A. Bhagatwala, R. Sankaran, S. Kokjohn, J.H. Chen, "Numerical investigation of spontaneous flame propagation under RCCI conditions", Combustion and Flame, 2015,

A. Bhagatwala, Z. Luo, H. Shen, J.A. Sutton, T. Lu, J.H. Chen, "Numerical and experimental investigation of turbulent DME jet flames", Proceedings of the Combustion Institute, 2015,

A. Krisman, E.R. Hawkes, M. Talei, A. Bhagatwala, J.H. Chen, "Polybrachial structures in dimethyl ether edge-flames at NTC conditions", Proceedings of the Combustion Institute, 2015,

2014

A. Bhagatwala, J.H. Chen, T. Lu, "Direct numerical simulations of HCCI/SACI with ethanol", Combustion and Flame, 2014,

2012

A. Bhagatwala, S.K. Lele, "Interaction of a converging spherical shock wave with isotropic turbulence", Physics of Fluids, 2012,

2011

A. Bhagatwala, S.K. Lele, "Interaction of a Taylor blast wave with isotropic turbulence", Physics of Fluids, 2011,

2010

E. Johnsen, J. Larsson, A. Bhagatwala, W.H. Cabot, P. Moin, B.J. Olson, P. Rawat, S.K. Shankar,
B. Sjogreen, H. Yee, X. Zhong, S.K. Lele,
"Assessment of high resolution methods for numerical simulations of compressible turbulence with shock waves", Journal of Computational Physics, 2010,

2009

A. Bhagatwala, S.K. Lele, "A modified artificial viscosity approach for compressible turbulence simulations", Journal of Computational Physics, 2009,

Wahid Bhimji

2016

Evan Racah, Seyoon Ko, Peter Sadowski, Wahid Bhimji, Craig Tull, Sang-Yun Oh, Pierre Baldi, Prabhat, "Revealing Fundamental Physics from the Daya Bay Neutrino Experiment using Deep Neural Networks", ICMLA, 2016,

Debbie Bard, Wahid Bhimji, David Paul, Glenn K Lockwood, Nicholas J Wright, Katie Antypas, Prabhat Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, others, "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016, LBNL LBNL-1007120,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

Mostofa Patwary, Nadathur Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey, "PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures", IPDPS 2016, April 5, 2016,

Yun (Helen) He, Wahid Bhimji, Cori: User Update, NERSC User Group Meeting, March 24, 2016,

2015

T. Maier, D. Benjamin, W. Bhimji, Elmsheuser, P. van Gemmeren, D. Malon, N. Krumnack, "ATLAS I/O performance optimization in as-deployed environments", J. Phys. Conf. Ser., 2015, 664:042033, doi: 10.1088/1742-6596/664/4/042033 ,

Michela Massimi, Wahid Bhimji, "Computer simulations and experiments: The case of the Higgs boson", Stud. Hist. Philos. Mod. Phys., 2015, 51:71-81, doi: 10.1016/j.shpsb.2015.06.003

Georges Aad, others (ATLAS and CMS Collaborations), "Combined Measurement of the Higgs Boson Mass in pp collisions at sqrt{s}=7 and 8 TeV with the ATLAS and CMS Experiments", Phys. Rev. Lett., 2015, 114:191803, doi: 10.1103/PhysRevLett.114.191803

Georges Aad, others (ATLAS Collaboration), "Identification of Boosted, Hadronically Decaying W Bosons and Comparisons with ATLAS Data Taken at sqrt(s) = 8 TeV", Submitted to Eur. Phys. J. C, 2015,

Johannes Paul Blaschke

2016

Johannes Blaschke, Maurice Maurer, Karthik Menon, Andreas Zoettl, Holger Stark, "Phase separation and coexistence of hydrodynamically interacting microswimmers", Soft Matter, 2016, 12:9821-9831, doi: 10.1039/C6SM02042A

2013

Klaus Roeller, Johannes Blaschke, Stephan Herminghaus, Juergen Vollmer, "Arrest of the flow of wet granular matter", Journal of Fluid Mechanics, 2013, 738:407--422, doi: 10.1017/jfm.2013.587

Johannes Blaschke, Juergen Vollmer, "Granular Brownian motors: Role of gas anisotropy and inelasticity", Phys. Rev. E, 2013, 87:040201, doi: 10.1103/PhysRevE.87.040201

2012

Johannes Blaschke, Tobias Lapp, Bjoern Hof, Juergen Vollmer, "Breath Figures: Nucleation, Growth, Coalescence, and the Size Distribution of Droplets", Phys. Rev. Lett., 2012, 109:068701, doi: 10.1103/PhysRevLett.109.068701

2009

Paul Martinsen, Johannes Blaschke, Rainer Kuennemeyer, Robert Jordan, "Accelerating Monte Carlo simulations with an NVIDIA graphics processor", Comput Phys Commun, 2009, 180:1983--1989, doi: 10.1016/j.cpc.2009.05.013

James F Botts

2016

Douglas M. Jacobsen, James F. Botts, and Yun (Helen) He, "SLURM. Our Way.", Cray User Group Meeting 2016, London, England, May 2016,

Douglas M. Jacobsen, James F. Botts, and Yun (Helen) He, SLURM. Our Way., Cray User Group Meeting 2016, London, England, May 12, 2016,

Scott Campbell

2015

Massimiliano Albanese, Michael Berry, David Brown, Scott Campbell, Stephen Crago, George Cybenko, Jon DeLapp, Christopher L. DeMarco, Jeff Draper, Manuel Egele, Stephan Eidenbenz, Tina Eliassi-Rad, Vergle Gipson, Ryan Goodfellow, Paul Hovland, Sushil Jajodia, Cliff Joslyn, Alex Kent, Sandy Landsberg, Larry Lanes, Carolyn Lauzon, Steven Lee, Sven Leyffer, Robert Lucas, David Manz, Celeste Matarazzo, Jackson R. Mayo, Anita Nikolich, Masood Parvania, Garrett Payer, Sean Peisert, Ali Pinar, Thomas Potok, Stacy Prowell, Eric Roman, David Sarmanian, Dylan Schmorrow, Chris Strasburg, V.S. Subrahmanian, Vipin Swarup, Brian Tierney, Von Welch, "ASCR Cybersecurity for Scientific Computing Integrity", DOE Workshop Report, January 7, 2015,

At the request of the U.S. Department of Energy’s (DOE) Advanced Scientific Computing Research (ASCR) program, a workshop was held January 7–9, 2015, in Rockville, Md., to examine computer security research gaps and approaches for assuring scientific computing integrity specific to the mission of the DOE Office of Science. Issues included research computation and simulation that takes place on ASCR computing facilities and networks, as well as network-connected scientific instruments, such as those run by other DOE Office of Science programs. Workshop participants included researchers and operational staff from DOE national laboratories, as well as academic researchers and industry experts. Participants were selected based on the prior submission of abstracts relating to the topic. Additional input came from previous DOE workshop reports [DOE08,BB09] relating to security. Several observers from DOE and the National Science Foundation also attended.

2014

Scott Campbell, "Open Science, Open Security", 9th International Workshop on Security and High Performance Computing Systems, July 22, 2014,

We propose that, to address the growing problems with complexity and data volumes in HPC security, we need to refactor how we look at data by creating tools that not only select data but also analyze and represent it in a manner well suited to intuitive analysis. We propose a set of rules describing what this means, and provide a number of production-quality tools that represent our current best effort at implementing these ideas.

Michael Bailey, Scott Campbell, Michael Corn, Deborah A. Frincke, Ardoth Hassler, Craig Jackson, James A. Marsteller, Rodney J. Petersen, Mark Servilla, Von Welch, "Report of the 2013 NSF Cybersecurity Summit for Cyberinfrastructure and Large Facilities Designing Cybersecurity Programs in Support of Science", February 5, 2014,

2012

Gemmill, Jill, et al, "Security at the Cyberborder, Workshop Report", March 28, 2012,

Scott Campbell, Jason Lee, "Prototyping a 100G Monitoring System", 20th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP 2012), February 12, 2012,

With the finalization of the 100 Gbps Ethernet specification, traffic at these rates is now arriving in data centers, and the need to perform security monitoring at 100 Gbps is no longer simply an academic exercise. We show that by leveraging the ‘heavy tail flow effect’ on the IDS infrastructure, it is possible to perform security analysis at such speeds within the HPC environment. Additionally, we examine the nature of current traffic characteristics and how to scale an IDS infrastructure to 100 Gbps.

2011

Katherine Yelick, Susan Coghlan, Brent Draney, Richard Shane Canon, Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Anping Liu, Scott Campbell, Piotr T. Zbiegiel, Tina Declerck, Paul Rich, "The Magellan Report on Cloud Computing for Science", U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), December 2011,

Scott Campbell, Jason Lee, "Intrusion Detection at 100G", The International Conference for High Performance Computing, Networking, Storage, and Analysis, November 14, 2011,

Driven by the growing data transfer needs of the scientific community and the standardization of the 100 Gbps Ethernet Specification, 100 Gbps is now becoming a reality for many HPC sites. This tenfold increase in bandwidth creates a number of significant technical challenges. We show that by using the heavy tail flow effect as a filter, it should be possible to perform active IDS analysis at this traffic rate using a cluster of commodity systems driven by a dedicated load balancing mechanism. Additionally, we examine the nature of current network traffic characteristics as they apply at 100 Gbps speeds.
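
The flow-level filtering idea this abstract describes can be illustrated compactly. The sketch below is not the paper's implementation: it keeps a per-flow byte counter and shunts the tail of elephant flows past expensive analysis, so the IDS inspects only the head of each flow, where most of the detection signal lives. The 5-tuple flow key, the 128 KB cutoff, and the deep_inspect placeholder are assumptions for illustration.

```python
# A minimal sketch of the "heavy tail flow effect" used as an IDS pre-filter:
# most bytes live in a few large flows, so packets beyond a per-flow byte
# cutoff are shunted past deep inspection.
from collections import defaultdict

CUTOFF_BYTES = 128 * 1024  # per-flow inspection budget (assumed value)
flow_bytes = defaultdict(int)

def deep_inspect(packet):
    # placeholder for expensive IDS analysis (e.g., handing off to a Bro worker)
    pass

def handle_packet(packet):
    # packet: dict with 5-tuple fields and a payload length
    key = (packet["src"], packet["sport"], packet["dst"],
           packet["dport"], packet["proto"])
    flow_bytes[key] += packet["length"]
    if flow_bytes[key] <= CUTOFF_BYTES:
        deep_inspect(packet)  # still within the interesting head of the flow
    # else: shunt -- the heavy tail of an elephant flow bypasses analysis
```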

Lavanya Ramakrishnan, Piotr T. Zbiegel, Scott Campbell, Rick Bradshaw, Richard Shane Canon, Susan Coghlan, Iwona Sakrejda, Narayan Desai, Tina Declerck, Anping Liu, "Magellan: Experiences from a Science Cloud", Proceedings of the 2nd International Workshop on Scientific Cloud Computing, ACM ScienceCloud '11, Boulder, Colorado, and New York, NY, 2011, 49 - 58,

Scott Campbell, Steve Chan and Jason Lee, "Detection of Fast Flux Service Networks", Australasian Information Security Conference 2011, January 17, 2011,

Fast Flux Service Networks (FFSN) utilize high availability server techniques for malware distribution. FFSNs are similar to commercial content distribution networks (CDN), such as Akamai, in terms of size, scope, and business model, serving as an outsourced content delivery service for clients. Using an analysis of DNS traffic, we derive a sequential hypothesis testing algorithm based entirely on traffic characteristics and dynamic white listing to provide real time detection of FFSNs in live traffic. We improve on existing work, providing faster and more accurate detection of FFSNs. We also identify a category of hosts not addressed in previous detectors: Open Content Distribution Networks (OCDN), which share many of the characteristics of FFSNs.
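
A toy version of the kind of sequential hypothesis test this abstract describes might look like the sketch below. The per-response feature (low TTL plus a wide A-record set), the likelihoods, and the thresholds are invented for illustration; the paper derives its detector from measured traffic characteristics and adds dynamic whitelisting.

```python
# A toy sequential hypothesis test: accumulate evidence over successive DNS
# answers for a domain and flag it as fast flux once a log-likelihood
# threshold is crossed. All probabilities and thresholds are assumed values.
import math

P_FLUX, P_BENIGN = 0.8, 0.1          # P(feature | FFSN) vs P(feature | benign)
LLR_HIT = math.log(P_FLUX / P_BENIGN)
LLR_MISS = math.log((1 - P_FLUX) / (1 - P_BENIGN))
THRESHOLD = 3 * LLR_HIT               # decide after ~3 net positive observations

def looks_fluxy(answer):
    """Heuristic per-response feature: short TTL and a wide A-record set."""
    return answer["ttl"] < 300 and len(answer["a_records"]) >= 5

def classify(answers, whitelist=frozenset()):
    score = 0.0
    for ans in answers:
        if ans["domain"] in whitelist:  # dynamic whitelisting of known CDNs
            return "benign"
        score += LLR_HIT if looks_fluxy(ans) else LLR_MISS
        if score >= THRESHOLD:
            return "ffsn"
        if score <= -THRESHOLD:
            return "benign"
    return "undecided"
```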

Scott Campbell, "Local System Security via SSHD Instrumentation", USENIX LISA, January 1, 2011,

2009

Scott Campbell, Defense against the cyber dark arts, netWorker, Pages: 40 ff., 2009,

Juan Meza, Scott Campbell, David Bailey, Mathematical and Statistical Opportunities in Cyber Security, arXiv preprint arXiv:0904.1616, 2009,

Scott Campbell, The Last Word - Defense Against the Cyber Dark Arts - Don't expect white or black hats or an orderly distinction between legal and illegal intent in an attacker's motivation, netWorker: The Craft of Network Computing, Pages: 39, 2009,

2006

E. Wes Bethel, Scott Campbell, Eli Dart, Jason Lee, Steven A. Smith, Kurt Stockinger, Brian Tierney, Kesheng Wu, "Interactive Analysis of Large Network Data Collections Using Query-Driven Visualization", DOE Report, September 26, 2006, LBNL 59166,

Realizing operational analytics solutions where large and complex data must be analyzed in a time-critical fashion entails integrating many different types of technology. Considering the extreme scale of contemporary datasets, one significant challenge is to reduce the duty cycle in the analytics discourse process. This paper focuses on an interdisciplinary combination of scientific data management and visualization/analysis technologies targeted at reducing the duty cycle in hypothesis testing and knowledge discovery. We present an application of such a combination in the problem domain of network traffic data analysis. Our performance experiment results, including both serial and parallel scalability tests, show that the combination can dramatically decrease the analytics duty cycle for this particular application. The combination is effectively applied to the analysis of network traffic data to detect slow and distributed scans, which is a difficult-to-detect form of cyberattack. Our approach is sufficiently general to be applied to a diverse set of data understanding problems as well as used in conjunction with a diverse set of analysis and visualization tools.
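
As a rough illustration of the query-driven selection step, the sketch below evaluates a compound range predicate over connection-record columns using boolean bit vectors (standing in for the compressed bitmap indexes a system like FastBit would use) and passes only the qualifying row set to the visualization stage. Column names and thresholds are assumptions.

```python
# A minimal sketch of query-driven selection: instead of scanning and
# rendering all records, evaluate a compound range predicate with
# (here, uncompressed) bit vectors and visualize only the qualifying subset.
import numpy as np

# one column per attribute of a connection record (synthetic data)
dst_port = np.random.randint(0, 65536, size=1_000_000)
duration = np.random.exponential(1.0, size=1_000_000)

# bitmap-style evaluation: each comparison yields a bit vector, combined with &
mask = (dst_port == 22) & (duration < 0.05)   # e.g. "short ssh probes"
hits = np.flatnonzero(mask)
print(f"{hits.size} of {mask.size} records satisfy the query")
# hits (a small index set) is what gets handed to the visualization stage
```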

Stephen Q Lau, Scott Campbell, William T Kramer, Brian L Tierney, Computing protection in open HPC environments, Proceedings of the 2006 ACM/IEEE conference on Supercomputing, Pages: 207, 2006,

Kurt Stockinger, E Bethel, Scott Campbell, Eli Dart, Kesheng Wu, Detecting distributed scans using high-performance query-driven visualization, Proceedings of the 2006 ACM/IEEE conference on Supercomputing, Pages: 82, 2006,

E Wes Bethel, Scott Campbell, Eli Dart, Kurt Stockinger, Kesheng Wu, Accelerating network traffic analytics using query-driven visualization, Visual Analytics Science And Technology, 2006 IEEE Symposium On, Pages: 115--122, 2006,

E Bethel, Scott Campbell, Eli Dart, John Shalf, Kurt Stockinger, Kesheng Wu, High Performance Visualization Using Query-Driven Visualization and Analytics, 2006,

Scott Campbell, How to think about security failures, Communications of the ACM, Pages: 37--39, 2006,

2005

Kurt Stockinger, Kesheng Wu, Scott Campbell, Stephen Lau, Mike Fisk, Eugene Gavrilov, Alex Kent, Christopher E Davis, Rick Olinger, Rob Young, others, Network Traffic Analysis With Query Driven Visualization SC 2005 HPC Analytics Results, Proceedings of the 2005 ACM/IEEE conference on Supercomputing, Pages: 72, 2005,

Daan Camps

2014

Borbala Hunyadi, Daan Camps, Laurent Sorber, Wim Van Paesschen, Maarten De Vos, Sabine Van Huffel, Lieven De Lathauwer, "Block term decomposition for modelling epileptic seizures", EURASIP Journal on Advances in Signal Processing, 2014, 2014, doi: 10.1186/1687-6180-2014-139

Richard Shane Canon

2016

Adam P. Arkin et al, The DOE Systems Biology Knowledgebase (KBase), bioRxiv, December 22, 2016, doi: 10.1101/096354

Shane Canon, Doug Jacobsen, "Shifter: Containers for HPC", Cray User Group, London, England, May 13, 2016,

Container-based computing is rapidly changing the way software is developed, tested, and deployed. This paper builds on previously presented work on a prototype framework for running containers on HPC platforms. We will present a detailed overview of the design and implementation of Shifter, which, in partnership with Cray, has extended the early prototype concepts and is now in production at NERSC. Shifter enables end users to execute containers using images constructed from various methods including the popular Docker-based ecosystem. We will discuss some of the improvements over the initial prototype including an improved image manager, integration with SLURM, integration with the burst buffer, and user controllable volume mounts. In addition, we will discuss lessons learned, performance results, and real-world use cases of Shifter in action. We will also discuss the potential role of containers in scientific and technical computing including how they complement the scientific process. We will conclude with a discussion about the future directions of Shifter.
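
For readers unfamiliar with Shifter, the sketch below shows one way a Python workflow script might launch a containerized task. The shifter --image=... invocation follows Shifter's documented command-line form; the image name and the workload itself are placeholders, and pulling the image ahead of time with shifterimg is assumed.

```python
# A minimal sketch (not NERSC's official interface) of launching a
# containerized task through Shifter from a Python workflow script.
# Assumes the `shifter` command is on PATH and the image has already been
# pulled on the system (e.g. via `shifterimg pull docker:python:3.5`).
import subprocess

image = "docker:python:3.5"  # placeholder image name
cmd = ["shifter", "--image=" + image,
       "python", "-c", "print('hello from inside a container')"]
subprocess.check_call(cmd)   # raises CalledProcessError on a nonzero exit
```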

Jialin Liu, Evan Racah, Quincey Koziol, Richard Shane Canon,
Alex Gittens, Lisa Gerhardt, Suren Byna, Mike F. Ringenburg, Prabhat,
"H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems", Cray User Group, May 13, 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

2015

N.J. Wright, S. S. Dosanjh, A. K. Andrews, K. Antypas, B. Draney, R.S Canon, S. Cholia, C.S. Daley, K. M. Fagnan, R.A. Gerber, L. Gerhardt, L. Pezzaglia, Prabhat, K.H. Schafer, J. Srinivasan, "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82., ( June 2015) doi: 10.3233/978-1-61499-583-8-82

Extreme data science is becoming increasingly important at the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC). Many petabytes of data are transferred from experimental facilities to NERSC each year. Applications of importance include high-energy physics, materials science, genomics, and climate modeling, with an increasing emphasis on large-scale simulations and data analysis. In response to the emerging data-intensive workloads of its users, NERSC made a number of critical design choices to enhance the usability of its pre-exascale supercomputer, Cori, which is scheduled to be delivered in 2016. These data enhancements include a data partition, a layer of NVRAM for accelerating I/O, user defined images and a customizable gateway for accelerating connections to remote experimental facilities.

Doug Jacobsen, Shane Canon, Contain This, Unleashing Docker for HPC, NERSC Webcast, May 15, 2015,

Doug Jacobsen, Shane Canon, "Contain This, Unleashing Docker for HPC", Cray User Group 2015, April 23, 2015,

2014

Justin Blair, Richard S. Canon, Jack Deslippe, Abdelilah Essiari, Alexander Hexemer, Alastair A. MacDowell, Dilworth Y. Parkinson, Simon J. Patton, Lavanya Ramakrishnan, Nobumichi Tamura, Brian L. Tierney, Craig E. Tull, "High performance data management and analysis for tomography", Proc. SPIE 9212, Developments in X-Ray Tomography IX, September 12, 2014,

S. Parete-Koon, B. Caldwell, S. Canon, E. Dart, J. Hick, J. Hill, C. Layton, D. Pelfrey, G. Shipman, D. Skinner, J. Wells, J. Zurawski, "HPC's Pivot to Data", Conference, May 5, 2014,

Computer centers such as NERSC and OLCF have traditionally focused on delivering computational capability that enables breakthrough innovation in a wide range of science domains. Accessing that computational power has required services and tools to move the data from input and output to computation and storage. A pivot to data is occurring in HPC. Data transfer tools and services that were previously peripheral are becoming integral to scientific workflows. Emerging requirements from high-bandwidth detectors, high-throughput screening techniques, highly concurrent simulations, increased focus on uncertainty quantification, and an emerging open-data policy posture toward published research are among the data drivers shaping the networks, file systems, databases, and overall HPC environment. In this paper we explain the pivot to data in HPC through user requirements and the changing resources provided by HPC, with particular focus on data movement. For WAN data transfers we present the results of a study of network performance between centers.

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, ( March 26, 2014)

2013

Jay Srinivasan, Richard Shane Canon, "Evaluation of A Flash Storage Filesystem on the Cray XE-6", CUG 2013, May 2013,

Flash storage and other solid-state storage technologies are increasingly being considered as a way to address the growing gap between computation and I/O. Flash storage has a number of benefits such as good random read performance and lower power consumption. However, it has a number of challenges too, such as high cost and high overhead for write operations. There are a number of ways Flash can be integrated into HPC systems. This paper will discuss some of the approaches and show early results for a Flash file system mounted on a Cray XE-6 using high-performance PCI-e based cards. We also discuss some of the gaps and challenges in integrating flash into HPC systems and potential mitigations, as well as new solid-state storage technologies and their likely role in the future.

David Skinner and Shane Canon, NERSC and High Throughput Computing, February 12, 2013,

You-Wei Cheah, Richard Canon, Beth Plale, Lavanya Ramakrishnan, "Milieu: Lightweight and Configurable Big Data Provenance for Science", IEEE Big Data Congress, 2013, 46-53,

Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane Canon, Anping Liu, Devarshi Ghoshal, Krishna Muriki, Nicholas J. Wright, "Magellan - A Testbed to Explore Cloud Computing for Science", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: 2013)

Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane Canon, Anping Liu, Devarshi Ghoshal, Krishna Muriki and Nicholas J. Wright, "CAMP", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: January 1, 2013)

2012

Elif Dede, Fadika, Hartog, Govindaraju, Ramakrishnan, Gunter, Shane Richard Canon, "MARISSA: MApReduce Implementation for Streaming Science Applications", eScience, October 8, 2012, 1-8,

Zacharia Fadika, Madhusudhan Govindaraju, Shane Richard Canon, Lavanya Ramakrishnan, "Evaluating Hadoop for Data-Intensive Scientific Operations", IEEE Cloud 2012, June 24, 2012,

Emerging sensor networks, more capable instruments, and ever increasing simulation scales are generating data at a rate that exceeds our ability to effectively manage, curate, analyze, and share it. Data-intensive computing is expected to revolutionize the next-generation software stack. Hadoop, an open source implementation of the MapReduce model, provides a way for large data volumes to be seamlessly processed through use of large commodity computers. The inherent parallelization, synchronization and fault-tolerance the model offers makes it ideal for highly-parallel data-intensive applications. MapReduce and Hadoop have traditionally been used for web data processing and have only recently been applied to scientific applications. There is limited understanding of the performance characteristics that data-intensive scientific applications can obtain from MapReduce and Hadoop. Thus, it is important to evaluate Hadoop specifically for data-intensive scientific operations -- filter, merge, and reorder -- to understand its various design considerations and performance trade-offs. In this paper, we evaluate Hadoop for these data operations in the context of High Performance Computing (HPC) environments to understand the impact of the file system, network, and programming modes on performance.
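
As a concrete illustration of the simplest of the three operations, a "filter" can be expressed in Hadoop Streaming as a map-only job with a mapper like the sketch below. The field layout, threshold, and paths are assumptions, not the paper's benchmark code.

```python
#!/usr/bin/env python
# A minimal Hadoop Streaming mapper for a "filter" operation: keep only
# records whose third tab-separated field exceeds a threshold. Run as a
# map-only job, e.g. (jar name and HDFS paths assumed):
#   hadoop jar hadoop-streaming.jar -mapper filter_mapper.py \
#       -reducer NONE -input /data/in -output /data/out
import sys

THRESHOLD = 100.0  # assumed cutoff for the filtered attribute

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    try:
        if float(fields[2]) > THRESHOLD:
            sys.stdout.write(line)  # identity output: record passes the filter
    except (IndexError, ValueError):
        continue                    # skip malformed records
```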

Jay Srinivasan, Richard Shane Canon, Lavanya Ramakrishnan, "My Cray can do that? Supporting Diverse Workloads on the Cray XE-6", CUG 2012, May 2012,

The Cray XE architecture has been optimized to support tightly coupled MPI applications, but there is an increasing need to run more diverse workloads in the scientific and technical computing domains. These needs are being driven by trends such as the increasing need to process “Big Data”. In the scientific arena, this is exemplified by the need to analyze data from instruments ranging from sequencers, telescopes, and X-ray light sources. These workloads are typically throughput oriented and often involve complex task dependencies. Can platforms like the Cray XE line play a role here? In this paper, we will describe tools we have developed to support high-throughput workloads and data intensive applications on NERSC’s Hopper system. These tools include a custom task farmer framework, tools to create virtual private clusters on the Cray, and using Cray’s Cluster Compatibility Mode (CCM) to support more diverse workloads. In addition, we will describe our experience with running Hadoop, a popular open-source implementation of MapReduce, on Cray systems. We will present our experiences with this work, including successes and challenges. Finally, we will discuss future directions and how the Cray platforms could be further enhanced to support this class of workloads.
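
The task-farmer pattern this abstract refers to can be sketched in a few lines: a queue of independent tasks drained by a pool of workers. The sketch below uses Python's multiprocessing and placeholder shell commands; NERSC's actual task farmer framework is a separate, more capable tool.

```python
# A minimal sketch of a "task farmer": independent tasks drained by a worker
# pool, the pattern behind throughput-oriented workloads on a system tuned
# for tightly coupled MPI jobs. Tasks here are placeholder shell commands.
from multiprocessing import Pool
import subprocess

def run_task(cmd):
    """Each task is an independent command line, e.g. one input file to process."""
    return cmd, subprocess.call(cmd, shell=True)

if __name__ == "__main__":
    tasks = [f"echo processing chunk {i}" for i in range(100)]
    with Pool(processes=8) as pool:  # one worker per allocated core (assumed)
        for cmd, rc in pool.imap_unordered(run_task, tasks):
            if rc != 0:
                print(f"task failed, requeue candidate: {cmd}")
```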

Richard Shane Canon, Magellan Project: Clouds for Science?, Coalition for Academic Scientific Computation, February 29, 2012,

This presentation gives a brief overview of the Magellan Project and some of its findings.

2011

Katherine Yelick, Susan Coghlan, Brent Draney, Richard Shane Canon, Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Anping Liu, Scott Campbell, Piotr T. Zbiegiel, Tina Declerck, Paul Rich, "The Magellan Report on Cloud Computing for Science", U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), December 2011,

Devarshi Ghoshal, Richard Shane Canon, Lavanya Ramakrishnan, "Understanding I/O Performance of Virtualized Cloud Environments", The Second International Workshop on Data Intensive Computing in the Clouds (DataCloud-SC11), 2011,

We compare the I/O performance using IOR benchmarks on two cloud computing platforms - Amazon and the Magellan cloud testbed.

Richard Shane Canon, Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities, Metagenomics Informatics Challenges Workshop, October 12, 2011,

Lavanya Ramakrishnan & Shane Canon, NERSC, Hadoop and Pig Overview, October 2011,

The MapReduce programming model and its open source implementation Hadoop are gaining traction in the scientific community for addressing the needs of data-focused scientific applications. The requirements of these scientific applications are significantly different from those of the web 2.0 applications that have traditionally used Hadoop. The tutorial will provide an overview of Hadoop technologies, discuss some use cases of Hadoop for science, and present the programming challenges of using Hadoop for legacy applications. Participants will access the Hadoop system at NERSC for the hands-on component of the tutorial.

Lavanya Ramakrishnan, Richard Shane Canon, Krishna Muriki, Iwona Sakrejda, and Nicholas J. Wright., "Evaluating Interconnect and Virtualization Performance for High Performance Computing", Proceedings of 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS11), 2011,

In this paper we detail benchmarking results that characterize the virtualization overhead and its impact on performance. We also examine the performance of various interconnect technologies with a view to understanding the performance impacts of various choices. Our results show that virtualization can have a significant impact upon performance, with at least a 60% performance penalty. We also show that less capable interconnect technologies can have a significant impact upon performance of typical HPC applications. We also evaluate the performance of the Amazon Cluster compute instance and show that it performs approximately equivalently to a 10G Ethernet cluster at low core counts.

Lavanya Ramakrishnan, Piotr T. Zbiegel, Scott Campbell, Rick Bradshaw, Richard Shane Canon, Susan Coghlan, Iwona Sakrejda, Narayan Desai, Tina Declerck, Anping Liu, "Magellan: Experiences from a Science Cloud", Proceedings of the 2nd International Workshop on Scientific Cloud Computing, ACM ScienceCloud '11, Boulder, Colorado, and New York, NY, 2011, 49 - 58,

Shane Canon, Debunking Some Common Misconceptions of Science in the Cloud, ScienceCloud 2011, June 29, 2011,

This presentation addressed five common misconceptions of cloud computing including: clouds are simple to use and don’t require system administrators; my job will run immediately in the cloud; clouds are more efficient; clouds allow you to ride Moore’s Law without additional investment; commercial Clouds are much cheaper than operating your own system.

2010

Neal Master, Matthew Andrews, Jason Hick, Shane Canon, Nicholas J. Wright, "Performance Analysis of Commodity and Enterprise Class Flash Devices", Petascale Data Storage Workshop (PDSW), November 2010,

Keith R. Jackson, Ramakrishnan, Muriki, Canon, Cholia, Shalf, J. Wasserman, Nicholas J. Wright, "Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud", CloudCom, January 1, 2010, 159-168,

Lavanya Ramakrishnan, R. Jackson, Canon, Cholia, John Shalf, "Defining future platform requirements for e-Science clouds", SoCC, 2010, 101-106,

Kesheng Wu, Kamesh Madduri, Shane Canon, "Multi-Level Bitmap Indexes for Flash Memory Storage", IDEAS '10: Proceedings of the Fourteenth International Database Engineering and Applications Symposium, Montreal, QC, Canada, 2010,

2009

Richard Shane Canon, Cosmic Computing: Supporting the Science of the Planck Space Based Telescope, LISA 2009, November 5, 2009,

The scientific community is creating data at an ever-increasing rate. Large-scale experimental devices such as high-energy collider facilities and advanced telescopes generate petabytes of data a year. These immense data streams stretch the limits of the storage systems and of their administrators. The Planck project, a space-based telescope designed to study the Cosmic Microwave Background, is a case in point. Launched in May 2009, the Planck satellite will generate a data stream requiring a network of storage and computational resources to store and analyze the data. This talk will present an overview of the Planck project, including the motivation and mission, the collaboration, and the terrestrial resources supporting it. It will describe the data flow and network of computer resources in detail and will discuss how the various systems are managed. Finally, it will highlight some of the present and future challenges in managing a large-scale data system.

Jonathan Carter

2011

Katie Antypas, Tina Butler, Jonathan Carter, "The Hopper System: How the Largest XE6 in the World went from Requirements to Reality", Cray User Group Proceedings, May 31, 2011,

Samuel Williams, Oliker, Carter, John Shalf, "Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning", SC, January 1, 2011, 55,

2010

S. Ethier, M. Adams, J. Carter, L. Oliker, "Petascale Parallelization of the Gyrokinetic Toroidal Code", VECPAR: High Performance Computing for Computational Science, June 2010,

K. Datta, S. Williams, V. Volkov, J. Carter, L. Oliker, J. Shalf, K. Yelick, "Auto-Tuning Stencil Computations on Diverse Multicore Architectures", Scientific Computing with Multicore and Accelerators, edited by Jakub Kurzak, David A. Bader, Jack Dongarra, 2010,

2008

Yun (Helen) He, William T.C. Kramer, Jonathan Carter, and Nicholas Cardo, Franklin: User Experiences, CUG User Group Meeting 2008, May 5, 2008,

Yun (Helen) He, William T.C. Kramer, Jonathan Carter, and Nicholas Cardo, "Franklin: User Experiences", Cray User Group Meeting 2008, May 4, 2008, LBNL 2014E,

The newest workhorse of the National Energy Research Scientific Computing Center is a Cray XT4 with 9,736 dual-core nodes. This paper summarizes Franklin user experiences from the friendly early user period through the production period. Selected successful user stories along with top issues affecting user experiences are presented.

Kaushik Datta, Murphy, Volkov, Williams, Carter, Oliker, A. Patterson, Shalf, Katherine A. Yelick, "Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures", SC, January 1, 2008, 4,

Leonid Oliker, Canning, Carter, Shalf, Stéphane Ethier, "Scientific Application Performance On Leading Scalar and Vector Supercomputing Platforms", IJHPCA, January 1, 2008, 22:5-20,

2007

Jonathan Carter, Yun (Helen) He, John Shalf, Hongzhang Shan, Erich Strohmaier, and Harvey Wasserman, "The Performance Effect of Multi-Core on Scientific Applications", Cray User Group 2007, May 2007, LBNL 62662,

The historical trend of increasing single CPU performance has given way to a roadmap of increasing core count. The challenge of effectively utilizing these multi-core chips is just starting to be explored by vendors and application developers alike. In this study, we present performance measurements of several complete scientific applications on single- and dual-core Cray XT3 and XT4 systems with a view to characterizing the effects of switching to multi-core chips. We consider effects within a node by using applications run at low concurrencies, and also effects on node-interconnect interaction using higher concurrency results. Finally, we construct a simple performance model based on the principal on-chip shared resource, memory bandwidth, and use this to predict the performance of the forthcoming quad-core system.
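
The flavor of the performance model this abstract describes can be captured in a few lines: runtime is bounded below both by compute time and by the time to stream the working set through memory bandwidth shared among the cores of a chip. The sketch below uses placeholder machine numbers, not the paper's measurements.

```python
# A minimal sketch of a bandwidth-contention performance model: per-core
# runtime is the max of compute time and the time to move the code's data
# through a chip's shared memory bandwidth. All numbers are placeholders.
def predicted_time(flops, bytes_moved, flop_rate, chip_bw, active_cores):
    """Per-core runtime when `active_cores` share one chip's memory bandwidth."""
    t_compute = flops / flop_rate                      # seconds of pure compute
    t_memory = bytes_moved / (chip_bw / active_cores)  # contended bandwidth share
    return max(t_compute, t_memory)

# example: a bandwidth-bound kernel slows down as cores per chip double
for cores in (1, 2, 4):
    t = predicted_time(flops=1e9, bytes_moved=4e9,
                       flop_rate=2.6e9, chip_bw=10e9, active_cores=cores)
    print(f"{cores} core(s) sharing the bus -> predicted {t:.2f}s")
```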

Jonathan Carter, Helen He, John Shalf, Erich Strohmaier, Hongzhang Shan, and Harvey Wasserman, The Performance Effect of Multi-Core on Scientific Applications, Cray User Group 2007, May 2007,

J. Levesque, J. Larkin, M. Foster, J. Glenski, G. Geissler, S. Whalen, B. Waldecker, J. Carter, D. Skinner, H. He, H. Wasserman, J. Shalf, H. Shan, "Understanding and mitigating multicore performance issues on the AMD opteron architecture", March 1, 2007, LBNL 62500,

Over the past 15 years, microprocessor performance has doubled approximately every 18 months through increased clock rates and processing efficiency. In the past few years, clock frequency growth has stalled, and microprocessor manufacturers such as AMD have moved towards doubling the number of cores every 18 months in order to maintain historical growth rates in chip performance. This document investigates the ramifications of multicore processor technology on the new Cray XT4 systems based on AMD processor technology. We begin by walking through the AMD single-core, dual-core, and upcoming quad-core processor architectures. This is followed by a discussion of methods for collecting performance counter data to understand code performance on the Cray XT3 and XT4 systems. We then use the performance counter data to analyze the impact of multicore processors on the performance of microbenchmarks such as STREAM, application kernels such as the NAS Parallel Benchmarks, and full application codes that comprise the NERSC-5 SSP benchmark suite. We explore compiler options and software optimization techniques that can mitigate the memory bandwidth contention that can reduce computing efficiency on multicore processors. The last section provides a case study of applying the dual-core optimizations to the NAS Parallel Benchmarks to dramatically improve their performance.

Leonid Oliker, Canning, Carter, Iancu, Lijewski, Kamil, Shalf, Shan, Strohmaier, Ethier, Tom Goodale, "Scientific Application Performance on Candidate PetaScale Platforms", IPDPS, January 1, 2007, 1-12,

2006

Jonathan Carter, Tony Drummond, Parry Husbands, Paul Hargrove, Bill Kramer, Osni Marques, Esmond Ng, Lenny Oliker, John Shalf, David Skinner, Kathy Yelick, "Software Roadmap to Plug and Play Petaflop/s", Lawrence Berkeley National Laboratory Technical Report, #59999, July 31, 2006,

Jonathan Carter, Oliker, John Shalf, "Performance Evaluation of Scientific Applications on Modern Parallel Vector Systems", VECPAR, January 1, 2006, 490-503,

L. Oliker, S. Kamil, A. Canning, J. Carter, C. Iancu, J. Shalf, H. Shan, D. Skinner, E. Strohmaier, T. Goodale, "Application Scalability and Communication Signatures on Leading Supercomputing Platforms", January 1, 2006,

2005

Horst D. Simon, William T. C. Kramer, David H. Bailey, Michael J. Banda, E. Wes Bethel, Jonathan T. Carter, James M. Craw, William J. Fortney, John A. Hules, Nancy L. Meyer, Juan C. Meza, Esmond G. Ng, Lynn E. Rippe, William C. Saphir, Francesca Verdier, Howard A. Walter, Katherine A. Yelick, "Science-Driven Computing: NERSC’s Plan for 2006–2010", LBNL Technical Report 57582, 2005,

Leonid Oliker, Canning, Carter, Shalf, Skinner, Ethier, Biswas, Jahed Djomehri, Rob F. Van der Wijngaart, "Performance evaluation of the SX-6 vector architecture for scientific computations", Concurrency - Practice and Experience, January 1, 2005, 17:69-93,

2004

Leonid Oliker, Canning, Carter, Shalf, Stéphane Ethier, "Scientific Computations on Modern Parallel Vector Systems", SC, January 1, 2004, 10,

2003

Leonid Oliker, Canning, Carter, Shalf, Skinner, Ethier, Biswas, Jahed Djomehri, Rob F. Van der Wijngaart, "Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations", SC, January 1, 2003, 38,

Ravi Cheema

2015

J. Hick, R. Lee, R. Cheema, K. Fagnan, GPFS for Life Sciences at NERSC, GPFS User Group Meeting, May 20, 2015,

A report showing both high- and low-level changes made to our life sciences workloads to support them on GPFS file systems.

Tiffany Connors

2016

Tiffany A. Connors, Ritu Arora, "A Scalable Approach for Topic Modeling with R", Supercomputing Conference (SC) 2016, November 15, 2016,

Ritu Arora, Trung Nguyen Ba, Tiffany A. Connors, "Pecos: A scalable solution for analyzing and managing qualitative data", DataCloud '16 Proceedings of the 7th International Workshop on Data-Intensive Computing in the Cloud, November 13, 2016,

2015

Tiffany Connors, Apan Qasem, "Power-performance analysis of metaheuristic search algorithms on the GPU", 2015 Sixth International Green and Sustainable Computing Conference (IGSC), December 14, 2015,

"Modeling the Impact of Thread Configuration on Power and Performance of GPUs", Supercomputing Conference (SC) 2015, November 16, 2015,

Brandon Cook

2016

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24

Brandon Cook, Pieter Maris, Meiyue Shao, Nathan Wichmann, Marcus Wagner, John O'Neill, Thanh Phung, Gaurav Bansal, "High Performance Optimizations for Nuclear Physics Code MFDn on KNL", ISC Workshops, October 6, 2016, doi: 10.1007/978-3-319-46079-6_26

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, OpenMP Parallelization and Optimization of Graph-based Machine Learning Algorithms, IWOMP 2016, October 6, 2016,

Alice Koniges, Brandon Cook, Jack Deslippe, Thorston Kurth, Hongzhang Shan, MPI usage at NERSC: Present and Future, EuroMPI 2016, September 26, 2016,

Alice Koniges, Brandon Cook, Jack Deslippe, Thorston Kurth, Hongzhang Shan, "MPI usage at NERSC: Present and Future", EuroMPI 2016, Edinburgh, Scotland, UK, September 26, 2016,

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, "OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms", Lecture Notes in Computer Science, Springer, 2016, 9903:17-31, doi: 10.1007/978-3-319-45550-1_2

2015

Uma Tumuluri, Meijun Li, Brandon Cook, Bobby Sumpter, Sheng Dai, Zili Wu, "Surface Structure Dependence of SO2 Interaction with Ceria Nanocrystals with Well-defined Surface Facets", The Journal of Physical Chemistry C, 2015,

Jia-An Yan, Mack A Dela Cruz, Brandon Cook, Kalman Varga, "Structural, electronic and vibrational properties of few-layer 2H- and 1T-TaSe2", Scientific Reports, 2015,

Brandon Cook, Arthur Russakoff, Kálmán Varga, "Coverage dependent work function of graphene on a Cu (111) substrate with intercalated alkali metals", Applied Physics Letters, 2015,

2012

Brandon Cook, William R French, Kálmán Varga, "Electron transport properties of carbon nanotube–graphene contacts", Applied Physics Letters, 2012,

2011

Christopher R Iacovella, William R French, Brandon Cook, Paul RC Kent, Peter T Cummings, "Role of polytetrahedral structures in the elongation and rupture of gold nanowires", ACS Nano, 2011,

Brandon Cook, Peter Dignard, Kálmán Varga, "Calculation of electron transport in multiterminal systems using complex absorbing potentials", Physical Review B, May 16, 2011,

2006

Brandon Cook, John Eric Goff, "Parameter space for successful soccer kicks", European Journal of Physics, 2006,

Matthew Cordery

2014

Yu Jung Lo, Samuel Williams, Brian Van Straalen, Terry J. Ligocki, Matthew J. Cordery, Nicholas J. Wright, Mary W. Hall, Leonid Oliker, "Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis", SC'14, November 16, 2014,

M. J. Cordery, B. Austin, H. J. Wasserman, C. S. Daley, N. J. Wright, S. D. Hammond, D. Doerfler, "Analysis of Cray XC30 Performance using Trinity-NERSC-8 benchmarks and comparison with Cray XE6 and IBM BG/Q", High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation (PMBS 2013). Lecture Notes in Computer Science, Volume 8551, October 1, 2014,

2013

Brian Austin, Matthew Cordery, Harvey Wasserman, Nicholas J. Wright, "Performance Measurements of the NERSC Cray Cascade System", 2013 Cray User Group Meeting, May 9, 2013,

1997

Cordery, M.J., G. F. Davies, and I.H. Campbell, "Genesis of flood basalts from eclogite-bearing mantle plumes", Journal of Geophysical Research, 1997, doi: 10.1029/97JB00648

1993

Cordery, M.J., J. Phipps Morgan, "Convection and melting at mid-ocean ridges", Journal of Geophysical Research, 1993,

1992

Cordery, M.J. and J. Phipps Morgan, "Melting and mantle flow at a mid-ocean spreading center", Earth and Planetary Science Letters, 1992, doi: 10.1016/0012-821X(92)90199-6

1989

Von Herzen, R.P., M.J. Cordery, R.S. Detrick, and C. Fang, "Heat flow and the thermal origin of hotspot swells", Journal of Geophysical Research, 1989, doi: 10.1029/JB094iB10p13783

Joaquin Correa

2016

Dilworth Y. Parkinson, Keith Beattie, Xian Chen, Joaquin Correa, Eli Dart, Benedikt J. Daurer, Jack R. Deslippe, Alexander Hexemer, Harinarayan Krishnan, Alastair A. MacDowell, Filipe R. N. C. Maia, Stefano Marchesini, Howard A. Padmore, Simon J. Patton, Talita Perciano, James A. Sethian, David Shapiro, Rune Stromsness, Nobumichi Tamura, Brian L. Tierney, Craig E. Tull, Daniela Ushizima, "Real-time data-intensive computing.", AIP Conference Proceedings, July 2016, 1741,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

SV Venkatakrishnan, K Aditya Mohan, Keith Beattie, Joaquin Correa, Eli Dart, Jack R Deslippe, Alexander Hexemer, Harinarayan Krishnan, Alastair A MacDowell, Stefano Marchesini, Simon J Patton, Talita Perciano, James A Sethian, Rune Stromsness, Brian L Tierney, Craig E Tull, Daniela Ushizima, Dilworth Y Parkinson, "Making Advanced Scientific Algorithms and Big Scientific Data Management More Accessible", Electronic Imaging, February 14, 2016, 2016:1,

2014

Wen-Ting Tsai, Ahmed Hassan, Purbasha Sarkar, Joaquin Correa, Zoltan Metlagel, Danielle M. Jorgens, Manfred Auer, "From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data", August 13, 2014, doi: 10.3791/51673

The bottleneck for cellular 3D electron microscopy is feature extraction (segmentation) in highly complex 3D density maps. We have developed a set of criteria, which provides guidance regarding which segmentation approach (manual, semi-automated, or automated) is best suited for different data types, thus providing a starting point for effective segmentation.

Correa, J., Skinner D., Auer M., "Integrated tools for next generation bioimaging", Conference in Medical Image Understanding and Analysis (MIUA) | Royal Holloway, Egham, July 9, 2014,

Correa, J., OME Workshop (ybd), OME Users Meeting | Institut Pasteur, Paris, June 5, 2014,

Correa, J., "Enabling Science from Big Image Data using Cross Cutting Infrastructure", International Conference on Big Data Science and Computing | Stanford University, May 27, 2014,

Fox W., Correa J., Cholia S., Skinner D., Ophus C., "NCEM Hub, A Science Gateway for Electron Microscopy in Materials Science", LBNL Tech Report on NCEMhub, May 1, 2014,

Electron microscopy (EM) instrumentation is making a detector-driven transition to Big Data. High-capability cameras bring new resolving power but also an exponentially increasing demand for bandwidth and data analysis. In practical terms this means that users of advanced microscopes find it increasingly challenging to take data with them and instead need an integrated data processing pipeline. In 2013, NERSC and NCEM staff embarked on a pilot to prototype data services that provide such a pipeline. This tech report details the NCEM Hub pilot as it concluded in May 2014.

Joaquin Correa, Scalable web-based computational bioimaging solutions at NERSC, CryoEM meeting - SLAC National Accelerator Laboratory | Stanford University, April 17, 2014,

Joaquin Correa, David Skinner, "BIG DATA BIOIMAGING: Advances in Analysis, Integration, and Dissemination", Keystone Symposia on Molecular and Cellular Biology, March 24, 2014,

A fundamental problem for the biological community is adapting computational solutions known broadly in data-centric science to the specific challenges of data scaling in bioimaging. In this work we target software solutions fit for these tasks, leveraging successes in large-scale data-centric science outside of bioimaging.

2013

Joaquin Correa, Integrated Tools for NGBI--Lessons Learned and Successful Cases, LBNL Integrated Bioimaging Initiative, September 4, 2013,

NextGen Bioimaging (NGBI) requires a reliable and flexible solution for multi-modal, high-throughput, and high-performance image processing and analysis. To address this challenge, we have developed an OMERO-based modular and flexible platform that integrates a suite of general-purpose processing software, a set of custom-tailored algorithms, specific bio-imaging applications, and NERSC's high performance computing resources and science gateways.
This platform, currently under development, provides a shared, scalable, one-stop web service for producers and consumers of models built on imaging data, refining pixel data into actionable knowledge.

2012

Jonathan P. Remis, Bernhard Knierim, Amita Gorur, Ambrose Leung, Danielle M. Jorgens, Mitalee Desai, Monica Lin, David A. Ball, Roseann Csencsits, Jan Liphardt, Bill Costerton, Ken Downing, Phil Hugenholtz, Manfred Auer, "Microbial Communities: The Social Side of Bacteria from Macromolecules to Community Organization", PCAP - Protein Complex Analysis Project, 2012,

Purbasha Sarkar, Patanjali Varanasi, Lan Sun, Elene Bosneaga, Lina Prak, Bernhard Knierim, Marcin Zemla, Michael Joo, David Larson, Roseann Csencsits, Bahram Parvin, Kenneth H. Downing, Manfred Auer, "BioFuels 2.0: Plant cell walls - Towards rational cell wall engineering", EBI (Energy BioSciences Institute) and JBEI (Joint BioEnergy Institute), 2012,

Michael Joo, Roseann Csencsits, Adam Barnebey, Andrew Tauscher, Ahmed Hassan, Danielle Jorgens, Marcin Zemla, Romy Chakraborty, Gareth Butland, Jennifer He, David Stahl, Nicholas Elliot, Matthew Fields, Manfred Auer, Steven M. Yannone, "Microbial Metal Reduction: Metal Reduction & Structures in Microbes", ENIGMA: Ecosystems and Networks Integrated with Genes and Molecular Assemblies, 2012,

Jonathan P. Remis, Bernhard Knierim, Monica Lin, Amita Gorur, Mitalee Desai, Manfred Auer, W.J. Costerton, J. Berleman, Trent Northen, D. Wei, B. Van Leer, "Microbial Communities: A Tale of Social Bacteria and Nature's Solution to Biomass Recalcitrance", 2012,

James M. Craw

2009

James M. Craw, Nicholas P. Cardo, Yun (Helen) He, and Janet M. Lebens, "Post-Mortem of the NERSC Franklin XT Upgrade to CLE 2.1", Cray User Group Meeting 2009, Atlanta, GA, May 2009,

This paper will discuss lessons learned from the events leading up to the production deployment of CLE 2.1 and the post-install issues experienced in upgrading NERSC's XT4 system, Franklin.

James M. Craw, Nicholas P. Cardo, Yun (Helen) He, and Janet M. Lebens, Post-Mortem of the NERSC Franklin XT Upgrade to CLE 2.1, Cray User Group Meeting, May 2009,

2001

William T. C. Kramer, Wes Bethel, James Craw, Brent Draney, William Fortney, Brent Gorda, William Harris, Nancy Meyer, Esmond Ng, Francesca Verdier, Howard Walter, Tammy Welcome, "NERSC Strategic Implementation Plan 2002-2006", LBNL Technical Report 5465 Vol. 2, 2001,

Christopher S. Daley

2016

C.S. Daley, D. Ghoshal, G.K. Lockwood, S. Dosanjh, L. Ramakrishnan, N.J. Wright, "Performance Characterization of Scientific Workflows for the Optimal Use of Burst Buffers", Workflows in Support of Large-Scale Science (WORKS-2016), CEUR-WS.org, 2016, 1800:69-73,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

2015

C.S. Daley, L. Ramakrishnan, S. Dosanjh, N.J. Wright, "Analyses of Scientific Workflows for Effective Use of Future Architectures", The 6th International Workshop on Big Data Analytics: Challenges, and Opportunities (BDAC-15), 2015,

J. Park, M. Smelyanskiy, K. Vaidyanathan, A. Heinecke, D Kalamkar, M Patwary, V. Pirogov, P. Dubey, X. Liu, C. Rosales, C. Mazauric, C. Daley, "Optimizations in a high-performance conjugate gradient benchmark for IA-based multi- and many-core processors", International Journal of High Performance Computing Applications, 2015, doi: 10.1177/1094342015593157

N.J. Wright, S. S. Dosanjh, A. K. Andrews, K. Antypas, B. Draney, R.S Canon, S. Cholia, C.S. Daley, K. M. Fagnan, R.A. Gerber, L. Gerhardt, L. Pezzaglia, Prabhat, K.H. Schafer, J. Srinivasan, "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82., ( June 2015) doi: 10.3233/978-1-61499-583-8-82

Extreme data science is becoming increasingly important at the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC). Many petabytes of data are transferred from experimental facilities to NERSC each year. Applications of importance include high-energy physics, materials science, genomics, and climate modeling, with an increasing emphasis on large-scale simulations and data analysis. In response to the emerging data-intensive workloads of its users, NERSC made a number of critical design choices to enhance the usability of its pre-exascale supercomputer, Cori, which is scheduled to be delivered in 2016. These data enhancements include a data partition, a layer of NVRAM for accelerating I/O, user defined images and a customizable gateway for accelerating connections to remote experimental facilities.

Jack Deslippe, Brian Austin, Chris Daley, Woo-Sun Yang, "Lessons learned from optimizing science kernels for Intel's "Knights-Corner" architecture", CISE, April 1, 2015,

2014

M. J. Cordery, B. Austin, H. J. Wasserman, C. S. Daley, N. J. Wright, S. D. Hammond, D. Doerfler, "Analysis of Cray XC30 Performance using Trinity-NERSC-8 benchmarks and comparison with Cray XE6 and IBM BG/Q", High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation (PMBS 2013). Lecture Notes in Computer Science, Volume 8551, October 1, 2014,

A Dubey, K Antypas, AC Calder, C Daley, B Fryxell, JB Gallagher, DQ Lamb, D Lee, K Olson, LB Reid, P Rich, PM Ricker, KM Riley, R Rosner, A Siegel, NT Taylor, K Weide, FX Timmes, N Vladimirova, J ZuHone, "Evolution of FLASH, a multi-physics scientific simulation code for high-performance computing", The International Journal of High Performance Computing Applications, May 2014, 28:225--237, doi: 10.1177/1094342013505656

2013

P. Mohapatra, A Dubey, C. Daley, M. Vanella, and E. Balaras, "Parallel Algorithms for Using Lagrangian Markers in Immersed Boundary Method with Adaptive Mesh Refinement in FLASH", Computer Architecture and High Performance Computing (SBAC-PAD), October 2013, doi: 10.1109/SBAC-PAD.2013.27

A. Dubey, K. Weide, D. Lee, J. Bachan, C. Daley, S. Olofin, N. Taylor, P. M. Rich, and L. B. Reid, "Ongoing verification of a multiphysics community code: FLASH", Software: Practice and Experience, September 2013, doi: 10.1002/spe.2220

A. Dubey, A. Calder, C. Daley, C. Graziani, R. Fisher, G.C. Jordan, D.Q. Lamb, L.B. Reid, D.M. Townsley, K. Weide, "Pragmatic Optimizations for Best Scientific Utilization of Large Supercomputers", International Journal of High Performance Computing Applications, July 2013, doi: 10.1177/1094342012464404

C. Daley, Preparing for Mira: experience with FLASH multiphysics simulations, Mira Community Conference, 2013,

2012

A. Dubey, C. Daley, J. ZuHone, P. M. Ricker, K. Weide, C. Graziani, "Imposing a Lagrangian Particle Framework on an Eulerian Hydrodynamics Infrastructure in FLASH", The Astrophysical Journal Supplement Series, 2012, 201:27, doi: 10.1088/0067-0049/201/2/27

C. Daley, J. Bachan, S. Couch, A. Dubey, M. Fatenejad, B. Gallagher, D. Lee, K. Weide, "Adding shared memory parallelism to FLASH for many-core architectures", TACC-Intel Highly Parallel Computing Symposium, 2012,

C. Daley, M. Vanella, K. Weide, A. Dubey, E. Balaras, "Optimization of Multigrid Based Elliptic Solver for Large Scale Simulations in the FLASH Code", Concurrency and Computation: Practice and Experience, 2012, 24:2346--2361, doi: 10.1002/cpe.2821

R. Latham, C. Daley, W.K. Liao, K. Gao, R. Ross, A. Dubey, A. Choudhary, "A case study for scientific I/O: improving the FLASH astrophysics code", Computational Science and Discovery, 2012, 5:015001, doi: 10.1088/1749-4699/5/1/015002

2011

D. Lee, G. Xia, C. Daley, A. Dubey, C. Graziani, D.Q. Lamb, K. Weide, "Progress in development of HEDP capabilities in FLASH's Unsplit Staggered Mesh MHD solver", Astrophysics and Space Science, 2011, 336:157-162, doi: 10.1007/s10509-011-0654-5

V. Vishwanath, M. Hereld, M. E. Papka, R. Hudson, G. Jordan, C. Daley, "In Situ Data Analytics and I/O Acceleration of FLASH simulations on leadership-class systems with GLEAN", SciDAC, Journal of Physics: Conference Series, 2011,

2010

A. Dubey, C. Daley, K. Weide, "Challenges of Computing with FLASH on Largest HPC Platforms", AIP Conference Proceedings, 2010, 1281:1773, doi: 10.1063/1.3498219

2009

B. R. de Supinski, S. Alam, D. H. Bailey, L. Carrington, C. Daley, A. Dubey, T. Gamblin, D. Gunter, P. D. Hovland, H. Jagode, K. Karavanic, G. Marin, J. Mellor-Crummey, S. Moore, B. Norris, L. Oliker, C. Olschanowsky, P. C. Roth, M. Schulz, S. Shende, A. Snavely, W. Spear, M. Tikir, J. Vetter, P. Worley, N. Wright, "Modeling the Office of Science ten year facilities plan: The PERI Architecture Tiger Team", Journal of Physics: Conference Series, 2009, 180:012039,

Tina M. Declerck

2016

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

2014

Zhengji Zhao, Doug Petesch, David Knaak, and Tina Declerck, "I/O Performance on Cray XC30", Cray User Group Meeting, May 4, 2014,

2013

Richard A. Gerber, Tina Declerck, Zhengji Zhao, Edison Update, February 12, 2013,

Overview and update on the installation and configuration of Edison, NERSC's new Cray XC30 supercomputer.

2011

Lavanya Ramakrishnan, Piotr T. Zbiegel, Scott Campbell, Rick Bradshaw, Richard Shane Canon, Susan Coghlan, Iwona Sakrejda, Narayan Desai, Tina Declerck, Anping Liu, "Magellan: Experiences from a Science Cloud", Proceedings of the 2nd International Workshop on Scientific Cloud Computing, ACM ScienceCloud '11, Boulder, Colorado, and New York, NY, 2011, 49 - 58,

Jack Deslippe

2016

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

R Gerber, J Deslippe, D Doerfler, Many Cores for the Masses: Lessons Learned from Application Readiness Efforts at NERSC for the Knights Landing based Cori System, Intel HPC Developers Conference, November 12, 2016,

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, OpenMP Parallelization and Optimization of Graph-based Machine Learning Algorithms, IWOMP 2016, October 6, 2016,

Tareq Malas, Thorsten Kurth, Jack Deslippe, "Optimization of the sparse matrix-vector products of an IDR Krylov iterative solver in EMGeo for the Intel KNL manycore processor", Springer Lecture Notes in Computer Science, October 6, 2016,

Jack Deslippe, Felipe H. da Jornada, Derek Vigil-Fowler, Taylor Barnes, Nathan Wichmann, Karthik Raman, Ruchira Sasanka, Steven G. Louie, "Optimizing Excited-State Electronic-Structure Codes for Intel Knights Landing: A Case Study on the BerkeleyGW Software.", Springer Lecture Notes in Computer Science (ISC 2016), Springer International Publishing, October 6, 2016, 402,

Alice Koniges, Brandon Cook, Jack Deslippe, Thorsten Kurth, Hongzhang Shan, MPI usage at NERSC: Present and Future, EuroMPI 2016, September 26, 2016,

Alice Koniges, Brandon Cook, Jack Deslippe, Thorsten Kurth, Hongzhang Shan, "MPI usage at NERSC: Present and Future", EuroMPI 2016, Edinburgh, Scotland, UK, September 26, 2016,

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, "OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms", Lecture Notes in Computer Science, Springer, 2016, 9903:17-31, doi: 10.1007/978-3-319-45550-1_2

Meiyue Shao, Lin Lin, Chao Yang, Fang Liu, Felipe H. da Jornada, Jack Deslippe, Steven G. Louie, "Low rank approximation in G0W0 calculations.", Science China Mathematics, August 1, 2016,

Dilworth Y. Parkinson, Keith Beattie, Xian Chen, Joaquin Correa, Eli Dart, Benedikt J. Daurer, Jack R. Deslippe, Alexander Hexemer, Harinarayan Krishnan, Alastair A. MacDowell, Filipe R. N. C. Maia, Stefano Marchesini, Howard A. Padmore, Simon J. Patton, Talita Perciano, James A. Sethian, David Shapiro, Rune Stromsness, Nobumichi Tamura, Brian L. Tierney, Craig E. Tull, Daniela Ushizima, "Real-time data-intensive computing.", AIP Conference Proceedings, July 2016, 1741,

SV Venkatakrishnan, K Aditya Mohan, Keith Beattie, Joaquin Correa, Eli Dart, Jack R Deslippe, Alexander Hexemer, Harinarayan Krishnan, Alastair A MacDowell, Stefano Marchesini, Simon J Patton, Talita Perciano, James A Sethian, Rune Stromsness, Brian L Tierney, Craig E Tull, Daniela Ushizima, Dilworth Y Parkinson, "Making Advanced Scientific Algorithms and Big Scientific Data Management More Accessible", Electronic Imaging, February 14, 2016, 2016 Is.:1,

Meiyue Shao, Felipe H. da Jornada, Chao Yang, Jack Deslippe, Steven G Louie, "Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem", Linear Algebra and its Applications, January 1, 2016, 488:148,

2015

Michiel J van Setten, Fabio Caruso, Sahar Sharifzadeh, Xinguo Ren, Matthias Scheffler, Fang Liu, Johannes Lischner, Lin Lin, Jack R Deslippe, Steven G Louie, Chao Yang, Florian Weigend, Jeffrey B Neaton, Ferdinand Evers, Patrick Rinke, "GW100: Benchmarking G0W0 for molecular systems", Journal of chemical theory and computation, October 22, 2015, 11:5665,

Fang Liu, Lin Lin, Derek Vigil-Fowler, Johannes Lischner, Alexander F. Kemper, Sahar Sharifzadeh, Felipe Homrich da Jornada, Jack Deslippe, Chao Yang, Jeffrey B. Neaton, Steven G. Louie, "Numerical integration for ab initio many-electron self energy calculations within the GW approximation.", Journal of Computational Physics, April 1, 2015,

Jack Deslippe, Brian Austin, Chris Daley, Woo-Sun Yang, "Lessons learned from optimizing science kernels for Intel's "Knights-Corner" architecture", CISE, April 1, 2015,

2014

Jack Deslippe, Abdelilah Essiari, Simon J. Patton, Taghrid Samak, Craig E. Tull, Alexander Hexemer, Dinesh Kumar, Dilworth Parkinson, Polite Stewart, "Workflow management for real-time analysis of lightsource experiments", Proceedings of the 9th Workshop on Workflows in Support of Large-Scale Science (SC14), November 16, 2014, 31-40,

Manish Jain, Jack Deslippe, Georgy Samsonidze, M.L. Cohen, J.R. Chelikowsky, S.G. Louie, "Improved quasiparticle wavefunctions and mean field for G0W0 calculations: Diagonalization of the static-COHSEX operator", Physical Review B, September 26, 2014,

Johannes Lischner, Sahar Sharifzadeh, Jack Deslippe, J. Neaton, and S. G. Louie, "Effects of Self-consistency and Plasmon-pole Models on GW Calculations for Closed-shell Molecules", Physical Review B, September 17, 2014,

Justin Blair, Richard S. Canon, Jack Deslippe, Abdelilah Essiari, Alexander Hexemer, Alastair A. MacDowell, Dilworth Y. Parkinson, Simon J. Patton, Lavanya Ramakrishnan, Nobumichi Tamura, Brian L. Tierney, Craig E. Tull, "High performance data management and analysis for tomography", Proc. SPIE 9212, Developments in X-Ray Tomography IX, September 12, 2014,

Kin Fai Mak, Felipe H. da Jornada, Keliang He, Jack Deslippe, Nicholas Petrone, James Hone, Jie Shan, Steven G. Louie, and Tony F. Heinz, "Tuning Many-Body Interactions in Graphene: The Effects of Doping on Excitons and Carrier Lifetimes", Physical Review Letters, May 20, 2014, 112:207401,

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, ( March 26, 2014)

Jack Deslippe, NERSC, Preparing Applications for Future NERSC Architectures, February 6, 2014,

2013

Jack Deslippe, Building Applications on Edison, October 10, 2013,

Jack Deslippe, Georgy Samsonidze, Manish Jain, Marvin L Cohen, Steven G Louie, "Coulomb-hole summations and energies for GW calculations with limited number of empty orbitals: a modified static remainder approach", Physical Review B (arXiv preprint arXiv:1208.0266), 2013,

Jack Deslippe, Zhengji Zhao, "Comparing Compiler and Library Performance in Material Science Applications on Edison", Paper. Proceedings of the Cray User Group 2013, 2013,

2012

Megan Bowling, Zhengji Zhao and Jack Deslippe, "The Effects of Compiler Optimizations on Materials Science and Chemistry Applications at NERSC", A paper presented at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

Megan Bowling, Zhengji Zhao and Jack Deslippe, The Effects of Compiler Optimizations on Materials Science and Chemistry Applications at NERSC, A talk at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

Jack Deslippe, Georgy Samsonidze, David Strubbe, Manish Jain, Marvin L. Cohen, Steven G. Louie, "BerkeleyGW: A Massively Parallel Computer Package for the Calculation of the Quasiparticle and Optical Properties of Materials", Comput. Phys. Comm., 2012,

Johannes Lischner, Jack Deslippe, Manish Jain, Steven G Louie, "First-Principles Calculations of Quasiparticle Excitations of Open-Shell Condensed Matter Systems", Physical Review Letters, 2012, 109:36406,

Kaihui Liu, Jack Deslippe, Fajun Xiao, Rodrigo B Capaz, Xiaoping Hong, Shaul Aloni, Alex Zettl, Wenlong Wang, Xuedong Bai, Steven G Louie, others, "An atlas of carbon nanotube optical transitions", Nature Nanotechnology, 2012, 7:325--329,

2011

David A Siegel, Cheol-Hwan Park, Choongyu Hwang, Jack Deslippe, Alexei V Fedorov, Steven G Louie, Alessandra Lanzara, "Many-body interactions in quasi-freestanding graphene", Proceedings of the National Academy of Sciences, 2011, 108:11365--113,

Georgy Samsonidze, Manish Jain, Jack Deslippe, Marvin L Cohen, Steven G Louie, "Simple Approximate Physical Orbitals for GW Quasiparticle Calculations", Physical Review Letters, 2011, 107:186404,

Jack Deslippe, Manish Jain, Georgy Samsonidze, Marvin Cohen, Steven Louie, The sc-COHSEX+ GW and the static off-diagonal GW approaches to quasiparticle wavefunctions and energies, Bulletin of the American Physical Society, 2011,

Jack Deslippe, S.G. Louie, "Ab initio Theories of the Structural, Electronic, and Optical Properties of Semiconductors: Bulk Systems to Nanostructures.", Comprehensive Semiconductor Science and Technology., (Elsevier: 2011) Pages: 42-76

2010

Jack Deslippe, Cheol-Hwan Park, Manish Jain, Steven Louie, First-principles Calculations of the Quasiparticle and Optical Excitations in Metallic Carbon Nanostructures, Bulletin of the American Physical Society, 2010,

2009

Li Yang, Jack Deslippe, Cheol-Hwan Park, Marvin L Cohen, Steven G Louie, "Excitonic effects on the optical response of graphene and bilayer graphene", Physical review letters, 2009, 103:186802,

Jack Deslippe, Mario Dipoppa, David Prendergast, Marcus VO Moutinho, Rodrigo B Capaz, Steven G Louie, "Electron-Hole Interaction in Carbon Nanotubes: Novel Screening and Exciton Excitation Spectra", Nano Lett, 2009, 9:1330--1334,

Jack Deslippe, David Prendergast, Steven Louie, Nonlinear Optical Properties of Carbon Nanotubes from First Principles, Bulletin of the American Physical Society, 2009,

2008

Jack Deslippe, Steven G Louie, "Excitons and many-electron effects in the optical response of carbon nanotubes and other one-dimensional nanostructures", Proceedings of SPIE, the International Society for Optical Engineering, 2008, 68920U--1,

Jack Deslippe, Mario Dipoppa, David Prendergast, Rodrigo Capaz, Steven Louie, Effective One-Dimensional Electron-Hole Interaction in Single-Walled Carbon Nanotubes, Bulletin of the American Physical Society, 2008,

2007

Jack Deslippe, Catalin D Spataru, David Prendergast, Steven G Louie, "Bound excitons in metallic single-walled carbon nanotubes", Nano letters, 2007, 7:1626--1630,

Feng Wang, David J Cho, Brian Kessler, Jack Deslippe, P James Schuck, Steven G Louie, Alex Zettl, Tony F Heinz, Y Ron Shen, "Observation of excitons in one-dimensional metallic single-walled carbon nanotubes", Physical review letters, 2007, 99:227401,

Jack Deslippe, David Prendergast, Steven Louie, Electron Self-Energy Corrections to Quasiparticle Excitations in Graphene and Large Diameter Single-Walled Carbon Nanotubes, Bulletin of the American Physical Society, 2007,

2006

Jack Deslippe, Catalin Spataru, Steven Louie, Bound excitons and optical absorption spectra of (10, 10) metallic single-walled carbon nanotubes, Bulletin of the American Physical Society, 2006,

2004

Jack Deslippe, R Tedstrom, Murray S Daw, D Chrzan, T Neeraj, M Mills, "Dynamic scaling in a simple one-dimensional model of dislocation activity", Philosophical Magazine, 2004, 84:2445--2454,

2003

Jianjun Dong, Jack Deslippe, Otto F Sankey, Emmanuel Soignard, Paul F McMillan, "Theoretical study of the ternary spinel nitride system Si3N4-Ge3N4", Physical Review B, 2003, 67:094104,

Jack Deslippe, Jianjun Dong, First principles calculations of thermodynamical properties of cage-like silicon clathrate materials, APS Meeting Abstracts, Pages: 25002, 2003,

Douglas Doerfler

2016

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

R Gerber, J Deslippe, D Doerfler, Many Cores for the Masses: Lessons Learned from Application Readiness Efforts at NERSC for the Knights Landing based Cori System, Intel HPC Developers Conference, November 12, 2016,

Carleton DeTar, Douglas Doerfler, Steven Gottlieb, Ashish Jha, Dhiraj Kalamkar, Ruizi Li, Doug Toussaint, "MILC staggered conjugate gradient performance on Intel KNL", 34th International Symposium on Lattice Field Theory (Lattice 2016), Southampton, UK, November 3, 2016,

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24

Robert Leland, Mahesh Rajan, Michael A. Heroux, Douglas W. Doerfler, "Performance, Efficiency, and Effectiveness of Supercomputers", Sandia National Laboratories, Sandia Report SAND2016-3730, September 2016,

R Li, C DeTar, D Doerfler, S Gottlieb, A Jha, D Kalamkar, D Toussaint, Porting the MIMD Lattice Computation (MILC) Code to the Intel Xeon Phi Knights Landing Processor, ISC High Performance 2016 International Workshops: Application Performance on Intel Xeon Phi – Being Prepared for KNL & Beyond, June 23, 2016,

2015

D Doerfler, Understanding Application Data Movement Characteristics using Intel’s VTune Amplifier and Software Development Emulator Tools, Intel Xeon Phi Users Group (IXPUG) 2015, Annual Meeting, September 30, 2015,

Mahesh Rajan, Doug Doerfler, Mike Tupek, Si Hammond, "An Investigation of Compiler Vectorization on Current and Next-generation Intel Processors using Benchmarks and Sandia’s SIERRA Applications", Cray User Group (CUG) 2015, April 2015,

Richard F. Barrett, Paul Crozier, Douglas W. Doerfler, Michael A. Heroux, Paul Lin, Heidi K. Thornquist, Timothy G. Trucano, Courtenay T. Vaughan, "Assessing the Role of Mini-Applications in Predicting Key Performance Characteristics of Scientific and Engineering Applications", Journal of Parallel and Distributed Computing, Volume 75, Pages 107-122, January 2015,

2014

Douglas Doerfler, First Experiences with 64-bit ARM Moonshot, HP-CAST 23, November 2014,

M. J. Cordery, B. Austin, H. J. Wasserman, C. S. Daley, N. J. Wright, S. D. Hammond, D. Doerfler, "Analysis of Cray XC30 Performance using Trinity-NERSC-8 benchmarks and comparison with Cray XE6 and IBM BG/Q", High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation (PMBS 2013). Lecture Notes in Computer Science, Volume 8551, October 1, 2014,

Douglas Doerfler, Dr. Tom Bradicich, An Evaluation of 64-bit ARM for use in High-Performance Modeling and Simulation Architecture, ARM TechCon 2014, October 2014,

Mahesh Rajan, Douglas W. Doerfler, Richard Frederick Barrett, Joel O. Stevenson, Anthony Michael Agelastos, Ryan Phillip Shaw, Harold Edward Meyer, "Experiences with Sandia National Laboratories HPC applications and MPI Performance", MVAPICH Users Group Meeting, August 2014,

Douglas Doerfler, The Role of Advanced Technology Systems in the ASC Platform Strategy, Salishan Conference on High-Speed Computing, April 2014,

Douglas Doerfler, Trinity: Next-Generation Supercomputer for the ASC Program, HPC User Forum, April 1, 2014,

S.S. Dosanjh, R.F. Barrett, D.W. Doerfler, S.D. Hammond, K.S. Hemmert, M.A. Heroux, P.T. Lin, K.T. Pedretti, A.F. Rodrigues, T.G. Trucano, J.P. Luitjens, "Exascale Design Space Exploration and Co-Design", Future Generation Computer Systems, Volume 30, Pages 46-58, January 2014,

2013

Michael A. Heroux, Richard Frederick Barrett, James Michael Willenbring, Daniel W Barnette, David Beckingsale, James F Belak, Mike Boulton, Paul Crozier, Douglas W. Doerfler, Harold C. Edwards, Wayne Gaudin, Timothy C Germann, Simon David Hammond, Andy Herdman, Stephen Jarvis, Paul Lin, Justin Luitjens, Andrew Mallinson, Simon McIntosh-Smith, Susan M Mniszewski, Jamaludin Mohd-Yusof, David F Richards, Christopher Sewell, Sriram Swaminarayan, Heidi K. Thornquist, Christian Robert Trott, Courtenay T. Vaughan, Alan B. Williams, R&D 100 Award, Mantevo Suite 1.0, R&D Magazine, August 2013,

2012

Richard F. Barrett, Simon D. Hammond, Courtenay T. Vaughan, Doug W. Doerfler, Michael A. Heroux, Justin P. Luitjens, Duncan Roweth, "Navigating An Evolutionary Fast Path to Exascale", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12), November 2012,

Mahesh Rajan, Douglas W. Doerfler, Paul T. Lin, Simon D. Hammond, Richard F. Barrett, Courtenay T. Vaughan, "Unprecedented Scalability and Performance of the New NNSA Tri-Lab Linux Capacity Cluster 2", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12), November 2012,

Richard Barrett, Paul Crozier, Doug Doerfler, Simon Hammond, Mike Heroux, Paul Lin, Tim Trucano, Courtenay Vaughan, Alan Williams, "Assessing the Predictive Capabilities of Mini-applications", The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), November 2012,

Mahesh Rajan, Courtenay T. Vaughan, Doug W. Doerfler, Richard F. Barrett, Kevin T. Pedretti, Karl S. Hemmert, "Application-driven Analysis of Two Generations of Capability Computing Platforms: Purple and Cielo", Concurrency and Computation: Practice and Experience, Volume 24, Issue 18, March 2012,

2011

Kevin Pedretti, Ron Brightwell, Doug Doerfler, K. Scott Hemmert, James H. Laros, III, "The Impact of Injection Bandwidth Performance on Application Scalability", EuroMPI 2011. Lecture Notes in Computer Science, Volume 6960, September 2011,

Douglas W. Doerfler, ASC Salutes, National Nuclear Security Administration Advanced Simulation & Computing Program Office, June 2011,

D. Doerfler, S. Dosanjh, J. Morrison, M. Vigil, "Production Petascale Computing", Cray Users Group Meeting, Fairbanks, Alaska, 2011,

Courtenay T. Vaughan, Mahesh Rajan, Douglas W. Doerfler, Richard F. Barrett, Kevin Pedretti, "Investigating the Impact of the Cielo Cray XE6 Architecture on Scientific Application Codes", IPDPS 2011 International Workshop on Large-Scale Parallel Processing (LSPP'11), May 2011,

Douglas Doerfler, Mahesh Rajan, Cindy Nuss, Cornell Wright, Tom Spelce, "Application-Driven Acceptance of Cielo, an XE6 Petascale Capability Platform", Cray User Group (CUG) 2011, May 2011,

Manuel Vigil, Douglas Doerfler, Sudip Dosanjh, John Morrison, 2010 Defense Programs Award of Excellence for Significant Contributions to the Stockpile Stewardship Program, Successful Deployment of Cielo Petascale Supercomputer, National Nuclear Security Administration, April 2011,

2010

Richard F Barrett, Courtenay T Vaughan, Mahesh Rajan, Douglas W Doerfler, "From Red Storm to Cielo: Performance Analysis of ASC Simulation Programs Across an Evolution of Multicore Architectures", The ACM/IEEE Conference on High Performance Networking and Computing (SC10), November 2010,

Mahesh Rajan, Douglas Doerfler, "HPC application performance and scaling: understanding trends and future challenges with application benchmarks on past, present and future Tri-Lab computing systems", 8th International Conference of Numerical Analysis and Applied Mathematics, September 17, 2010,

J. Ang, D. Doerfler, S. Dosanjh, K. Koch, J. Morrison, M. Vigil, "The Alliance for Computing at the Extreme Scale", Proceedings of the Cray Users Group Meeting, Edinburgh, Scotland, May 24, 2010,

Courtenay Vaughan, Douglas Doerfler, "Analyzing Multicore Characteristics for a Suite of Applications on an XT5 System", Cray User Group (CUG) 2010, May 2010,

Mahesh Rajan, Douglas Doerfler, Courtenay T. Vaughan, Marcus Epperson, Jeff Ogden, "Application Performance on the Tri-Lab Linux Capacity Cluster - TLCC", International Journal of Distributed Systems and Technologies, Volume 1, Issue 2, April 2010,

2009

Mahesh Rajan, Douglas W Doerfler, Courtenay T Vaughan, "Red Storm/Cray XT4: A Superior Architecture for Scalability", Cray User Group (CUG) 2009, May 2009,

Brian J. Martin, Andrew J. Leiker, James H. Laros, III, Douglas W. Doerfler, "Performance Analysis of the SiCortex SC092", The 10th LCI International Conference on High-Performance Clustered Computing, March 2009,

Douglas W. Doerfler, Analyzing the Application Performance Impact of Using High-Speed Inter-Socket Communication Networks, Workshop on The Influence of I/O on Microprocessor Architecture (IOM-2009), February 2009,

2008

Mahesh Rajan, Courtenay T Vaughan, Robert W Leland, Douglas W Doerfler, Robert E Benner, Jr., "Investigating the balance between capacity and capability workloads across large scale computing platforms", 9th LCI International Conference on High-Performance Computing, April 2008,

2007

Douglas Doerfler, David Hensinger, Brent Leback, Douglas Miles, "Tuning C++ Applications for the Latest Generation x64 Processors with PGI Compilers and Tools", Cray User Group (CUG) 2007, May 2007,

2006

William J. Camp, Robert A. Ballance, Linda R. Bonnefoy-Lev, Ronald B. Brightwell, Douglas W. Doerfler, James L. Handrock, Karen L. Jefferson, Suzanne M. Kelly, James H. Laros III, Robert W. Leland, Michael J. Levenhagen, John J. Naegle, John P. Noe, Kevin T. Pedretti, Mahesh Rajan, Leonard Stands, Judy E. Sturtevant, James L. Tomkins, Keith D. Underwood, John P. Van Dyke, Courtenay T. Vaughan, H. Lee Ward, David R. White, John D. Zepper, Lockheed Martin Nova Award, Red Storm Supercomputer Design and Development Team, Lockheed Martin Corporation, October 2006,

Ron Brightwell, Douglas Doerfler, "Measuring MPI Send and Receive Overhead and Application Availability in High Performance Network Interfaces", EuroPVM/MPI 2006. Lecture Notes in Computer Science. Volume 4192, September 2006,

Ron Brightwell, Douglas Doerfler, Keith D Underwood, "A Preliminary Analysis of the InfiniPath and XD1 Network Interfaces", Proceedings 20th IEEE International Parallel & Distributed Processing Symposium: Workshop on Communication Architecture for Clusters, April 2006,

2005

Douglas W. Doerfler, Courtenay T. Vaughan, "Characterizing Compiler Performance for the AMD Opteron Processor on a Parallel Platform", Cray User Group (CUG) 2005, May 2005,

2004

Ron B. Brightwell, Douglas W. Doerfler, Keith D. Underwood, "A Comparison of 4X Infiniband and Quadrics Elan-4 Technologies", 2004 IEEE International Conference on Cluster Computing (Cluster 2004), September 2004,

Sudip Dosanjh

2016

C.S. Daley, D. Ghoshal, G.K. Lockwood, S. Dosanjh, L. Ramakrishnan, N.J. Wright, "Performance Characterization of Scientific Workflows for the Optimal Use of Burst Buffers", Workflows in Support of Large-Scale Science (WORKS-2016), CEUR-WS.org, 2016, 1800:69-73,

2015

C.S. Daley, L. Ramakrishnan, S. Dosanjh, N.J. Wright, "Analyses of Scientific Workflows for Effective Use of Future Architectures", The 6th International Workshop on Big Data Analytics: Challenges, and Opportunities (BDAC-15), 2015,

N.J. Wright, S. S. Dosanjh, A. K. Andrews, K. Antypas, B. Draney, R.S. Canon, S. Cholia, C.S. Daley, K. M. Fagnan, R.A. Gerber, L. Gerhardt, L. Pezzaglia, Prabhat, K.H. Schafer, J. Srinivasan, "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82., ( June 2015) doi: 10.3233/978-1-61499-583-8-82

Extreme data science is becoming increasingly important at the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC). Many petabytes of data are transferred from experimental facilities to NERSC each year. Applications of importance include high-energy physics, materials science, genomics, and climate modeling, with an increasing emphasis on large-scale simulations and data analysis. In response to the emerging data-intensive workloads of its users, NERSC made a number of critical design choices to enhance the usability of its pre-exascale supercomputer, Cori, which is scheduled to be delivered in 2016. These data enhancements include a data partition, a layer of NVRAM for accelerating I/O, user defined images and a customizable gateway for accelerating connections to remote experimental facilities.

2014

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, ( March 26, 2014)

S.S. Dosanjh, R.F. Barrett, D.W. Doerfler, S.D. Hammond, K.S. Hemmert, M.A. Heroux, P.T. Lin, K.T. Pedretti, A.F. Rodrigues, T.G. Trucano, J.P. Luitjens, "Exascale Design Space Exploration and Co-Design", Future Generation Computer Systems, Volume 30, Pages 46-58, January 2014,

2013

Richard F. Barrett, Shekhar Borkar, Sudip S. Dosanjh, Simon D. Hammond, Michael A. Heroux, X. Sharon Hu, Justin Luitjens, Steven G. Parker, John Shalf, Li Tang, "On the Role of Co-design in High Performance Computing", Transition of HPC Towards Exascale Computing, E.H. D'Hollander et al. (Eds.), IOS Press, 2013, ( November 1, 2013) doi: 10.3233/978-1-61499-324-7-141

Rolf Riesen, Sudip Dosanjh, Larry Kaplan, "The ExaChallenge Symposium", IBM Research Paper, August 26, 2013,

2012

R. Barrett, S. Dosanjh, et al., "Towards Codesign in High Performance Computing Systems", IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, November 5, 2012,

2011

R. Stevens, A. White, S. Dosanjh, et al., "Scientific Grand Challenges: Architectures and Technology for Extreme-Scale Computing Report", 2011,

D. Doerfler, S. Dosanjh, J. Morrison, M. Vigil, "Production Petascale Computing", Cray Users Group Meeting, Fairbanks, Alaska, 2011,

Manuel Vigil, Douglas Doerfler, Sudip Dosanjh, John Morrison, 2010 Defense Programs Award of Excellence for Significant Contributions to the Stockpile Stewardship Program, Successful Deployment of Cielo Petascale Supercomputer, National Nuclear Security Administration, April 2011,

J. Ang, R. Brightwell, S. Dosanjh, et al., "Exascale Computing and the Role of Co-Design", ( 2011)

J. Dongarra et al., "The International Exascale Software Project Roadmap", International Journal of High Performance Computing Applications, 25:1, 2011,

2010

R. Leland and S. Dosanjh, "Computing at Exascale: A Value Proposition", Sandia National Laboratories Report, November 16, 2010,

S. Hu, R. Murphy, S. Dosanjh, K. Olukotun, S. Poole, "Hardware/Software Co-Design for High Performance Computing", Proceedings of CODES+ISSS’10, October 24, 2010,

A. Rodrigues, S. Dosanjh, S. Hemmert, "Co-Design for High Performance Computing", Proceedings of the International Conference on Numerical Analysis and Applied Mathematics, Rhodes, Greece, September 18, 2010,

J. Ang, D. Doerfler, S. Dosanjh, K. Koch, J. Morrison, M. Vigil, "The Alliance for Computing at the Extreme Scale", Proceedings of the Cray Users Group Meeting, Edinburgh, Scotland, May 24, 2010,

K. Alvin, B. Barrett, R. Brightwell, S. Dosanjh, A. Geist, S. Hemmert, M. Heroux, D. Kothe, R. Murphy, J. Nichols, R. Oldfield, A. Rodrigues, J. Vetter, "On the Path to Exascale", International Journal of Distributed Systems and Technologies, 1(2):1–22, May 22, 2010,

J. Tomkins, R. Brightwell, W. Camp, S. Dosanjh, S. Kelly, P. Lin, C. Vaughan, J. Levesque, V. Tipparaju, "The Red Storm Architecture and Early Experiences with Multi-Core Processors", International Journal of Distributed Systems and Technologies, Vol. 1, Issue 2, pp. 74-93, April 19, 2010, doi: 10.4018/jdst.2010040105

John Shalf, S. Dosanjh, John Morrison, "Exascale Computing Technology Challenges", VECPAR, ( 2010) Pages: 1-25

High Performance Computing architectures are expected to change dramatically in the next decade as power and cooling constraints limit increases in microprocessor clock speeds. Consequently computer companies are dramatically increasing on-chip parallelism to improve performance. The traditional doubling of clock speeds every 18-24 months is being replaced by a doubling of cores or other parallelism mechanisms. During the next decade the amount of parallelism on a single microprocessor will rival the number of nodes in early massively parallel supercomputers that were built in the 1980s. Applications and algorithms will need to change and adapt as node architectures evolve. In particular, they will need to manage locality to achieve performance. A key element of the strategy as we move forward is the co-design of applications, architectures and programming environments. There is an unprecedented opportunity for application and algorithm developers to influence the direction of future architectures so that they meet DOE mission needs. This article will describe the technology challenges on the road to exascale, their underlying causes, and their effect on the future of HPC system design.

2009

A. Geist, S. Dosanjh, "IESP Exascale Challenge: Co-Design of Architectures and Algorithms", International Journal of High Performance Computing Applications, Vol. 23, No. 4, pp. 401–402, September 18, 2009,

Scott W. French

2015

Scott French, Yili Zheng, Barbara Romanowicz, Katherine Yelick, "Parallel Hessian Assembly for Seismic Waveform Inversion Using Global Updates", IEEE International Parallel & Distributed Processing Symposium (IPDPS) 2015, May 25, 2015, doi: 10.1109/IPDPS.2015.58

2014

Huaiyu Yuan, Scott French, Paul Cupillard, Barbara Romanowicz, "Lithospheric expression of geological units in central and eastern North America from full waveform tomography", Earth and Planetary Science Letters, 2014, 402:176, doi: 10.1016/j.epsl.2013.11.057

Scott French, Barbara Romanowicz, "Whole-mantle radially anisotropic shear velocity structure from spectral-element waveform tomography", Geophysical Journal International, 2014, doi: 10.1093/gji/ggu334

2013

Scott French, Vedran Lekic, Barbara Romanowicz, "Waveform Tomography Reveals Channeled Flow at the Base of the Oceanic Asthenosphere", Science, 2013, 342:227, doi: 10.1126/science.1241514

2011

Vedran Lekic, Scott French, Barbara Romanowicz, "Lithospheric Thinning Beneath Rifted Regions of Southern California", Science, 2011, 334:783, doi: 10.1126/science.1208898

2010

David Abt, Karen Fischer, Scott French, Heather Ford, Huaiyu Yuan, Barbara Romanowicz, "North American lithospheric discontinuity structure imaged by Ps and Sp receiver functions", Journal of Geophysical Research, 2010, 115, doi: 10.1029/2009JB006914

Scott French, Linda Warren, Karen Fischer, Geoffrey Abers, Wilfried Strauch, J. Marino Protti, Victor Gonzalez, "Constraints on upper plate deformation in the Nicaraguan subduction zone from earthquake relocation and directivity analysis", Geochemistry, Geophysics, Geosystems, 2010, 11, doi: 10.1029/2009GC002841

2009

Scott French, Karen Fischer, Ellen Syracuse, Michael Wysession, "Crustal structure beneath the Florida-to-Edmonton broadband seismometer array", Geophysical Research Letters, 2009, 35, doi: 10.1029/2008GL036331

Brian Friesen

2016

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

Friesen, B., Almgren, A., Lukić, Z., Weber, G., Morozov, D., Day, M., "In situ and in-transit analysis of cosmological simulations", Computational Astrophysics and Cosmology, edited by Simon Portegies Zwart, August 26, 2016, 3:1-18, LBNL LBNL-1006104, doi: 10.1186/s40668-016-0017-2

Modern cosmological simulations have reached the trillion-element scale, rendering data storage and subsequent analysis formidable tasks. To address this circumstance, we present a new MPI-parallel approach for analysis of simulation data while the simulation runs, as an alternative to the traditional workflow consisting of periodically saving large data sets to disk for subsequent `offline' analysis. We demonstrate this approach in the compressible gasdynamics/N-body code Nyx, a hybrid MPI+OpenMP code based on the BoxLib framework, used for large-scale cosmological simulations. We have enabled on-the-fly workflows in two different ways: one is a straightforward approach consisting of all MPI processes periodically halting the main simulation and analyzing each component of data that they own ('in situ'). The other consists of partitioning processes into disjoint MPI groups, with one performing the simulation and periodically sending data to the other 'sidecar' group, which post-processes it while the simulation continues ('in-transit'). The two groups execute their tasks asynchronously, stopping only to synchronize when a new set of simulation data needs to be analyzed. For both the in situ and in-transit approaches, we experiment with two different analysis suites with distinct performance behavior: one which finds dark matter halos in the simulation using merge trees to calculate the mass contained within iso-density contours, and another which calculates probability distribution functions and power spectra of various fields in the simulation. Both are common analysis tasks for cosmology, and both result in summary statistics significantly smaller than the original data set. We study the behavior of each type of analysis in each workflow in order to determine the optimal configuration for the different data analysis algorithms.

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

Parrent, J. T., Howell, D. A., Fesen, R. A., Parker, S., Bianco, F. B., Dilday, B., Sand, D., Valenti, S., Vinkó, J., Berlind, P., Challis, P., Milisavljevic, D., Sanders, N., Marion, G. H., Wheeler, J. C., Brown, P., Calkins, M. L., Friesen, B., Kirshner, R., Pritchard, T., Quimby, R., Roming, P., "Comparative analysis of SN 2012dn optical spectra: days -14 to +114", Monthly Notices of the Royal Astronomical Society, January 29, 2016, 457:3702-3723, doi: 10.1093/mnras/stw239

SN 2012dn is a super-Chandrasekhar mass candidate in a purportedly normal spiral (SAcd) galaxy, and poses a challenge for theories of type Ia supernova diversity. Here we utilize the fast and highly parametrized spectrum synthesis tool, SYNAPPS, to estimate relative expansion velocities of species inferred from optical spectra obtained with six facilities. As with previous studies of normal SN Ia, we find that both unburned carbon and intermediate-mass elements are spatially coincident within the ejecta near and below 14,000 km s−1. Although the upper limit on SN 2012dn's peak luminosity is comparable to some of the most luminous normal SN Ia, we find a progenitor mass exceeding ∼1.6 M☉ is not strongly favoured by leading merger models since these models do not accurately predict spectroscopic observations of SN 2012dn and more normal events. In addition, a comparison of light curves and host-galaxy masses for a sample of literature and Palomar Transient Factory SN Ia reveals a diverse distribution of SN Ia subtypes where carbon-rich material remains unburned in some instances. Such events include SN 1991T, 1997br, and 1999aa where trace signatures of C III at optical wavelengths are presumably detected.

2015

Baron, E., Hoeflich, P., Friesen, B., Sullivan, M., Hsiao, E., Ellis, R. S., Gal-Yam, A., Howell, D. A., Nugent, P. E., Dominguez, I., Krisciunas, K., Phillips, M. M., Suntzeff, N., Wang, L., and Thomas, R. C., "Spectral models for early time SN 2011fe observations", Monthly Notices of the Royal Astronomical Society, 2015, 454:2549, doi: 10.1093/mnras/stv1951

We use observed UV through near-IR spectra to examine whether SN 2011fe can be understood in the framework of Branch-normal Type Ia supernovae (SNe Ia) and to examine its individual peculiarities. As a benchmark, we use a delayed-detonation model with a progenitor metallicity of Z☉/20. We study the sensitivity of features to variations in progenitor metallicity, the outer density profile, and the distribution of radioactive nickel. Metallicity variations in the progenitor have a relatively small effect on the synthetic spectra. We also find that the abundance stratification of SN 2011fe closely resembles that of a delayed-detonation model with a transition density that has been fit to other Branch-normal SNe Ia. At early times, the model photosphere is formed in material with velocities that are too high, indicating that the photosphere recedes too slowly or that SN 2011fe has a lower specific energy in the outer ≈0.1 M☉ than does the model. We discuss several explanations for the discrepancies. Finally, we examine variations in both the spectral energy distribution and in the colours due to variations in the progenitor metallicity, which suggests that colours are only weak indicators of the progenitor metallicity, in the particular explosion model that we have studied. We do find that the flux in the U band at maximum light is significantly higher in the solar metallicity model than in the lower metallicity model, and that the lower metallicity model much better matches the observed spectrum.

2014

Friesen, B., Baron, E., Wisniewski, J. P., Parrent, J. T., Thomas, R. C., Miller, Timothy R., and Marion, G. H., "Near-infrared Line Identification in Type Ia Supernovae during the Transitional Phase", The Astrophysical Journal, 2014, 792:120, doi: 10.1088/0004-637X/792/2/120

We present near-infrared synthetic spectra of a delayed-detonation hydrodynamical model and compare them to observed spectra of four normal Type Ia supernovae ranging from day +56.5 to day +85. This is the epoch during which supernovae are believed to be undergoing the transition from the photospheric phase, where spectra are characterized by line scattering above an optically thick photosphere, to the nebular phase, where spectra consist of optically thin emission from forbidden lines. We find that most spectral features in the near-infrared can be accounted for by permitted lines of Fe II and Co II. In addition, we find that [Ni II] fits the emission feature near 1.98 μm, suggesting that a substantial mass of 58Ni exists near the center of the ejecta in these objects, arising from nuclear burning at high density.

Parrent, J. T., Friesen, B., Parthasarathy, M., "A Review of Type Ia Supernova Spectra", Astrophysics and Space Science, 2014, 351:1-52, doi: 10.1007/s10509-014-1830-1

SN 2011fe was the nearest and best-observed type Ia supernova in a generation, and brought previous incomplete datasets into sharp contrast with the detailed new data. In retrospect, documenting spectroscopic behaviors of type Ia supernovae has been more often limited by sparse and incomplete temporal sampling than by consequences of signal-to-noise ratios, telluric features, or small sample sizes. As a result, type Ia supernovae have been primarily studied insofar as parameters discretized by relative epochs and incomplete temporal snapshots near maximum light. Here we discuss a necessary next step toward consistently modeling and directly measuring spectroscopic observables of type Ia supernova spectra. In addition, we analyze current spectroscopic data in the parameter space defined by empirical metrics, which will be relevant even after progenitors are observed and detailed models are refined.

2012

Friesen, B., Baron, E., Branch, D., Chen, B., Parrent, J., Thomas, R. C., "Supernova Resonance-scattering Line Profiles in the Absence of a Photosphere", The Astrophysical Journal Supplement Series, 2012, 203:1, doi: 10.1088/0067-0049/203/1/12

In supernova (SN) spectroscopy relatively little attention has been given to the properties of optically thick spectral lines in epochs following the photosphere's recession. Most treatments and analyses of post-photospheric optical spectra of SNe assume that forbidden-line emission comprises most if not all spectral features. However, evidence exists that suggests that some spectra exhibit line profiles formed via optically thick resonance-scattering even months or years after the SN explosion. To explore this possibility, we present a geometrical approach to SN spectrum formation based on the "Elementary Supernova" model, wherein we investigate the characteristics of resonance-scattering in optically thick lines while replacing the photosphere with a transparent central core emitting non-blackbody continuum radiation, akin to the optical continuum provided by decaying 56Co formed during the explosion. We develop the mathematical framework necessary for solving the radiative transfer equation under these conditions and calculate spectra for both isolated and blended lines. Our comparisons with analogous results from the Elementary Supernova code SYNOW reveal several marked differences in line formation. Most notably, resonance lines in these conditions form P Cygni-like profiles, but the emission peaks and absorption troughs shift redward and blueward, respectively, from the line's rest wavelength by a significant amount, despite the spherically symmetric distribution of the line optical depth in the ejecta. These properties and others that we find in this work could lead to misidentification of lines or misattribution of properties of line-forming material at post-photospheric times in SN optical spectra.

Parrent, J. T., Howell, D. A., Friesen, B., Thomas, R. C., Fesen, R. A., Milisavljevic, D., Bianco, F. B., Dilday, B., Nugent, P., Baron, E., Arcavi, I., Ben-Ami, S., Bersier, D., Bildsten, L., Bloom, J., Cao, Y., Cenko, S. B., Filippenko, A. V., Gal-Yam, A., Kasliwal, M. M., Konidaris, N., Kulkarni, S. R., Law, N. M., Levitan, D., Maguire, K., Mazzali, P. A., Ofek, E. O., Pan, Y., Polishook, D., Poznanski, D., Quimby, R. M., Silverman, J. M., Sternberg, A., Sullivan, M., Walker, E. S., Xu, Dong, Buton, C., Pereira, R., "Analysis of the Early-time Optical Spectra of SN 2011fe in M101", The Astrophysical Journal Letters, 2012, 752, doi: 10.1088/2041-8205/752/2/L26

The nearby Type Ia supernova (SN Ia) SN 2011fe in M101 (cz = 241 km s–1) provides a unique opportunity to study the early evolution of a "normal" SN Ia, its compositional structure, and its elusive progenitor system. We present 18 high signal-to-noise spectra of SN 2011fe during its first month beginning 1.2 days post-explosion and with an average cadence of 1.8 days. This gives a clear picture of how various line-forming species are distributed within the outer layers of the ejecta, including that of unburned material (C+O). We follow the evolution of C II absorption features until they diminish near maximum light, showing overlapping regions of burned and unburned material between ejection velocities of 10,000 and 16,000 km s–1. This supports the notion that incomplete burning, in addition to progenitor scenarios, is a relevant source of spectroscopic diversity among SNe Ia. The observed evolution of the highly Doppler-shifted O I λ7774 absorption features detected within 5 days post-explosion indicates the presence of O I with expansion velocities from 11,500 to 21,000 km s–1. The fact that some O I is present above C II suggests that SN 2011fe may have had an appreciable amount of unburned oxygen within the outer layers of the ejecta.

Richard A. Gerber

2016

Richard A Gerber, IXPUG Birds of a Feather Welcome, Birds of a Feather @ SC16, November 16, 2016,

Richard A. Gerber, Success Through Community, Closing Remarks at Intel HPC Developer Conference 2016, November 13, 2016,

R Gerber, J Deslippe, D Doerfler, Many Cores for the Masses: Lessons Learned from Application Readiness Efforts at NERSC for the Knights Landing based Cori System, Intel HPC Developers Conference, November 12, 2016,

Richard A Gerber, Using NERSC for Research in High Energy Physics Theory, Particle Physics Generators @ Fermilab, September 22, 2016,

Richard A Gerber, Application Readiness for KNL at NERSC, IXPUG 2016 Conference, September 20, 2016,

Richard A Gerber, IXPUG 2016 Welcome, IXPUG 2016 Conference, September 19, 2016,

Richard A Gerber, September 2016 NERSC Update, September 7, 2016,

Richard A Gerber, Application Performance on Intel Xeon Phi - Being Prepared for KNL and Beyond, ISC 2016 Workshop Introduction, June 23, 2016,

Richard A Gerber, Cori: Enabling World-Changing Science, Intel Collaboration Hub @ ISC 2016, June 22, 2016,

Gerber, Richard A., et al., "Application Performance on Intel Xeon Phi–Being Prepared for KNL and Beyond", High Performance Computing: ISC High Performance 2016 International Workshops, ExaComm, E-MuCoCoS, HPC-IODC, IXPUG, IWOPH, P3MA, VHPC, Frankfurt, Germany, June 19–23, 2016, Revised Selected Papers. Vol. 9945. Springer, 2016, June 19, 2016,

Richard A Gerber, NERSC Allocations 2016-17, June 14, 2016,

Richard A Gerber, High Performance Computing and NERSC for High School Students, June 6, 2016,

Ashley Barker, Chris Fuson, Richard Gerber, Yun (Helen) He, Frank Indiviglio, Best Practices for Managing HPC User Documentation and Communication, Cray User Group Meeting 2016, London, England, May 10, 2016,

Salman Habib, Robert Roser (HEP Leads), Richard Gerber, Katie Antypas, Katherine Riley, Tim Williams, Jack Wells, Tjerk Straatsma (ASCR Leads), A. Almgren, J. Amundson, S. Bailey, D. Bard, K. Bloom, B. Bockelman, A. Borgland, J. Borrill, R. Boughezal, R. Brower, B. Cowan, H. Finkel, N. Frontiere, S. Fuess, L. Ge, N. Gnedin, S. Gottlieb, O. Gutsche, T. Han, K. Heitmann, S. Hoeche, K. Ko, O. Kononenko, T. LeCompte, Z. Li, Z. Lukic, W. Mori, P. Nugent, C.-K. Ng, G. Oleynik, B. O'Shea, N. Padmanabhan, D. Petravick, F.J. Petriello, J. Power, J. Qiang, L. Reina, T.J. Rizzo, R. Ryne, M. Schram, P. Spentzouris, D. Toussaint, J.-L. Vay, B. Viren, F. Wurthwein, L. Xiao, "ASCR/HEP Exascale Requirements Review Report", arXiv:1603.09303 [physics.comp-ph], March 31, 2016,

Clayton Bagwell, Richard Gerber, NUG 2016 Business Meeting: Allocations, NUG Business Meeting presentation, March 24, 2016,

NUG (NERSC Users Group) Business meeting: Allocations

Clayton Bagwell, Richard Gerber, NERSC Brown Bag: Allocations, NERSC Brown Bag presentation, March 17, 2016,

Brown Bag presentation to NERSC staff on how Allocations work and the new scavenger queues.

Richard A Gerber, Application Preparedness for Next Generation Computational Systems and Integration with Data-Intensive Workflows, February 26, 2016,

Richard A. Gerber, NERSC Science and Strategic Results, February 16, 2016,

2015

Richard A. Gerber, Katie Antypas, Sudip Dosanjh, Jack Deslippe, Nick Wright, Jay Srinivasan, Systems Roadmap and Plans for Supporting Extreme Data Science, December 10, 2015,

Yun (Helen) He, Alice Koniges, Richard Gerber, Katie Antypas, Using OpenMP at NERSC, OpenMPCon 2015, invited talk, September 30, 2015,

Alice Koniges, Tim Mattson, Yun (Helen) He, Richard Gerber, Enabling Application Portability across HPC Platforms: An Application Perspective, OpenMPCon 2015, invited talk, September 29, 2015,

N.J. Wright, S. S. Dosanjh, A. K. Andrews, K. Antypas, B. Draney, R.S. Canon, S. Cholia, C.S. Daley, K. M. Fagnan, R.A. Gerber, L. Gerhardt, L. Pezzaglia, Prabhat, K.H. Schafer, J. Srinivasan, "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82., ( June 2015) doi: 10.3233/978-1-61499-583-8-82

Extreme data science is becoming increasingly important at the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC). Many petabytes of data are transferred from experimental facilities to NERSC each year. Applications of importance include high-energy physics, materials science, genomics, and climate modeling, with an increasing emphasis on large-scale simulations and data analysis. In response to the emerging data-intensive workloads of its users, NERSC made a number of critical design choices to enhance the usability of its pre-exascale supercomputer, Cori, which is scheduled to be delivered in 2016. These data enhancements include a data partition, a layer of NVRAM for accelerating I/O, user defined images and a customizable gateway for accelerating connections to remote experimental facilities.

Richard Gerber, Harvey Wasserman, "Large Scale Production Computing and Storage Requirements for Advanced Scientific Computing Research: Target 2017", April 28, 2015,

Richard Gerber, NERSC Science 2014, February 24, 2015,

Richard A. Gerber, Performance and Debugging Tools for HPC, February 17, 2015,

Guest lecture in UC Berkeley CS 267 - Applications of Parallel Computers

Richard Gerber, Harvey Wasserman, "Large Scale Computing and Storage Requirements for Nuclear Physics - Target 2017", January 28, 2015,

2014

Richard A. Gerber, Exascale Computing, Big Data, and World-Class Science at NERSC, November 13, 2014,

Talk given at San Jose State University Physics Department colloquium on Nov. 13, 2014.

Richard A. Gerber et al., "High Performance Computing Operational Review: Enabling Data-Driven Scientific Discovery at DOE HPC Facilities", November 7, 2014,

K. Antypas, B.A. Austin, T.L. Butler, R.A. Gerber, C.L. Whitney, N.J. Wright, W. Yang, Z. Zhao, "NERSC Workload Analysis on Hopper", Report, October 17, 2014, LBNL 6804E,

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for Basic Energy Sciences: Target 2017", October 10, 2014,

Richard A. Gerber et al., "DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at DOE HPC Facilities", September 17, 2014,

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for Fusion Energy Sciences: Target 2017", May 1, 2014,

Richard A. Gerber, Support for Astronomy and Astrophysics at NERSC, April 3, 2014,

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)

Richard A. Gerber, Helen He, Woo-Sun Yang, Debugging and Optimization Tools, Presented at UC Berkeley CS267 class, February 19, 2014,

Richard A. Gerber, NERSC, NERSC 2013 User Survey Results, February 6, 2014,

Richard A. Gerber, NERSC, NERSC Requirement Reviews for NUG 2014, February 6, 2014,

Kim Cupps, et al., "High Performance Computing Operations Review Report", January 6, 2014,

2013

Richard A. Gerber, Communicating with NERSC Communities, November 20, 2013,

Richard A. Gerber, Zhengji Zhao, NERSC Job Data, November 20, 2013,

Richard A. Gerber, "NERSC’s Edison Delivers High-Impact Science Results From Day One", November 2013,

NERSC’s Edison Delivers High-Impact Science Results From Day One, SC13 display poster.

Richard A. Gerber, "Bringing Better Materials to Market in Half the Time", November 2013,

Bringing Better Materials to Market in Half the Time, SC13 poster

Richard Gerber (Berkeley Lab), Ken Bloom (U. Nebraska-Lincoln), "Report of the Snowmass 2013 Computing Frontier Working Group on Distributed Computing and Facility Infrastructures", to be included in proceedings of Community Summer Study ("Snowmass") 2013, November 11, 2013,

A. Koniges, R. Gerber, D. Skinner, Y. Yao, Y. He, D. Grote, J-L Vay, H. Kaiser, and T. Sterling, "Plasma Physics Simulations on Next Generation Platforms", 55th Annual Meeting of the APS Division of Plasma Physics, Volume 58, Number 16, November 11, 2013,

The current high-performance computing revolution provides opportunity for major increases in computational power over the next several years, if it can be harnessed. The transition from simply increasing single-processor and network performance to different architectural paradigms forces application programmers to rethink the basic models of parallel programming from both the language and problem-division standpoints. One of the major computing facilities available to researchers in fusion energy is the National Energy Research Scientific Computing Center. As the mission computing center for the DOE Office of Science, NERSC is tasked with helping users overcome the challenges of this revolution, both through the use of new parallel constructs and languages and by enabling a broader user community to take advantage of multi-core performance. We discuss the programming model challenges facing researchers in fusion and plasma physics for a variety of simulations ranging from particle-in-cell to fluid-gyrokinetic and MHD models.

Richard Gerber and Harvey Wasserman, eds., "Large Scale Computing and Storage Requirements for High Energy Physics - Target 2017", November 8, 2013,

Richard Gerber, NUG Webinar for November 2013, November 7, 2013,

Richard Gerber, Edison Overview (Focus on hardware relevant for performance), October 10, 2013,

Richard Gerber, Data-Driven Science at NERSC, August 8, 2013,

Richard Gerber, High-Performance Parallel I/O, August 6, 2013,

Richard Gerber, NERSC/Berkeley Lab and Ken Bloom, University of Nebraska-Lincoln, Snowmass Computing Frontier I2: Distributed Computing and Facility Infrastructures, July 31, 2013,

Richard Gerber, Dirac Science Highlights 2013, June 25, 2013,

Richard Gerber, Introduction to High Performance Computing, June 10, 2013,

Introduction to High Performance Computing presented to Berkeley Lab Computing Sciences summer interns.

Richard Gerber, Harvey Wasserman, "High Performance Computing and Storage Requirements for Biological and Environmental Research Target 2017", June 6, 2013, LBNL LBNL-6256E,

Richard A. Gerber, High Performance Computing and Big Data (for High Energy Physics theory), April 2, 2013,

Richard A. Gerber, Fusion Energy Sciences Requirements Review Overview and Goals, March 19, 2013,

Richard Gerber, NUG March 2013 Webinar, March 7, 2013,

NERSC User Group Teleconference and Webinar Slides for March 7, 2013

Richard A. Gerber, Debugging and Optimization Tools, February 19, 2013,

Debugging and Optimization Tools, presented for UC Berkeley CS267 "Applications of Parallel Computers" class, Feb. 19, 2013.

Richard Gerber, Trends, Discovery, & Innovation at NUG User Day 2013, February 19, 2013,

Richard A. Gerber, Tina Declerck, Zhengji Zhao, Edison Update, February 12, 2013,

Overview and update on the installation and configuration of Edison, NERSC's new Cray XC30 supercomputer.

Richard A. Gerber, Harvey Wasserman, NERSC Requirements Reviews, February 12, 2013,

An update on the NERSC Requirements Reviews at NUG 2013. Richard Gerber and Harvey Wasserman, NERSC.

Richard Gerber, Requirements Reviews Update, February 12, 2013,

Richard A. Gerber, Getting Started at NERSC, January 17, 2013,

Getting Started at NERSC Webinar, January 17, 2013, Richard Gerber, NERSC User Services

Richard Gerber, Kathy Yelick, Lawrence Berkeley National Laboratory, "Data Requirements from NERSC Requirements Reviews", January 9, 2013,

2012

Richard A. Gerber, Job Analytics, November 8, 2012,

Richard A. Gerber, Batch Strategies of Maximizing Throughput and Allocation, NERSC Users Group Monthly Webinar, October 2012, October 4, 2012,

Richard A. Gerber, Uses for High Performance Computing, June 12, 2012,

Who uses High Performance Computing and what do they do with it? Presented for LBNL Summer Interns, June 12, 2012.

Richard A. Gerber, Introduction to High Performance Computers, June 12, 2012,

Introduction to High Performance Computers. Presented to LBNL Summer Interns, June 12, 2012.

Richard A. Gerber, Challenges in HPC, June 12, 2012,

Challenges in High Performance Computing. Presented to LBNL Summer Interns, June 12, 2012.

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for Nuclear Physics", Workshop, March 26, 2012, LBNL LBNL-5355E,

Report of the user requirements workshop for lattice gauge theory and nuclear physics computation at NERSC that took place May 26, 2011

Debugging and Optimization Tools, Presented to UC Berkeley CS 267 Class, Applications of Parallel Computers, February 16, 2012,

User Requirements Gathered for the NERSC7 Procurement, February 3, 2012,

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for Advanced Computational Science Research", Workshop, January 2012, LBNL LBNL-5249E,

2011

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for Fusion Energy Sciences", Workshop, December 2011,

HPC I/O in Scaling to Petascale and Beyond: Performance Analysis and Optimization of Applications, SC11, November 13, 2011,

Richard A. Gerber, Introduction to HPC Systems, NERSC New User Training, September 13, 2011,

Richard A. Gerber, Experiences with Tools at NERSC, Programming weather, climate, and earth-system models on heterogeneous multi-core platforms, NCAR, Boulder, CO, September 7, 2011,

NERSC Overview for Environmental Energy Technologies, Berkeley Lab, Environmental Energy Technologies Division, Berkeley, CA, June 2011,

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for Basic Energy Sciences", Workshop, June 10, 2011, LBNL LBNL-4809E,

Getting Started at NERSC, NERSC Training Webinar, Berkeley Lab OSF, Oakland, CA, June 7, 2011,

NERSC Overview for the Joint Genome Institute, DOE Joint Genome Institute, Walnut Creek, CA, May 2, 2011,

Richard Gerber, Computers - BSA Merit Badge, March 2011,

Richard Gerber, Debugging and Optimization Tools, Presented to CS267 class at UC-Berkeley, February 11, 2011,

2010

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for High Energy Physics", Workshop, November 15, 2010,

Richard A. Gerber, Large Scale Computing and Storage Requirements for High Energy Physics, NUG 2010, Cray Quarterly, NERSC OSF, Oakland, CA, October 28, 2010,

Richard A. Gerber, Introduction to Computing at NERSC, October 20, 2010,

Submitting and Running Jobs on the Cray XT5, Joint XT Workshop, Berkeley, CA, February 1, 2010,

2009

Richard A. Gerber, Harvey J. Wasserman, "Large Scale Computing and Storage Requirements for Biological and Environmental Research", Workshop, October 19, 2009, LBNL LBNL-2710E,

Computing and Storage Requirements for Biological and Environmental Research, NUG 2009, Boulder, CO, October 7, 2009,

W. Allcock, R. Carlson, S. Cotter, E. Dart, V. Dattoria, B. Draney, R. Gerber, M. Helm, J. Hick, S. Hicks, S. Klasky, M. Livny, B. Maccabe, C. Morgan, S. Morss, L. Nowell, D. Petravick, J. Rogers, Y. Sekine, A. Sim, B. Tierney, S. Turnbull, D. Williams, L. Winkler, F. Wuerthwein, "ASCR Science Network Requirements", Workshop, April 15, 2009,

ESnet publishes reports from Network and Science Requirement Workshops on a regular basis. This report was the product of a two-day workshop in Washington, DC, that addressed science requirements impacting network operations for 2009.

Richard A. Gerber, Breakthrough Science at NERSC, Cray Technical Workshop, Isle of Pines, SC, February 25, 2009,

2008

Richard A. Gerber, Franklin File Systems and I/O, NERSC Users Group, Oakland, CA, October 2, 2008,

Richard Gerber, "Franklin Interactive Node Responsiveness", June 9, 2008,

Richard A. Gerber, Performance Monitoring on NERSC’s POWER 5 System, May 22, 2008,

2007

Richard A. Gerber, Running Jobs on Franklin, September 19, 2007,

Richard A. Gerber, Application and System Memory Use, Configuration, and Problems on Bassi, July 17, 2007,

2006

Richard A. Gerber, Bassi - NERSC's New IBM POWER 5, NUG 2006, Princeton, NJ, June 13, 2006,

Richard A. Gerber, Experiences Configuring, Validating, and Monitoring Bassi, NUG 2006, Princeton, NJ, June 13, 2006,

2000

Nathan C. Hearn, Susan A. Lamb, Robert A. Gruendl (Univ. of Ill.), Richard A. Gerber (NERSC/Berkeley Lab), "The Colliding Galaxy System Arp 119: Numerical Models and Infrared Observations", Astronomical Society of the Pacific Conference Series, January 17, 2000, 215:46H,

1999

Lamb, S. A.; Hearn, N. C.; Gerber, R. A., "Velocity Fields in Impacted Gaseous Galactic Discs: A Numerical Survey using N-body/SPH Simulations", Bulletin of the American Astronomical Society, May 17, 1999, 31:829,

1997

Lamb, S. A.; Gerber, R. A.; Rudnick, G. H.; Dewing, M., "Starbursts in Collisionally Produced Ring Galaxies: Comparisons Between Numerical Models and Observed Systems", Revista Mexicana de Astronomia y Astrofisica Serie de Conferencias, May 3, 1997, 6:151,

1996

Richard A. Gerber, "Global Effects of Softening n-Body Galaxies", Astrophysical Journal, August 1, 1996, 466:724-731,

Smith, B.F.; Gerber, R.A.; Steiman-Cameron, T.Y.; Miller, R.H., "The Response of Disks to Oscillatory Modes in Galaxies", Bulletin of the American Astronomical Society, June 17, 1996, 28:1189,

Rudnick, G. H.; Lamb, S. A.; Gerber, R. A.; Dewing, M., "A Comparison between Numerical Models of Collisionally Produced Ring Galaxies and Observed Systems", Bulletin of the American Astronomical Society, May 13, 1996, 28:826,

Gerber, R.A. & Lamb, S.A., "A Stellar and Gas Dynamical Numerical Model of Ring Galaxies", Monthly Notices of the Royal Astronomical Society, January 8, 1996, 278:345,

1995

Gerber, R. A.; Smith, B. F.; Steiman-Cameron, T. Y., "The Response of Disks to Oscillatory Modes in Galaxies", Bulletin of the American Astronomical Society, December 11, 1995, 27:1353,

Gerber, Richard A, "Some Consequences of Using Gravitational Softening to Model N-body Galaxies", Bulletin of the American Astronomical Society, June 12, 1995, 27:1201,

1994

Lamb, S. A.; Gerber, R. A.; Balsara, D. S., "Off-Center Collisions Involving Rotating Disk Galaxies: 3-D Numerical Simulations", Bulletin of the American Astronomical Society, December 12, 1994, 26:1430,

Richard A. Gerber and Susan A. Lamb, "A Model for Collisionally Induced Disturbed Structure in Disk Galaxies", Astrophysical Journal, August 20, 1994, 431:604-616,

Susan A. Lamb, NORDITA and Niels Bohr Institute; Richard A. Gerber, University of Illinois at Urbana-Champaign; and Dinshaw Balsara, Johns Hopkins University, "Galactic Scale Gas Flows in Colliding Galaxies: 3-Dimensional, N-body/Hydrodynamics Experiments", Astrophysics and Space Science, May 1994, 216:337-346,

Gerber, Richard A., "Some Consequences of Using Gravitational Softening to Model N-body Galaxies", Bulletin of the American Astronomical Society, March 1994, 27:885,

1993

Stellar and gas dynamics of interacting ring galaxies, Richard A. Gerber, Ph.D., October 1, 1993,

Susan A. Lamb, Richard A. Gerber, Dinshaw S. Balsara, "Galaxies in Collision: The Formation of Rings and Arcs", Astron. Ges., Abstr. Ser, June 1, 1993, 8:56,

Stellar and Gas Dynamics of Interacting Ring Galaxies: Front Material, Richard A. Gerber, May 1993,

Stellar and Gas Dynamics of Interacting Ring Galaxies: Chapter 1 - Introduction, Richard A. Gerber, May 1993,

1992

Richard A. Gerber, Susan A. Lamb, and Dinshaw Balsara, "A Model of Ring Galaxies: Arp 147-Like Systems", Astrophysical Journal, November 1, 1992, 399:L51-L54,

Gerber, R. A.; Lamb, S. A., "Models for Collisionally Induced Disturbed Structure in Disk Galaxies", Bulletin of the American Astronomical Society, May 18, 1992, 24:811,

1991

Gerber, R. A.; Lamb, S. A.; Balsara, D. S., "A Model for Ring Galaxies: Arp 147-Like Systems", Bulletin of the American Astronomical Society, September 17, 1991, 23:1391G,

R.A. Gerber, S.A. Lamb (Univ. of Ill.), and D.S. Balsara (JHU), "Ring Formation in Off-Axis Collisions of Galaxies", Bulletin of the American Astronomical Society, March 11, 1991, 23:953G,

1990

Gerber, Richard A.; Balsara, Dinshaw S.; Lamb, Susan A, "Dynamical experiments on models of colliding disk galaxies", NASA, Marshall Space Flight Center, Paired and Interacting Galaxies: International Astronomical Union Colloquium, November 12, 1990, 124:737-742,

R.A. Gerber, S.A. Lamb (Univ. of Illinois at Urbana-Champaign), D.S. Balsara (Johns Hopkins Univ.), "Combined Hydrodynamical and N-Body Studies of Colliding Galaxies: The Formation of Ring Galaxies", 177th Meeting of the American Astronomical Society, July 9, 1990, 22:1243G,

Gerber, R. A. and Lamb, S. A. and Miller, R. H., and Smith, B. F., "Models of colliding galaxies: kinetic energy and density enhancement", Dynamics and Interactions of Galaxies, Proceedings, Springer, Berlin (Germany, F.R.), June 4, 1990, 223,

Gerber, R. A.; Lamb, S. A.; Miller, R. H.; Smith, B. F., "Potential Sites for Star Formation in Colliding Galaxies", Astrophysics and Space Science Library, Workshop of the Advanced School of Astronomy, February 12, 1990, 160:366G,

Lamb, S. A.; Miller, R. H.; Gerber, R. A.; Smith, B. F., "Models of Colliding Galaxies: Kinetic Energy and Density Enhancements", Astrophysics and Space Science Library, Kona Symposium of Millimetre and Submillimetre Astronomy, January 8, 1990, 158:235L,

1989

R. A. Gerber, D. S. Balsara, and S. A. Lamb (Univ. of Ill.), "Potential Sites of Star Formation in Interacting Galaxies: Numerical Experiments", Bulletin of the American Astronomical Society, September 11, 1989, 21:1163G,

1988

S.A. Lamb, R.A. Gerber (U. Illinois), R.H. Miller (U. Chicago), B.F. Smith (NASA/Ames), "Models of Colliding Galaxies: Kinetic Energy and Density Enhancements", 173rd Meeting of the American Astronomical Society, September 10, 1988,

Lisa Gerhardt

2016

Lisa Gerhardt, Jeff Porter, Nick Balthaser, Lessons Learned from Running an HPSS Globus Endpoint, 2016 HPSS User Forum, September 1, 2016,

The NERSC division of LBNL has been running HPSS in production since 1998. The archive is quite popular, with roughly 100 TB of I/O every day from the ~6,000 scientists who use the NERSC facility. We maintain a Globus-HPSS endpoint that transfers over 1 PB/month of data into and out of HPSS. Getting Globus and HPSS to mesh well can be challenging. This talk gives an overview of some of the lessons learned.

Jialin Liu, Evan Racah, Quincey Koziol, Richard Shane Canon, Alex Gittens, Lisa Gerhardt, Suren Byna, Mike F. Ringenburg, Prabhat, "H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems", Cray User Group, May 13, 2016,

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

2015

M. G. Aartsen et al., "Flavor Ratio of Astrophysical Neutrinos above 35 TeV in IceCube", Physical Review Letters, February 11, 2015, doi: 10.1103/PhysRevLett.114.171102

2014

Nicholas Balthaser, Lisa Gerhardt, NERSC Archival Storage: Best Practices, Joint Facilities User Forum on Data-Intensive Computing, June 18, 2014,

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)

IceCube Collaboration: M. G. Aartsen et al, "Energy Reconstruction Methods in the IceCube Neutrino Telescope", Journal of Instrumentation 9 P03009, March 2014,

M. G. Aartsen et al., "Search for non-relativistic Magnetic Monopoles with IceCube", European Physical Journal C, February 14, 2014, doi: 10.1140/epjc/s10052-014-2938-8

Nick Balthaser, NERSC; Lisa Gerhardt, NERSC, Introduction to NERSC Archival Storage: HPSS, February 3, 2014,

IceCube Collaboration: M. G. Aartsen et al, "Improvement in Fast Particle Track Reconstruction with Robust Statistics", Nuclear Instruments and Methods A736 143-149, February 2014,

2013

IceCube Collaboration: M. G. Aartsen et al, "Search for Time-Independent Neutrino Emission from Astrophysical Sources with 3 yr of IceCube Data", Astrophysical Journal 779 132, December 2013,

IceCube Collaboration: M. G. Aartsen et al, "Probing the Origin of Cosmic Rays with Extremely High Energy Neutrinos Using the IceCube Observatory", Physical Review D88 112008, December 2013,

IceCube Collaboration: M. G. Aartsen et al, "An IceCube Search for Dark Matter Annihilation in Nearby Galaxies and Galaxy Clusters", Physical Review D88 122001, December 2013,

IceCube Collaboration: M. G. Aartsen et al, "Evidence for High-Energy Extraterrestrial Neutrinos at the IceCube Detector", Science 342 1242856, November 2013,

IceCube Collaboration: M. G. Aartsen et al, "South Pole Glacial Climate Reconstruction from Multi-Borehole Laser Particulate Stratigraphy", Journal of Glaciology 59 1117-1128, October 2013,

IceCube Collaboration: M. G. Aartsen et al, "Measurement of the Cosmic Ray Energy Spectrum with IceTop-73", Physical Review D88 042004, August 28, 2013,

IceCube Collaboration: R. Abbasi et al, "Measurement of Atmospheric Neutrino Oscillations with IceCube", Physical Review Letters 111 081801, August 2013,

IceCube Collaboration: R. Abbasi et al, "First Observation of PeV-energy Neutrinos with IceCube", Physical Review Letters 111 021103, July 2013,

IceCube Collaboration: M. G. Aartsen et al., "Measurement of South Pole Ice Transparency with the IceCube LED Calibration System", Nuclear Instruments and Methods A711 73-89, May 2013,

Kevin Gott

2015

This research investigates thermal conductivity properties of Cu/Zr alloys combined with diamond particles to form a composite that possesses superior thermal conductivity. This article describes the use of a theoretical calculation and finite element analysis to compare to previously published experimental observations. Both theoretical calculations and finite element analysis indicate that experimental results cannot be explained solely by an improved interface between the matrix and diamond particles, as originally suggested. This study shows that the experimental results, theoretical calculations, and finite element analysis are in agreement when the thermal conductivity of the matrix is adjusted to compensate for the amount of zirconium lost to the interface. This finding can be used to predict the thermal conductivity of a composite material composed of a Cu/Zr matrix with diamond particles.
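
The paper's full model is not reproduced in this archive. As a rough illustration of how such composite conductivities are commonly estimated (this is the classical Maxwell effective-medium relation, not a formula taken from the paper), for spherical particles of conductivity $k_p$ occupying volume fraction $f$ in a matrix of conductivity $k_m$:

    $$ k_{\mathrm{eff}} = k_m \, \frac{k_p + 2k_m + 2f\,(k_p - k_m)}{k_p + 2k_m - f\,(k_p - k_m)} $$

Lowering the effective matrix conductivity $k_m$ to account for zirconium lost to the interface, as the abstract describes, reduces the predicted $k_{\mathrm{eff}}$ accordingly.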

The fabrication, testing and modeling of thermally annealed pyrolytic graphite (TPG) encapsulated heat spreaders was explored for potential use in the cooling of microelectronic devices. The 60 mm diameter, 5 mm thick heat spreaders were created using field-assisted sintering technology (FAST). The TPG encapsulated heat spreaders were compared to their simple aluminum and copper versions through both experimental measurements and numerical calculations. The results show that TPG encapsulated heat spreaders yield lower and more uniform surface temperatures when exposed to identical heating conditions. Heat spreaders such as these should be considered for cooling the next generation of high power density microelectronic devices.

2013

Kevin Gott, Anil Kulkarni, Jogender Singh, "A Comparison of Continuum, DSMC and Free Molecular Modeling Techniques for Physical Vapor Deposition", 2013 ASME International Mechanical Engineering Congress and Exposition, 2013, IMECE2013-66433, doi: 10.1115/IMECE2013-66433

Advanced Physical Vapor Deposition (PVD) techniques are available that produce thin-film coatings with adaptive nano-structure and nano-chemistry. However, such components are manufactured through trial-and-error methods or in repeated small increments due to a lack of adequate knowledge of the underlying physics. Successful computational modeling of PVD technologies would allow coatings to be designed before fabrication, substantially improving manufacturing potential and efficiency.

Previous PVD modeling efforts have utilized three different physical models depending on the expected manufacturing pressure: continuum mechanics for high-pressure flows, Direct Simulation Monte Carlo (DSMC) modeling for intermediate-pressure flows, or free-molecular (FM) dynamics for low-pressure flows. However, preliminary calculations of the evaporation process have shown that a multi-physics fluidic solver including all three models may be required to accurately simulate PVD coating processes. This is because the high vacuum and the intermolecular forces present in metal vapors cause a dense continuum region to form immediately after evaporation, which expands into a rarefied region before depositing on the target surface.

This paper seeks to understand the effect flow regime selection has on the predicted deposition profile of PVD processes. The model is based on experiments performed at the Electron-Beam PVD (EB-PVD) laboratory at the Applied Research Lab at Penn State. CFD, DSMC and FM models are separately used to simulate a coating process and the deposition profiles are compared. The mass deposition rates and overall flow fields of each model are compared to determine if the underlying physics significantly alter the predicted coating profile. Conclusions are drawn on the appropriate selection of fluid physics for future PVD simulations.
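
The regime selection discussed above is conventionally made with the Knudsen number; the thresholds below are the commonly cited textbook values, not ones taken from this paper:

    $$ \mathrm{Kn} = \frac{\lambda}{L}, \qquad \mathrm{Kn} \lesssim 0.01:\ \text{continuum (CFD)}, \qquad 0.01 \lesssim \mathrm{Kn} \lesssim 10:\ \text{transitional (DSMC)}, \qquad \mathrm{Kn} \gtrsim 10:\ \text{free molecular} $$

Here $\lambda$ is the molecular mean free path and $L$ a characteristic flow length. Because $\lambda$ grows as the vapor expands into the vacuum chamber, $\mathrm{Kn}$ increases along the plume, which is why a single-regime solver struggles to cover the whole domain.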

2011

Kevin Gott, Anil Kulkarni, Jogender Singh, "Construction and Application of Unified Flow Solver for use with Physical Vapor Deposition (PVD) Modeling", 6th OpenFOAM Workshop, June 13, 2011,

Kevin Gott, Anil Kulkarni, Jogender Singh, Multi-Regime Computational Flow Modeling of Vapor Transport Mechanism of Physical Vapor Deposition (PVD) Coating Manufacturing Processes, Penn State College of Engineering Research Symposium (CERS) 2011, 2011,

2010

Kevin Gott, Anil Kulkarni, Jogender Singh, "The Effect of Flow Regime Selection on Physical Vapor Deposition Flow Modeling", Penn State Graduate Student Exhibition, 2010,

Kevin Gott, Anil Kulkarni, Jogender Singh, "A New Near-Saturated Equation of State for Titanium Vapor for use in Models Simulating Physical Vapor Deposition (PVD) Processes", 20th National and 9th International ISHMT-ASME Heat and Mass Transfer Conference, 2010,

Standard physical vapor deposition models are analyzed to determine if any of the basic assumptions fail to accurately describe the flow field. Interestingly, the most basic assumption, ideal gas behavior, appears to incorrectly convey the physics of PVD fabrication. Even though at first glance the ideal gas approximation seems reasonable given the low-pressure/high-temperature condition of the flow, recent research into the thermodynamic properties of titanium vapor indicates very different behavior. Calculation of the compressibility factors required to fit the thermodynamic data to other common equations of state, such as the Van der Waals, Dieterici, Berthelot, and Redlich-Kwong equations, also showed unexpected behavior. Therefore, a new equation of state is suggested in this paper to more accurately describe titanium vapor and other similar vaporized metals near their saturated state. Properties calculated from this equation of state match the available thermodynamic data well.
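
For reference, the compressibility factor that quantifies departure from ideality, and the Van der Waals form named above, are (standard textbook definitions, not the paper's new equation of state):

    $$ Z = \frac{P\bar{v}}{RT}, \qquad \left(P + \frac{a}{\bar{v}^2}\right)\left(\bar{v} - b\right) = RT $$

An ideal gas has $Z = 1$ exactly; the abstract reports that the factors required to fit near-saturated titanium vapor data deviate from this in unexpected ways, which motivates the new equation of state.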

2009

Kevin Gott, Anil Kulkarni, Jogender Singh, "A Combined Rarefied and Continuum Flow Regime Model for Physical Vapor Deposition (PVD) Manufacturing Processes", Proceedings of the ASME International Mechanical Engineering Congress and Exposition 2009, 2009, 15-21,

Several modifications to physical vapor deposition (PVD) models are proposed to address the deficiencies in current theoretical studies. Simple calculations show that the flow regime of PVD fabrication will most likely vary from a continuum flow to a rarefied flow in the vacuum chamber as the vapor cloud expands toward the substrate. The flow regime for an evaporated ideal gas is calculated, and then an improved equation of state is constructed and analyzed that more accurately describes vaporized metals. The result, combined with experimental observations, suggests PVD fabrication is best represented by a multi-regime flow. Then, a CFD analysis is summarized that further validates the multi-regime hypothesis. Finally, a methodology for constructing and implementing the results of a theoretical multi-regime PVD model is presented.

A D Barrett, K N Gott, J M Barrett, D J Barrett, D T Rusk, "Sensitivity of host-seeking nymphal lone star ticks (Acari: Ixodidae) to immersion in heated water", Journal of Medical Entomology, 2009, 46(5):1240-1243, doi: 10.1603/033.046.0537

Host-seeking nymphal Amblyomma americanum (L.) (Acari: Ixodidae) were placed into heated water, and their survival or their torpidity was recorded as a function of exposure time. Exposures were determined that either kill the nymphs or affect their mobility. All nymphs died when exposed for a minute or more to a temperature > 51 degrees C. Nearly all nymphs remained motionless for a period of time when exposed for 3 min to a temperature > 44 degrees C.

Annette M. Greiner

2016

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

2015

Annette Greiner, Trent Northen, Suzanne Kosina, Richard Baran, Benjamin Bowen, Stefan Jenkins, Tami Swenson, "Seeing the Web of Microbes", Visualization in Data Science (VDS) at IEEE Vis 2015, October 26, 2015,

Prabhat, Kris Bouchard, Annette Greiner, Oliver Ruebel, Peter Denes, Alex Bujan, Sean Mackesey, Jesse Livezey, Jeff Teeters, Fritz Sommer, Eddie Chang, "Supporting Experimental Neuroscience @ NERSC", MSRI workshop on Neural Computation, October 7, 2015,

"BigNeuron", Prabhat, Kris Bouchard, Shreyas Cholia, Annette Greiner, NERSC Science Highlight, March 31, 2015,

2013

Oliver Rübel, Annette Greiner, Shreyas Cholia, Katherine Louie, E. Wes Bethel, Trent R. Northen, Benjamin P. Bowen, "OpenMSI: A High-Performance Web-Based Platform for Mass Spectrometry Imaging", Analytical Chemistry, 2013, 85 (21), pp 10354–10361, October 2, 2013, doi: 10.1021/ac402540a

Taylor Groves

2016

Taylor Groves, "Characterizing and Improving Power and Performance in HPC Networks (Doctoral Showcase)", Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 1, 2016,

Matthew G. F. Dosanjh, Taylor Groves, Ryan E. Grant, Ron Brightwell, Patrick G. Bridges, "RMA-MT: a benchmark suite for assessing MPI multi-threaded RMA performance", 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), IEEE, September 1, 2016, 550-559,

Taylor Groves, Improving Power and Performance in HPC Networks, AMD Research - Austin, June 10, 2016,

Taylor Groves, Ryan Grant, Dorian Arnold, "Network-induced Memory Contention", Salishan Conference on High Speed Computing, Gleneden Beach, OR, April 1, 2016,

Taylor Groves, Ryan E. Grant, Scott Hemmert, Simon Hammond, Michael Levenhagen, Dorian C. Arnold, "(SAI) Stalled, Active and Idle: Characterizing Power and Performance of Large-Scale Dragonfly Networks", 2016 IEEE International Conference on Cluster Computing (CLUSTER), January 1, 2016, 50-59,

Taylor Groves, Ryan E. Grant, Dorian Arnold, "NiMC: Characterizing and eliminating network-induced memory contention", 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), January 1, 2016, 253-262,

2015

Taylor Groves, Samuel K. Gutierrez, Dorian Arnold, "A LogP Extension for Modeling Tree Aggregation Networks", 2015 IEEE International Conference on Cluster Computing (CLUSTER), 2015, 666-673,

Taylor Groves, Ryan Grant, "Power Aware, Dynamic Provisioning of HPC Networks", Sandia National Labs report, 2015,

2014

Taylor Groves, Kurt B. Ferreira, "Balancing Power and Time of MPI Operations", CCR, 2014,

2013

Taylor Groves, Dorian Arnold, Yihua He, "In-network, Push-based Network Resource Monitoring: Scalable, Responsive Network Management", Proceedings of the Third International Workshop on Network-Aware Data Management, 2013, 8,

Joshua D. Goehner, Taylor L. Groves, Dorian C. Arnold, Dong H. Ahn, Gregory L. Lee, "An Optimal Algorithm for Extreme Scale Job Launching", 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 2013, 1115-1122,

2009

Taylor Groves, Jeff Knockel, Eric Schulte, "BFS vs CFS scheduler comparison", 2009,

Xiao Chen, Jian Shen, Taylor Groves, Jie Wu, "Probability Delegation Forwarding in Delay Tolerant Networks", Proceedings of the 18th International Conference on Computer Communications and Networks (ICCCN 2009), IEEE, January 1, 2009,

Chris Harris

2015

Chris Harris, Patrick O'Leary, Michael Grauer, Aashish Chaudhary, Chris Kotfila, Robert O'Bara, "Dynamic Provisioning and Execution of HPC Workflows Using Python", 2016 6th Workshop on Python for High-Performance and Scientific Computing (PyHPC), 2015, 1-8, doi: 10.1109/PyHPC.2016.005

Marcus D. Hanwell, Wibe A. de Jong, Christopher J. Harris, "Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application", Journal of Cheminformatics, 2015, 9:55, doi: 10.1186/s13321-017-0241-z

Patrick O'Leary, Mark Christon, Sebastien Jourdain, Chris Harris, Markus Berndt, Andrew Bauer, "HPCCloud: A Cloud/Web-Based Simulation Environment", 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), 2015, 25-33, doi: 10.1109/CloudCom.2015.33

Rebecca Hartman-Baker

2016

Robert J. Harrison, Gregory Beylkin, Florian A. Bischoff, Justus A. Calvin, George I. Fann, Jacob Fosso-Tande, Diego Galindo, Jeff R. Hammond, Rebecca Hartman-Baker, Judith C. Hill, Jun Jia, Jakob S. Kottmann, M-J. Yvonne Ou, Laura E. Ratcliff, Matthew G. Reuter, Adam C. Richie-Halford, Nichols A. Romero, Hideo Sekino, William A. Shelton, Bryan E. Sundahl, W. Scott Thornton, Edward F. Valeev, Álvaro Vázquez-Mayagoitia, Nicholas Vence, Yukina Yokoi, "MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation", SIAM Journal on Scientific Computing, October 27, 2016, 38:S123-S142,

Rebecca J. Hartman-Baker, Past, Present, and Future Parallel Programming Paradigms, March 24, 2016,

2014

Rebecca J. Hartman-Baker, Daniel J. Grimwood, Valerie Maxville, "Evaluating Parallel Programming Tools to Support Code Development for Accelerators", Procedia Computer Science, 2014, 2076-2079,

2011

Rebecca J. Hartman-Baker, Hai Ah Nam, "Optimizing Nuclear Physics Codes on the XT5", Proceedings of CUG 2011, 2011,

Damian Hazen

2012

Damian Hazen, Jason Hick, "MIR Performance Analysis", June 12, 2012, LBNL LBNL-5896E,

We provide analysis of Oracle StorageTek T10000 Generation B (T10KB) Media Information Record (MIR) Performance Data gathered over the course of a year from our production High Performance Storage System (HPSS). The analysis shows information in the MIR may be used to improve tape subsystem operations. Most notably, we found the MIR information to be helpful in determining whether the drive or tape was most suspect given a read or write error, and for helping identify which tapes should not be reused given their history of read or write errors. We also explored using the MIR Assisted Search to order file retrieval requests. We found that MIR Assisted Search may be used to reduce the time needed to retrieve collections of files from a tape volume.

2011

N. Balthaser, D. Hazen, "HSI Best Practices for NERSC Users", May 2, 2011, LBNL 4745E,

In this paper we explain how to obtain and install HSI, create a NERSC authentication token, and transfer data to and from the system. Additionally we describe methods to optimize data transfers and avoid common pitfalls that can degrade data transfers and storage system performance.

D. Hazen, J. Hick, W. Hurlbert, M. Welcome, Media Information Record (MIR) Analysis, LTUG 2011, April 19, 2011,

Presentation of Storage Systems Group findings from a year-long effort to collect and analyze Media Information Record (MIR) statistics from our in-production Oracle enterprise tape drives at NERSC. We provide information on the data collected and some highlights from our analysis. The presentation is primarily intended to show that the information in the MIR is valuable to users and customers for better operating and managing their tape environments.

2010

D. Hazen, J. Hick, HPSS v8 Metadata Conversion, HPSS 8.1 Pre-Design Meeting, April 7, 2010,

Provided information about the HPSS metadata conversion software to other developers of HPSS. This input was important in establishing a design for the version 8 HPSS metadata conversions.

Yun Helen He

2016

Yun (Helen) He, CESM MG2/HOMME, NESAP Hackathon Meeting at NERSC, Berkeley, CA., November 29, 2016,

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, OpenMP Parallelization and Optimization of Graph-based Machine Learning Algorithms, IWOMP 2016, October 6, 2016,

Yun (Helen) He, Process and Thread Affinity with MPI/OpenMP on KNL, Intel Xeon Phi User Group (IXPUG) 2016 Annual US Meeting, September 22, 2016,

IXPUG2016 event web page: https://www.ixpug.org/events/ixpug-2016

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, "OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms", Lecture Notes in Computer Science, Springer, 2016, 9903:17-31, doi: 10.1007/978-3-319-45550-1_2

Yun (Helen) He, NERSC Early KNL Experiences, NCAR Multi-core 6 Workshop, Boulder, CO., September 13, 2016,

Multi-Core Workshop event web page: https://www2.cisl.ucar.edu/events/workshops/multicore-workshop/2016/2016-agenda

Jeremy Kemp, Alice Koniges, Yun (Helen) He, and Barbara Chapman, "Advanced Programming Model Constructs Using Tasking on the Latest NERSC (Knights Landing) Hardware", CS Summer Student Poster Session, August 4, 2016,

Ahana Roy Choudhury, Yun (Helen) He, and Alice Koniges, "Advanced OpenMP Constructs, Tuning, and Tools at NERSC", CS Summer Student Poster Session, August 4, 2016,

Yun (Helen) He, Running Jobs on Cori with SLURM, Cori Phase 1 Training, Berkeley, CA, June 14, 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Douglas M. Jacobsen, James F. Botts, and Yun (Helen) He, "SLURM. Our Way.", Cray User Group Meeting 2016, London, England, May 2016,

Douglas M. Jacobsen, James F. Botts, and Yun (Helen) He, SLURM. Our Way., Cray User Group Meeting 2016. London, England., May 12, 2016,

Ashley Barker, Chris Fuson, Richard Gerber, Yun (Helen) He, Frank Indiviglio, Best Practices for Managing HPC User Documentation and Communication, Cray User Group Meeting 2016, London, England, May 10, 2016,

Yun (Helen) He, Wahid Bhimji, Cori: User Update, NERSC User Group Meeting, March 24, 2016,

Yun (Helen) He, Advanced OpenMP and CESM Case Study, NERSC User Group Annual Meeting 2016, Berkeley, CA, March 23, 2016,

Yun (Helen) He, Submitting and Running Jobs, NERSC User Group Meeting 2016, Berkeley, CA, March 21, 2016,

Yun (Helen) He, Climate Applications Support at NERSC, NERSC Climate PIs Telecon, March 16, 2016,

Yun (Helen) He, Cori: User Services Report, NERSC/Cray Quarterly Meeting, February 10, 2016,

Yun (Helen) He, Cori and Edison Queues, NERSC User Group (NUG) Telecon, January 21, 2016,

2015

Yun (Helen) He, NERSC Systems Update, NERSC Climate PIs Telecon, December 4, 2015,

Yun (Helen) He, NERSC Climate Applications, NERSC Climate PIs Telecon, December 4, 2015,

Yun (Helen) He, SLURM Resource Manager is Coming to NERSC, NERSC User Group (NUG) Telecon, November 6, 2015,

Yun (Helen) He, CCE/8.4.0 Beta Feedback from NERSC Users, NERSC/Cray Quarterly Meeting, October 20, 2015,

Yun (Helen) He, Nested OpenMP, NERSC User Group (NUG) Telecon, October 8, 2015,

Yun (Helen) He, Alice Koniges, Richard Gerber, Katie Antypas, Using OpenMP at NERSC, OpenMPCon 2015, invited talk, September 30, 2015,

Alice Koniges, Tim Mattson, Yun (Helen) He, Richard Gerber, Enabling Application Portability across HPC Platforms: An Application Perspective, OpenMPCon 2015, invited talk, September 29, 2015,

Yun (Helen) He, Lessons Learned from Selected NESAP Applications, NCAR Multi-Core 5 Workshop 2015, September 16, 2015,

Yun (Helen) He and CESM MG2 Team, NESAP CESM MG2 Update, NERSC/Cray Quarterly Meeting, July 22, 2015,

Yun (Helen) He and XGC1 Team., NESAP XGC1 Dungeon Update, NERSC/Cray Quarterly Meeting, July 22, 2015,

Yun (Helen) He, OpenMP Basics and MPI/OpenMP Scaling, Tutorial presented to LBNL Computational Research Division postdocs, March 23, 2015,

2014

Yun (Helen) He, Explore MPI/OpenMP Scaling on NERSC Systems, NERSC OpenMP and Vectorization Training, October 28, 2014,

Yun (Helen) He and Nick Cardo, Babbage: the MIC Testbed System at NERSC, NERSC Brown Bag, Oakland, CA, April 3, 2014,

Yun (Helen) He, Performance Analysis Tools and Cray Reveal, NERSC User Group Meeting, Oakland, CA, February 3, 2014,

2013

A. Koniges, R. Gerber, D. Skinner, Y. Yao, Y. He, D. Grote, J-L Vay, H. Kaiser, and T. Sterling, "Plasma Physics Simulations on Next Generation Platforms", 55th Annual Meeting of the APS Division of Plasma Physics, Volume 58, Number 16, November 11, 2013,

The current high-performance computing revolution provides opportunity for major increases in computational power over the next several years, if it can be harnessed. The transition from simply increasing single-processor and network performance to different architectural paradigms forces application programmers to rethink the basic models of parallel programming from both the language and problem-division standpoints. One of the major computing facilities available to researchers in fusion energy is the National Energy Research Scientific Computing Center. As the mission computing center for the DOE Office of Science, NERSC is tasked with helping users overcome the challenges of this revolution, both through the use of new parallel constructs and languages and by enabling a broader user community to take advantage of multi-core performance. We discuss the programming model challenges facing researchers in fusion and plasma physics for a variety of simulations ranging from particle-in-cell to fluid-gyrokinetic and MHD models.

Yun (Helen) He, Adding OpenMP to Your Code Using Cray Reveal, NERSC Performance on Edison Training Event, Oakland, CA, October 10, 2013,

Yun (Helen) He, Using the Cray perftools-lite Performance Measurement Tool, NERSC Performance on Edison Training Event, Oakland, CA, October 10, 2013,

Yun (Helen) He, Programming Environments, Applications, and Documentation SIG, Cray User Group 2013, Napa Valley, CA., May 6, 2013,

Yun (Helen) He, Hybrid MPI/OpenMP Programming, NERSC User Group Meeting 2012, Oakland, CA, February 15, 2013,

Suren Byna, Andrew Uselton, Prabhat, David Knaak, Helen He, "Trillion Particles, 120,000 cores, and 350 TBs: Lessons Learned from a Hero I/O Run on Hopper", Cray User Group Meeting, Best Paper Award, 2013,

2012

Zhengji Zhao, Yun (Helen) He and Katie Antypas, "Cray Cluster Compatibility Mode on Hopper", A paper presented at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 1, 2012,

Zhengji Zhao, Yun (Helen) He and Katie Antypas, Cray Cluster Compatibility Mode on Hopper, A talk at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 1, 2012,

Yun (Helen) He and Katie Antypas, "Running Large Jobs on a Cray XE6 System", Cray User Group 2012 Meeting, Stuttgart, Germany, April 30, 2012,

Yun (Helen) He, Programming Environments, Applications, and Documentation SIG, Cray User Group 2012, April 30, 2012,

Zhengji Zhao and Helen He, Using Cray Cluster Compatibility Mode on Hopper, A talk at the NERSC User Group meeting, Oakland, CA, February 2, 2012,

Yun (Helen) He and Woo-Sun Yang, Using Hybrid MPI/OpenMP, UPC, and CAF at NERSC, NERSC User Group Meeting 2012, Oakland, CA, February 2, 2012,

2011

Zhengji Zhao and Helen He, Cray Cluster Compatibility Mode on Hopper, A Brown Bag Lunch talk at NERSC, Oakland, CA, December 8, 2011,

Helen He, Huge Page Related Issues with N6 Benchmarks on Hopper, NERSC/Cray Quarterly Meeting, October 26, 2011,

Yun (Helen) He and Katie Antypas, Mysterious Error Messages on Hopper, NERSC/Cray Quarterly Meeting, July 25, 2011,

K. Antypas, Y. He, "Transitioning Users from the Franklin XT4 System to the Hopper XE6 System", Cray User Group 2011 Proceedings, Fairbanks, Alaska, May 2011,

The Hopper XE6 system, NERSC's first petaflop system with over 153,000 cores, has increased the computing hours available to the Department of Energy's Office of Science users by more than a factor of 4. As NERSC users transition from the Franklin XT4 system with 4 cores per node to the Hopper XE6 system with 24 cores per node, they have had to adapt to a lower amount of memory per core and to on-node I/O performance that does not scale up linearly with the number of cores per node. This paper discusses Hopper's usage during the "early user period" and examines the practical implications of running on a system with 24 cores per node, exploring advanced aprun and memory affinity options for typical NERSC applications as well as strategies to improve I/O performance.
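
The paper's specific aprun and affinity settings are not reproduced in this listing. As one example of the kind of memory-affinity technique that matters on a 24-core NUMA node such as Hopper's, the sketch below (illustrative C/OpenMP code, not taken from the paper) uses first-touch initialization so that pages are allocated on the NUMA node of the thread that will later use them:

    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N 10000000L

    int main(void)
    {
        double *a = malloc(N * sizeof(double));
        double *b = malloc(N * sizeof(double));

        /* First touch: each thread initializes the pages it will later use,
           so Linux places them on that thread's local NUMA node. */
        #pragma omp parallel for schedule(static)
        for (long i = 0; i < N; i++) {
            a[i] = 0.0;
            b[i] = (double)i;
        }

        /* Same static schedule: each thread now streams mostly local memory. */
        #pragma omp parallel for schedule(static)
        for (long i = 0; i < N; i++)
            a[i] = 2.0 * b[i];

        printf("a[N-1] = %f (threads: %d)\n", a[N-1], omp_get_max_threads());
        free(a);
        free(b);
        return 0;
    }

Binding threads to cores (for example with aprun's CPU-affinity options) keeps this placement effective for the life of the run.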

P. M. Stewart, Y. He, "Benchmark Performance of Different Compilers on a Cray XE6", Fairbanks, AK, CUG Proceedings, May 23, 2011,

There are four different supported compilers on NERSC's recently acquired XE6, Hopper. Our users often request guidance from us in determining which compiler is best for a particular application. In this paper, we will describe the comparative performance of different compilers on several MPI benchmarks with different characteristics. For each compiler and benchmark, we will establish the best set of optimization arguments to the compiler.

Yun (Helen) He, Programming Environments, Applications, and Documentation SIG, Cray User Group Meeting 2011, Fairbanks, AK, May 23, 2011,

Michael Stewart, Yun (Helen) He*, Benchmark Performance of Different Compilers on a Cray XE6, Cray User Group 2011, May 2011,

Katie Antypas, Yun (Helen) He*, Transitioning Users from the Franklin XT4 System to the Hopper XE6 System, Cray User Group 2011, Fairbanks, AK, May 2011,

Yun (Helen) He, Introduction to OpenMP, Using the Cray XE6 Workshop, NERSC., February 7, 2011,

2010

Yun (Helen) He, Introduction to OpenMP, NERSC User Group 2010 Meeting, Oakland, CA, October 18, 2010,

Wendy Hwa-Chun Lin, Yun (Helen) He, and Woo-Sun Yang, "Franklin Job Completion Analysis", Cray User Group 2010 Proceedings, Edinburgh, UK, May 2010,

The NERSC Cray XT4 machine Franklin has been in production for 3,000+ users since October 2007, with about 1,800 jobs running each day. There has been an ongoing effort to better understand how well these jobs run, whether failed jobs are due to application errors or system issues, and to further reduce system-related job failures. In this paper, we discuss the progress we have made in tracking job completion status, identifying the root causes of job failures, and expediting resolution of job failures, such as hung jobs, that are caused by system issues. In addition, we present some Cray software design enhancements we requested to help us track application progress and identify errors.

Yun (Helen) He, User Services SIG (Special Interest Group), Cray User Group Meeting 2010, Edinburgh, UK, May 24, 2010,

Yun (Helen) He, Wendy Hwa-Chun Lin, and Woo-Sun Yang, Franklin Job Completion Analysis, Cray User Group Meeting 2010, May 2010,

2009

Yun (Helen) He, "User and Performance Impacts from Franklin Upgrades", Cray User Group Meeting 2009, Atlanta, GA, May 2009, LBNL 2013E,

The NERSC flagship computer, the Cray XT4 system "Franklin", has gone through three major upgrades during the past year: a quad-core upgrade, a CLE 2.1 upgrade, and an I/O upgrade. In this paper, we discuss various aspects of the user impact of these upgrades, such as user access, user environment, and user issues. The performance impact on the kernel benchmarks and selected application benchmarks is also presented.

Yun (Helen) He, User and Performance Impacts from Franklin Upgrades, Cray User Group Meeting 2009, May 4, 2009,

James M. Craw, Nicholas P. Cardo, Yun (Helen) He, and Janet M. Lebens, "Post-Mortem of the NERSC Franklin XT Upgrade to CLE 2.1", Cray User Group Meeting 2009, Atlanta, GA, May 2009,

This paper discusses the lessons learned from the events leading up to the production deployment of CLE 2.1 and the post-install issues experienced in upgrading NERSC's XT4 system, Franklin.

James M. Craw, Nicholas P. Cardo, Yun (Helen) He, and Janet M. Lebens, Post-Mortem of the NERSC Franklin XT Upgrade to CLE 2.1, Cray User Group Meeting, May 2009,

Helen He, Job Completion on Franklin, NERSC/Cray Quarterly Meeting, April 2009,

Helen He, CrayPort Desired Features, NERSC/Cray Quarterly Meeting, April 2009,

2008

Yun (Helen) He, Franklin Quad Core Update/Differences, NERSC User Group Meeting 2008, October 2008,

Yun (Helen) He, William T.C. Kramer, Jonathan Carter, and Nicholas Cardo, Franklin: User Experiences, Cray User Group Meeting 2008, May 5, 2008,

Yun (Helen) He, William T.C. Kramer, Jonathan Carter, and Nicholas Cardo, "Franklin: User Experiences", Cray User Group Meeting 2008, May 4, 2008, LBNL 2014E,

The newest workhorse of the National Energy Research Scientific Computing Center is a Cray XT4 with 9,736 dual-core nodes. This paper summarizes Franklin user experiences from the friendly early user period through the production period. Selected successful user stories, along with the top issues affecting user experiences, are presented.

2007

Yun (Helen) He, "Franklin Early User Report", December 2007,

Helen He, Franklin Overview, NERSC User Group Meeting 2007, September 2007,

Jonathan Carter, Yun (Helen) He, John Shalf, Hongzhang Shan, Erich Strohmaier, and Harvey Wasserman, "The Performance Effect of Multi-Core on Scientific Applications", Cray User Group 2007, May 2007, LBNL 62662,

The historical trend of increasing single-CPU performance has given way to a roadmap of increasing core counts. The challenge of effectively utilizing these multi-core chips is just starting to be explored by vendors and application developers alike. In this study, we present performance measurements of several complete scientific applications on single- and dual-core Cray XT3 and XT4 systems, with a view to characterizing the effects of switching to multi-core chips. We consider effects within a node by using applications run at low concurrencies, and also effects on node-interconnect interaction using higher-concurrency results. Finally, we construct a simple performance model based on the principal on-chip shared resource, memory bandwidth, and use this to predict the performance of the forthcoming quad-core system.
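
The paper's model itself is not reproduced here; a minimal sketch in the same spirit (my notation, under a simple bandwidth-contention assumption): if one core completes a fixed problem in time $t_1$ while demanding memory bandwidth $b$, then running the same problem on each of $c$ cores sharing chip bandwidth $B$ slows each core by roughly

    $$ t_c \approx t_1 \cdot \max\!\left(1,\ \frac{c\,b}{B}\right) $$

so per-core performance degrades once the aggregate demand $c\,b$ exceeds $B$, which is the effect measured in the single- to dual-core comparison and extrapolated to the quad-core system.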

Jonathan Carter, Helen He*, John Shalf, Erich Strohmaier, Hongzhang Shan, and Harvey Wasserman, The Performance Effect of Multi-Core on Scientific Applications, Cray User Group 2007, May 2007,

J. Levesque, J. Larkin, M. Foster, J. Glenski, G. Geissler, S. Whalen, B. Waldecker, J. Carter, D. Skinner, H. He, H. Wasserman, J. Shalf, H. Shan, "Understanding and mitigating multicore performance issues on the AMD Opteron architecture", March 1, 2007, LBNL 62500,

Over the past 15 years, microprocessor performance has doubled approximately every 18 months through increased clock rates and processing efficiency. In the past few years, clock frequency growth has stalled, and microprocessor manufacturers such as AMD have moved toward doubling the number of cores every 18 months in order to maintain historical growth rates in chip performance. This document investigates the ramifications of multicore processor technology on the new Cray XT4 systems based on AMD processor technology. We begin by walking through the AMD single-core, dual-core, and upcoming quad-core processor architectures. This is followed by a discussion of methods for collecting performance counter data to understand code performance on the Cray XT3 and XT4 systems. We then use the performance counter data to analyze the impact of multicore processors on the performance of microbenchmarks such as STREAM, application kernels such as the NAS Parallel Benchmarks, and full application codes that comprise the NERSC-5 SSP benchmark suite. We explore compiler options and software optimization techniques that can mitigate the memory bandwidth contention that can reduce computing efficiency on multicore processors. The last section provides a case study of applying the dual-core optimizations to the NAS Parallel Benchmarks to dramatically improve their performance.

2006

Yun (Helen) He and Chris Ding, "Concurrent Single Executable CCSM with MPH Library", LBNL Report, May 2006,

Y. He, C. Ding, M. Vertenstein, N. Norton, B. Kauffman, A. Craig, and J. Wolfe, "Concurrent Single-Executable CCSM with MPH Library", U.S. Department of Energy Climate Change Prediction Program (CCPP) Science Team Meeting, April 2006,

C. Covey, I. Fung, Y. He, F. Hoffman, and J. John, "Diagnosis and Intercomparison of Climate Models with Interactive Biochemistry", U.S. Department of Energy Climate Change Prediction Program (CCPP) Science Team Meeting, April 2006,

F. Hoffman, I. Fung, J. John, J. Randerson, P. Thornton, J. Foley, N. Mahowald, K. Lindsay, M. Vertenstein, C. Covey, Y. He, W. Post, D. Erickson, and the CCSM Biogeochemistry Working Group, "Terrestrial Biogeochemistry Intercomparison Experiments", U.S. Department of Energy Climate Change Prediction Program (CCPP) Science Team Meeting, April 2006,

Yun He and Chris Ding, MPH: a Library for Coupling Multi-Component Models on Distributed Memory Architectures and its Applications, The 8th International Workshop on Next Generation Climate Models for Advanced High Performance Computing Facilities, February 23, 2006,

Yu-Heng Tseng, Chris Ding, Yun He*, Efficient parallel I/O with ZioLib in Community Atmosphere Model (CAM), The 8th International Workshop on Next Generation Climate Models for Advanced High Performance Computing Facilities, February 2006,

Yun He, Status of Single-Executable CCSM Development, CCSM Software Engineering Working Group Meeting, January 25, 2006,

2005

Y. He and C. H.Q. Ding, "Automatic Multi-Instance Simulations of an Existing Climate Program", Berkeley Atmospheric Sciences Center, Fifth Annual Symposium, October 14, 2005,

Yun He, Chris H.Q. Ding, "Coupling Multi-Component Models with MPH on Distributed Memory Computer Architectures", International Journal of High Performance Computing Applications, August 2005, 19:329-340,

A growing trend in developing large and complex applications on today’s Teraflop scale computers is to integrate stand-alone and/or semi-independent program components into a comprehensive simulation package. One example is the Community Climate System Model which consists of atmosphere, ocean, land-surface and sea-ice components. Each component is semi-independent and has been developed at a different institution. We study how this multi-component, multi-executable application can run effectively on distributed memory architectures. For the first time, we clearly identify five effective execution modes and develop the MPH library to support application development utilizing these modes. MPH performs component-name registration, resource allocation and initial component handshaking in a flexible way.
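
MPH's actual Fortran interface is documented in its user's manual (see the 2003 entry below) and is not reproduced here. Under that caveat, the C sketch below shows the generic MPI pattern that underlies this kind of component handshaking: each rank registers a component name, and MPI_COMM_WORLD is split so that every component receives its own communicator. The component names and the rank-to-component assignment are illustrative only:

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    /* Map a component name to a color for MPI_Comm_split. In a real
       coupled model this would come from a registry file. */
    static int component_color(const char *name)
    {
        if (strcmp(name, "atmosphere") == 0) return 0;
        if (strcmp(name, "ocean") == 0) return 1;
        return 2; /* anything else, e.g., a coupler */
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int world_rank, world_size;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);

        /* Toy registration: the first half of the ranks run the
           atmosphere, the rest run the ocean. */
        const char *my_component =
            (world_rank < world_size / 2) ? "atmosphere" : "ocean";

        /* The handshake: all ranks collectively split MPI_COMM_WORLD,
           giving each component a private communicator. */
        MPI_Comm comp_comm;
        MPI_Comm_split(MPI_COMM_WORLD, component_color(my_component),
                       world_rank, &comp_comm);

        int comp_rank;
        MPI_Comm_rank(comp_comm, &comp_rank);
        printf("world rank %d -> %s rank %d\n",
               world_rank, my_component, comp_rank);

        MPI_Comm_free(&comp_comm);
        MPI_Finalize();
        return 0;
    }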

A.P. Craig, R.L. Jacob, B. Kauffman, T. Bettge, J. Larson, E. Ong, C. Ding, and Y. He, "CPL6: The New Extensible, High-Performance Parallel Coupler for the Community Climate System Model", International Journal of High Performance Computing Applications, August 2005, 19:309-327,

Coupled climate models are large, multiphysics applications designed to simulate the Earth's climate and predict the response of the climate to any changes in forcing or boundary conditions. The Community Climate System Model (CCSM) is a widely used state-of-the-art climate model that has released several versions to the climate community over the past ten years. Like many climate models, CCSM employs a coupler, a functional unit that coordinates the exchange of data between parts of the climate system such as the atmosphere and ocean. This paper describes the new coupler, cpl6, contained in the latest version of CCSM, CCSM3. Cpl6 introduces distributed-memory parallelism to the coupler, a class library for important coupler functions, and a standardized interface for component models. Cpl6 is implemented entirely in Fortran90 and uses the Model Coupling Toolkit as the base for most of its classes. Cpl6 gives improved performance over previous versions and scales well on multiple platforms.

Yun He, Status of Single-Executable CCSM Development, CCSM Software Engineering Working Group Meeting, March 15, 2005,

H.S. Cooley, W.J. Riley, M.S. Torn, and Y. He, "Impact of Agricultural Practice on Regional Climate in a Coupled Land Surface Mesoscale Model", Journal of Geophysical Research-Atmospheres, February 2005, Vol. 110, doi: 10.1029/2004JD005160

We applied a coupled climate (MM5) and land-surface (LSM1) model to examine the effects of early and late winter wheat harvest on regional climate in the Department of Energy Atmospheric Radiation Measurement (ARM) Climate Research Facility in the Southern Great Plains, where winter wheat accounts for 20% of the land area.

2004

Yun He, Status of Single-Executable CCSM Development, Climate Change Prediction Program (CCPP) Meeting, October 2004,

Yun He, MPH: a Library for Coupling Multi-Component Models on Distributed Memory Architectures and its Applications, Scientific Computing Seminar, Lawrence Berkeley National Laboratory, October 2004,

Chris Ding, Yun He, "Integrating Program Component Executables on Distributed Memory Architectures via MPH", Proceedings of International Parallel and Distributed Processing Symposium, April 2004,

Yun He and Chris H.Q. Ding, "MPI and OpenMP Paradigms on Cluster of SMP Architectures: The Vacancy Tracking Algorithm for Multi-dimensional Array Transposition", Journal of Parallel and Distributed Computing Practice, 2004, Issue 5,

We evaluate remapping multi-dimensional arrays on clusters of SMP architectures under OpenMP, MPI, and hybrid paradigms. The traditional method of multi-dimensional array transposition needs an auxiliary array of the same size and a copy-back stage. We recently developed an in-place method using vacancy tracking cycles. The vacancy tracking algorithm outperforms the traditional 2-array method as demonstrated by extensive comparisons. Performance of multi-threaded parallelism using OpenMP is first tested with different scheduling methods and different numbers of threads. Both methods are then parallelized using several parallel paradigms. At node level, pure OpenMP outperforms pure MPI by a factor of 2.76 for the vacancy tracking method. Across the entire cluster of SMP nodes, by carefully choosing thread numbers, the hybrid MPI/OpenMP implementation outperforms pure MPI by a factor of 3.79 for the traditional method and 4.44 for the vacancy tracking method, demonstrating the validity of the parallel paradigm of mixing MPI with OpenMP.

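The in-place idea is compact enough to sketch. For an nrows x ncols row-major array, the transpose permutation maps flat index i to (i * nrows) mod (nrows*ncols - 1), so elements can be moved along permutation cycles instead of through a same-size auxiliary array. The sketch below is a minimal serial version; the visited bitmap is a simplification of the paper's analytic cycle-start detection, and the tuned OpenMP/MPI variants are not shown.

    def inplace_transpose(buf, nrows, ncols):
        """Transpose an nrows x ncols row-major matrix stored in the
        flat list buf, in place, by following vacancy-tracking cycles."""
        n = nrows * ncols
        visited = [False] * n
        for start in range(1, n - 1):       # indices 0 and n-1 never move
            if visited[start]:
                continue
            i, carried = start, buf[start]
            while True:                     # walk one cycle of the permutation
                dest = (i * nrows) % (n - 1)
                carried, buf[dest] = buf[dest], carried
                visited[dest] = True
                i = dest
                if dest == start:
                    break

    a = list(range(6))          # 2x3 row-major: [[0, 1, 2], [3, 4, 5]]
    inplace_transpose(a, 2, 3)
    print(a)                    # [0, 3, 1, 4, 2, 5] -- 3x2 row-major
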
2003

Yun He and Chris Ding, "MPH: a Library for Coupling Multi-Component Models on Distributed Memory Architectures", SuperComputing 2003, November 2003,

W.J. Riley, H.S. Cooley, Y. He, and M.S. Torn, "Coupling MM5 with ISOLSM: Development, Testing, and Applications", Thirteenth PSU/NCAR Mesoscale Modeling System Users' Workshop, June 10, 2003, LBNL 53018,

W.J. Riley, H.S. Cooley, Y. He*, and M.S. Torn, Coupling MM5 with ISOLSM: Development, Testing, and Applications, Thirteenth PSU/NCAR Mesoscale Modeling System Users' Workshop, NCAR, June 2003,

Y. He and C. H.Q. Ding, "An Evaluation of MPI and OpenMP Paradigms for Multi-Dimensional Data Remapping", Lecture Notes in Computer Science, Vol 2716., edited by M.J. Voss, ( June 2003) Pages: 195-210

Y. He and C. Ding, "Multi-Program-Components Handshaking (MPH) Utility Version 4 User's Manual", May 2003, LBNL 50778,

Helen He, Hybrid MPI and OpenMP Programming on the SP, NERSC User Group (NUG) Meeting, Argonne National Lab, May 2003,

Helen He, Hybrid OpenMP and MPI Programming on the SP: Successes, Failures, and Results, NERSC User Training 2003, Lawrence Berkeley National Laboratory, March 2003,

2002

Yun He, Chris H.Q. Ding, "MPI and OpenMP paradigms on cluster of SMP architectures: the vacancy tracking algorithm for multi-dimensional array transposition", Proceedings of the 2002 ACM/IEEE conference on Supercomputing, November 2002,

Yun He, Chris H.Q. Ding, MPI and OpenMP Paradigms on Cluster of SMP Architectures: the Vacancy Tracking Algorithm for Multi-Dimensional Array Transpose, SuperComputing 2002, November 2002,

Chris Ding and Yun He, "Climate Modeling: Coupling Component Models by MPH for Distributed Multi-Component Environment", Proceedings of the Tenth Workshop on the Use of High Performance Computing in Meteorology, World Scientific Publishing Company, Incorporated, November 2002, 219-234,

C. H.Q. Ding and Y. He*, Effective Methods in Reducing Communication Overheads in Solving PDE Problems on Distributed-Memory Computer Architectures, Grace Hopper Celebration of Women in Computing 2002, October 2002,

Yun He, Chris H.Q. Ding, MPI and OpenMP Paradigms on Cluster of SMP Architectures: the Vacancy Tracking Algorithm for Multi-Dimensional Array Transpose, WOMPAT 2002: Workshop on OpenMP Applications and Tools, University of Alaska, August 2002,

2001

C. H.Q. Ding and Y. He, "A Ghost Cell Expansion Method for Reducing Communications in Solving PDE Problems", Proceedings of SuperComputing 2001 Conference, November 2001, LBNL 47929,

C. H.Q. Ding and Y. He, "MPH: a Library for Distributed Multi-Component Environment", May 2001, LBNL 47930,

Y. He and C. H.Q. Ding, "Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications", Journal of Supercomputing, vol.18, March 2001, 18:259-277,

2000

Y. He and C. H.Q. Ding, "Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications", Proceedings of the Ninth Workshop on the Use of High Performance Computing in Meteorology: Developments in Teracomputing, November 2000, 296-317,

Y. He, C. H.Q. Ding, Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications, the Ninth Workshop on the Use of High Performance Computing in Meteorology: Developments in Teracomputing, European Centre for Medium-Range Weather Forecasts, 2000,

Yun He, Ti-Petsc: Integrating Titanium with PETSc, Invited talk at A Workshop on the ACTS Toolkit: How can ACTS work for you? Lawrence Berkeley National Laboratory, September 2000,

Yun He, Computational Ocean Modeling, Invited talk, Computer Science Graduate Fellow (CSGF) Workshop, Lawrence Berkeley National Laboratory, July 2000,

Y. He, C. H.Q. Ding, Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications, International Conference on Supercomputing (ICS'00), May 2000,

X.-H. Yan, Y. He, R. D. Susanto, and W. T. Liu, "Multisensor Studies on El Nino-Southern Oscillations and Variabilities in Equatorial Pacific", J. of Adv. Marine Sciences and Tech. Society, 2000, 4(2):289-301,

1999

C. H.Q. Ding and Y. He, "Data Organization and I/O in a Parallel Ocean Circulation Model", Proceedings of Supercomputing 1999 Conference, November 1999, LBNL 43384,

Yun He, Computational Aspects of Modular Ocean Model Development, invited talk at Jet Propulsion Laboratory, April 1, 1999,

1998

Yun He, Correlation Analyses of Scatterometer Wind, Altimeter Sea Level and SST Data for the Tropical Pacific Ocean, American Geophysical Union, 1998 Spring Meeting, May 1998,

1997

Y. He, X.-H. Yan, and W. T. Liu, "Surface Heat Fluxes in the Western Equatorial Pacific Ocean Estimated by an Inverse Mixed Layer Model and by Bulk Parameterization", Journal of Physical Oceanography, Vol. 27, No. 11, November 1997, 2477-2487,

Yun He, El Nino 1997, 1997 Coast Day, College of Marine Studies, University of Delaware, October 1, 1997,

X.-H. Yan, Y. He, W. T. Liu, Q. Zheng, and C.-R. Ho, "Centroid Motion of the Western Pacific Warm Pool in the Recent Three El Nino Events", Journal of Physical Oceanography, Vol. 27, No. 5, May 1997, 837-845,

1996

Yun He, Estimation of Surface Net Heat Flux in the Western Tropical Pacific Using TOPEX/Poseidon Altimeter Data, American Geophysical Union, 1996 Spring Meeting, May 1, 1996,

Jason Hick

2015

J. Hick, Future Directions and How SPXXL Can Help, SPXXL Summer 2015, May 21, 2015,

Discussion of how NERSC may look in 2020, some challenges to getting there, and a proposal for how the SPXXL user group can help.

J. Hick, R. Lee, R. Cheema, K. Fagnan, GPFS for Life Sciences at NERSC, GPFS User Group Meeting, May 20, 2015,

A report showing both high- and low-level changes made to our life sciences workloads to support them on GPFS file systems.

2014

Richard A. Gerber et al., "High Performance Computing Operational Review: Enabling Data-Driven Scientific Discovery at DOE HPC Facilities", November 7, 2014,

J. Hick, Scalability Challenges in Large-Scale Tape Environments, IEEE Mass Storage Systems & Technologies 2014, June 4, 2014,

Provides an overview of NERSC storage systems and focuses on challenges we experience with HPSS at NERSC and with the tape industry.

S. Parete-Koon, B. Caldwell, S. Canon, E. Dart, J. Hick, J. Hill, C. Layton, D. Pelfrey, G. Shipman, D. Skinner, J. Wells, J. Zurawski, "HPC's Pivot to Data", Conference, May 5, 2014,

Computer centers such as NERSC and OLCF have traditionally focused on delivering computational capability that enables breakthrough innovation in a wide range of science domains. Accessing that computational power has required services and tools to move the data from input and output to computation and storage. A pivot to data is occurring in HPC. Data transfer tools and services that were previously peripheral are becoming integral to scientific workflows. Emerging requirements from high-bandwidth detectors, high-throughput screening techniques, highly concurrent simulations, increased focus on uncertainty quantification, and an emerging open-data policy posture toward published research are among the data drivers shaping the networks, file systems, databases, and overall HPC environment. In this paper we explain the pivot to data in HPC through user requirements and the changing resources provided by HPC, with particular focus on data movement. For WAN data transfers we present the results of a study of network performance between centers.

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, ( March 26, 2014)

Jason Hick, NERSC, Storage Systems: 2014 and beyond, February 6, 2014,

2013

J. Hick, A Storage Outlook for Energy Sciences: Data Intensive, Throughput and Exascale Computing, FujiFilm Executive IT Summit 2013, October 24, 2013,

Provides an overview of the computational and storage systems at NERSC.  Discusses the major types of computation scientists conduct at the facility, the challenges and opportunities the storage systems will face in the near future, and the role of tape technology at the Center.

J. Hick, Storage at a Distance, Open Fabrics Alliance User Day 2013, April 19, 2013,

Presentation to generate discussion on current state-of-the-practice for the topic of storage at a distance and synergy with Open Fabrics Alliance users.

J. Hick, GPFS at NERSC/LBNL, SPXXL Winter 2013, January 7, 2013,

A report to SPXXL conference participants on state of the NERSC Global File System architecture, achievements and directions.

2012

Z. Liu, M. Veeraraghavan, Z. Yan, C. Tracy, J. Tie, I. Foster, J. Dennis, J. Hick, Y. Li and W. Yang, "On using virtual circuits for GridFTP transfers", Conference, November 12, 2012,

The goal of this work is to characterize scientific data transfers and to determine the suitability of dynamic virtual circuit service for these transfers instead of the currently used IP-routed service. Specifically, logs collected by servers executing a commonly used scientific data transfer application, GridFTP, are obtained from three US supercomputing/scientific research centers, NERSC, SLAC, and NCAR, and analyzed. Dynamic virtual circuit (VC) service, a relatively new offering from providers such as ESnet and Internet2, allows for the selection of a path on which a rate-guaranteed connection is established prior to data transfer. Given VC setup overhead, the first analysis of the GridFTP transfer logs characterizes the duration of sessions, where a session consists of multiple back-to-back transfers executed in batch mode between the same two GridFTP servers. Of the NCAR-NICS sessions analyzed, 56% of all sessions (90% of all transfers) would have been long enough to be served with dynamic VC service. An analysis of transfer logs across four paths, NCAR-NICS, SLAC-BNL, NERSC-ORNL and NERSC-ANL, shows significant throughput variance, where NICS, BNL, ORNL, and ANL are other US national laboratories. For example, on the NERSC-ORNL path, the inter-quartile range was 695 Mbps, with a maximum value of 3.64 Gbps and a minimum value of 758 Mbps. An analysis of the impact of various factors that are potential causes of this variance is also presented.
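
As a concrete illustration of the kind of variance analysis quoted above, the snippet below computes per-transfer throughput and its inter-quartile range from toy transfer records. The log format here is invented; real GridFTP transfer logs carry many more fields.

    from statistics import quantiles

    # Toy records: (start_s, end_s, bytes).  Throughput variance
    # analysis needs only these three fields.
    transfers = [(0, 100, 40e9), (105, 240, 40e9), (300, 330, 8e9),
                 (400, 520, 45e9), (600, 660, 20e9)]

    mbps = [8 * nbytes / (t1 - t0) / 1e6 for t0, t1, nbytes in transfers]
    q1, _, q3 = quantiles(mbps, n=4)        # quartile cut points
    print(f"IQR = {q3 - q1:.0f} Mbps, min = {min(mbps):.0f} Mbps, "
          f"max = {max(mbps):.0f} Mbps")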

Damian Hazen, Jason Hick, "MIR Performance Analysis", June 12, 2012, LBNL LBNL-5896E,

We provide analysis of Oracle StorageTek T10000 Generation B (T10KB) Media Information Record (MIR) Performance Data gathered over the course of a year from our production High Performance Storage System (HPSS). The analysis shows information in the MIR may be used to improve tape subsystem operations. Most notably, we found the MIR information to be helpful in determining whether the drive or tape was most suspect given a read or write error, and for helping identify which tapes should not be reused given their history of read or write errors. We also explored using the MIR Assisted Search to order file retrieval requests. We found that MIR Assisted Search may be used to reduce the time needed to retrieve collections of files from a tape volume.

N. Balthaser, J. Hick, W. Hurlbert, StorageTek Tape Analytics: Pre-Release Evaluation at LBNL, LTUG 2012, April 25, 2012,

A report to the Large Tape Users Group (LTUG) annual conference on a pre-release evaluation of the new software product, StorageTek Tape Analytics (STA).  We provide a user's perspective on what we found useful, some suggestions for improvement, and some key new features that would enhance the product.

"NERSC Exceeds Reliability Standards With Tape-Based Active Archive", Active Archive Alliance Case Study, February 10, 2012,

J. Hick, NERSC Site Update (NGF), SPXXL Winter 2012, January 10, 2012,

Update to NERSC Global File (NGF) System, based on IBM's GPFS, to the SPXXL User Group community.  Includes an overview of NERSC, the file systems that comprise NGF, some of our experiences with GPFS, and recommendations for improving scalability.

2011

M. Cary, J. Hick, A. Powers, HPC Archive Solutions Made Simple, Half-day Tutorial at Super Computing (SC11), November 13, 2011,

Half-day tutorial at SC11 where attendees were provided detailed information about HPC archival storage systems for general education.  The tutorial was the first SC tutorial to cover the topic of archival storage and helped sites to understand the characteristics of these systems, the terminology for archives, and how to plan, size and manage these systems.

J. Hick, Digital Archiving and Preservation in Government Departments and Agencies, Oracle Open World 2011, October 6, 2011,

Attendees of this invited talk at Oracle Open World 2011 heard about the NERSC Storage Systems Group and the HPSS Archive and Backup systems we manage.  Includes information on why we use disk and tape to store data, and an introduction to the Large Tape Users Group (LTUG).

J. Hick, J. Hules, A. Uselton, "DOE HPC Best Practices Workshop: File Systems and Archives", Workshop, September 27, 2011,

The Department of Energy has identified the design, implementation, and usability of file systems and archives as key issues for current and future HPC systems. This workshop addresses current best practices for the procurement, operation, and usability of file systems and archives. Furthermore, the workshop addresses whether system challenges can be met by evolving current practices.

J. Hick, The NERSC Global Filesystem (NGF), Computing in Atmospheric Sciences 2011 (CAS2K11), September 13, 2011,

Provides the Computing in Atmospheric Sciences 2011 conference attendees an overview and configuration details of the NERSC Global Filesystem (NGF).  Includes a few lessons learned and future directions for NGF.

J. Hick, M. Andrews, Leveraging the Business Value of Tape, FujiFilm Executive IT Summit 2011, June 9, 2011,

Describes how tape is used in the HPSS Archive and HPSS Backup systems at NERSC.  Includes some examples of our organization's tape policies, our roadmap to Exascale and an example of tape in the Exascale Era, our observed tape reliability, and an overview of our locally developed Parallel Incremental Backup System (PIBS) which performs backups of our NGF file system.

J. Hick, Storage Supporting DOE Science, Preservation and Archiving Special Interest Group (PASIG) 2011, May 12, 2011,

Provided attendees of the Preservation and Archiving Special Interest Group conference with an overview of NERSC, the Storage Systems Group, and the HPSS Archives and NGF File Systems we support.  Includes some information on a large tape data migration and our observations on the reliability of tape at NERSC.

D. Hazen, J. Hick, W. Hurlbert, M. Welcome, Media Information Record (MIR) Analysis, LTUG 2011, April 19, 2011,

Presentation of Storage Systems Group findings from a year-long effort to collect and analyze Media Information Record (MIR) statistics from our in-production Oracle enterprise tape drives at NERSC.  We provide information on the data collected, and some highlights from our analysis. The presentation primarily makes the case that the information in the MIR is important for helping users and customers better operate and manage their tape environments.

J. Hick, I/O Requirements for Exascale, Open Fabrics Alliance 2011, April 4, 2011,

This talk provides an overview of the DOE Exascale effort, high level IO requirements, and an example of exascale era tape storage.

2010

Neal Master, Matthew Andrews, Jason Hick, Shane Canon, Nicholas J. Wright, "Performance Analysis of Commodity and Enterprise Class Flash Devices", Petascale Data Storage Workshop (PDSW), November 2010,

"Re-thinking data strategies is critical to keeping up", J. Hick, HPC Source Magazine, June 1, 2010,

D. Hazen, J. Hick, HPSS v8 Metadata Conversion, HPSS 8.1 Pre-Design Meeting, April 7, 2010,

Provided information about the HPSS metadata conversion software to other developers of HPSS.  Input was important to establishing a design for the version 8 HPSS metadata conversions.

Sim A., Gunter D., Natarajan V., Shoshani A., Williams D., Long J., Hick J., Lee J., Dart E., "Efficient Bulk Data Replication for the Earth System Grid", Data Driven E-science: Use Cases and Successful Applications of Distributed Computing Infrastructures (ISGC 2010), Springer-Verlag New York Inc, 2010, 435,

D. Cook, J. Hick, J. Minton, H. Newman, T. Preston, G. Rich, C. Scott, J. Shoopman, J. Noe, J. O'Connell, G. Shipman, D. Watson, V. White, "HPSS in the Extreme Scale Era: Report to DOE Office of Science on HPSS in 2018–2022", Lawrence Berkeley National Laboratory technical report LBNL-3877E, 2010, LBNL 3877E,

Raj Kettimuthu, Alex Sim, Dan Gunter, Bill Allcock, Peer T. Bremer, John Bresnahan, Andrew Cherry, Lisa Childers, Eli Dart, Ian Foster, Kevin Harms, Jason Hick, Jason Lee, Michael Link, Jeff Long, Keith Miller, Vijaya Natarajan, Valerio Pascucci, Ken Raffenetti, David Ressman, Dean Williams, Loren Wilson, Linda Winkler, "Lessons Learned from Moving Earth System Grid Data Sets over a 20 Gbps Wide-Area Network", Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing HPDC 10, New York NY USA, 2010, 316-319,

A. Sim, D. Gunter, V. Natarajan, A. Shoshani, D. Williams, J. Long, J. Hick, J. Lee, E. Dart, "Efficient Bulk Data Replication for the Earth System Grid", International Symposium on Grid Computing, 2010,

2009

A. Shoshani, D. Rotem, Scientific Data Management: Challenges, Technology, and Deployment, Book, (December 16, 2009)

This book provides a comprehensive understanding of the latest techniques for managing data during scientific exploration processes, from data generation to data analysis.

J. Hick, Sun StorageTek Tape Hardware Migration Experiences, LTUG 2009, April 24, 2009,

Talk addresses specific experiences and lessons learned in migrating our entire HPSS archive from StorageTek 9310 Powderhorns using 9840A, 9940B, and T10KA tape drives to StorageTek SL8500 Libraries using 9840D and T10KB tape drives.

W. Allcock, R. Carlson, S. Cotter, E. Dart, V. Dattoria, B. Draney, R. Gerber, M. Helm, J. Hick, S. Hicks, S. Klasky, M. Livny, B. Maccabe, C. Morgan, S. Morss, L. Nowell, D. Petravick, J. Rogers, Y. Sekine, A. Sim, B. Tierney, S. Turnbull, D. Williams, L. Winkler, F. Wuerthwein, "ASCR Science Network Requirements", Workshop, April 15, 2009,

ESnet publishes reports from Network and Science Requirement Workshops on a regular basis.  This report was the product of a two-day workshop in Washington DC that addresses science requirements impacting operations of networking for 2009.

2008

A. Mokhtarani, W. Kramer, J. Hick, "Reliability Results of NERSC Systems", Web site, August 28, 2008,

In order to address the needs of future scientific applications for storing and accessing large amounts of data in an efficient way, one needs to understand the limitations of current technologies and how they may cause system instability or unavailability. A number of factors can impact system availability, ranging from a facility-wide power outage to a single point of failure such as network switches or global file systems. In addition, individual component failure in a system can degrade the performance of that system. This paper focuses on analyzing both of these factors and their impacts on the computational and storage systems at NERSC. Component failure data presented in this report primarily focuses on disk drive failures in one of the computational systems and tape drive failures in HPSS. NERSC collected available component failure data and system-wide outages for its computational and storage systems over a six-year period and made them available to the HPC community through the Petascale Data Storage Institute.

Eric L. Hjort

2012

Eric Hjort, Larry Pezzaglia, Iwona Sakrejda, PDSF at NERSC: Site Report, A talk at the HEPiX Spring 2012 Workshop, Prague, Czech Republic, April 24, 2012,

PDSF is a commodity Linux cluster at NERSC which has been in continuous operation since 1996. This talk will provide a status update on the PDSF system and summarize recent changes at NERSC. Highlighted PDSF changes include the conversion to xCAT-managed netboot node images, the ongoing deployment of Scientific Linux 6, and the introduction of XRootD for STAR.

Wayne E. Hurlbert

2013

N. Balthaser, W. Hurlbert, T10KC Technology in Production, May 9, 2013,

Report to the 2012 Large Tape Users Group meeting regarding our production statistics and experiences using the Oracle T10000C tape drive.

2012

N. Balthaser, J. Hick, W. Hurlbert, StorageTek Tape Analytics: Pre-Release Evaluation at LBNL, LTUG 2012, April 25, 2012,

A report to the Large Tape Users Group (LTUG) annual conference on a pre-release evaluation of the new software product, StorageTek Tape Analytics (STA).  We provide a user's perspective on what we found useful, some suggestions for improvement, and some key new features that would enhance the product.

2011

J. Hick, J. Hules, A. Uselton, "DOE HPC Best Practices Workshop: File Systems and Archives", Workshop, September 27, 2011,

The Department of Energy has identified the design, implementation, and usability of file systems and archives as key issues for current and future HPC systems. This workshop addresses current best practices for the procurement, operation, and usability of file systems and archives. Furthermore, the workshop addresses whether system challenges can be met by evolving current practices.

D. Hazen, J. Hick, W. Hurlbert, M. Welcome, Media Information Record (MIR) Analysis, LTUG 2011, April 19, 2011,

Presentation of Storage Systems Group findings from a year-long effort to collect and analyze Media Information Record (MIR) statistics from our in-production Oracle enterprise tape drives at NERSC.  We provide information on the data collected, and some highlights from our analysis. The presentation primarily makes the case that the information in the MIR is important for helping users and customers better operate and manage their tape environments.

Doug Jacobsen

2016

Shane Canon, Doug Jacobsen, "Shifter: Containers for HPC", Cray User Group, London, England, May 13, 2016,

Container-based computing is rapidly changing the way software is developed, tested, and deployed. This paper builds on previously presented work on a prototype framework for running containers on HPC platforms. We will present a detailed overview of the design and implementation of Shifter, which in partnership with Cray has extended the early prototype concepts and is now in production at NERSC. Shifter enables end users to execute containers using images constructed from various methods including the popular Docker-based ecosystem. We will discuss some of the improvements over the initial prototype including an improved image manager, integration with SLURM, integration with the burst buffer, and user controllable volume mounts. In addition, we will discuss lessons learned, performance results, and real-world use cases of Shifter in action. We will also discuss the potential role of containers in scientific and technical computing including how they complement the scientific process. We will conclude with a discussion about the future directions of Shifter.
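
For readers unfamiliar with the workflow the abstract describes, a typical Shifter invocation has two steps: pull an image through the image gateway, then execute a command inside it on a compute node. The sketch below drives both steps from Python; the command forms follow NERSC's documented shifterimg/shifter usage, while the image name is a placeholder.

    import subprocess

    image = "docker:ubuntu:16.04"   # placeholder image reference

    # Convert and stage the Docker image via Shifter's image gateway.
    subprocess.run(["shifterimg", "pull", image], check=True)

    # Run a command inside the container environment.
    subprocess.run(["shifter", f"--image={image}", "cat", "/etc/os-release"],
                   check=True)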

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Douglas M. Jacobsen, James F. Botts, and Yun (Helen) He, "SLURM. Our Way.", Cray User Group Meeting 2016, London, England, May 2016,

Douglas M. Jacobsen, James F. Botts, and Yun (Helen) He, SLURM. Our Way., Cray User Group Meeting 2016. London, England., May 12, 2016,

2015

Doug Jacobsen, Shane Canon, Contain This, Unleashing Docker for HPC, NERSC Webcast, May 15, 2015,

Doug Jacobsen, Shane Canon, "Contain This, Unleashing Docker for HPC", Cray User Group 2015, April 23, 2015,

Doug Jacobsen, "procmon: Real-time process monitoring on the Cray XC-30", Cray User Group 2015, April 21, 2015,

2014

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, ( March 26, 2014)

Yushu Yao, NERSC; Douglas Jacobsen, NERSC, Connecting to NERSC, NUG 2014, February 3, 2014,

Kristy Kallback-Rose

2012

K Kallback-Rose, D Antolovic, R Ping, K Seiffert, C Stewart, T Miller, "Conducting K-12 Outreach to Evoke Early Interest in IT, Science, and Advanced Technology", ACM, July 16, 2012,

This is a preprint of a paper presented at XSEDE '12: The 1st Conference of the Extreme Science and Engineering Discovery Environment, Chicago, Illinois.

Jihan Kim

2012

Allison Dzubak, Li-Chiang Lin, Jihan Kim, Joseph Swisher, Roberta Poloni, Sergei Maximoff, Berend Smit, Laura Gagliardi, "Ab initio Carbon Capture in Open-Site Metal Organic Frameworks", Nature Chemistry, 2012,

Jihan Kim, Richard Martin, Oliver Ruebel, Maciej Haranczyk, Berend Smit, "High-throughput Characterization of Porous Materials Using Graphics Processing Units", Journal of Chemical Theory and Computation, 2012,

Jihan Kim, Li-Chiang Lin, Richard Martin, Joseph Swisher, Maciej Haranczyk, Berend Smit, "Large Scale Computational Screening of Zeolites for Ethene/Ethane Separation", Langmuir, 2012,

Richard Martin, Thomas Willems, Li-Chiang Lin, Jihan Kim, Joseph Swisher, Berend Smit, Maciej Harancyzk, "Similarity-driven Discovery of Porous Materials for Adsorption-based Separations", In Preparation, 2012,

Li-Chiang Lin, Adam Berger, Richard Martin, Jihan Kim (co-first author), Joseph Swisher, Kuldeep Jariwala, Chris Rycroft, Abhoyjit Brown, Michael Deem, Maciej Haranczyk, Berend Smit, "In Silico Screening of Carbon Capture Materials", Nature Materials, 2012,

Jihan Kim, Berend Smit, "Efficient Monte Carlo Simulations of Gas Molecules Inside Porous Materials", Journal of Chemical Theory and Computation, 2012,

J. Kim, A. Koniges, R.L. Martin, J. Swisher, M. Haranczyk, B. Smit, Computational Screening of Novel Carbon Capture Materials, 2012 GTC GPU Conference, 2012,

2011

Jihan Kim, Alice Koniges, Richard Martin, Maciej Harancyzk, Joseph Swisher, Berend Smit, "GPU Computational Screening of Carbon Capture Materials - Paper", Proceedings of the 2011 SciDAC Conference, 2011,

Jihan Kim, Jocelyn Rodgers, Manuel Athenes, Berend Smit, "Molecular Monte Carlo Simulations Using Graphics Processing Units: To Waste Recycle or Not?", Journal of Chemical Theory and Computation, 2011,

J. Kim, A. Koniges, R.L Martin, M. Haranczyk, J. Swisher, B. Smit, "GPU Computational Screening of Carbon Capture Materials", 2011 SciDAC Conference, Denver, CO, 2011,

2010

J. Kim, A. Koniges, B. Smit, M. Head-Gordon, "Calculation of RI-MP2 Gradient using Fermi GPUs", Molecular Quantum Mechanics 2010, May 2010,

Alice Koniges, Robert Preissl, Jihan Kim, D Eder, A Fisher, N Masters, V Mlaker, S Ethier, W Wang, M Head-Gordon, N Wichmann, "Application acceleration on current and future Cray platforms", Proc. Cray User Group Meeting, Edinburgh, Scotland, May 2010,

Quincey Koziol

2016

Jialin Liu, Evan Racah, Quincey Koziol, Richard Shane Canon,
Alex Gittens, Lisa Gerhardt, Suren Byna, Mike F. Ringenburg, Prabhat,
"H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems", Cray User Group, May 13, 2016,

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

2014

Quincey Koziol, Ruth Aydt, Russ Rew, Mark Howison, Mark Miller, Prabhat, "HDF5", Book Chapter in High Performance Parallel I/O, Prabhat, Quincey editors, CRC Press., ( October 23, 2014)

M. Scot Breitenfeld, Kalyana Chadalavada, Robert Sisneros, Suren Byna, Quincey Koziol, Neil Fortner, Prabhat, Venkat Vishwanath, "Tuning Performance of Large scale I/O with Parallel HDF5", SC’14 PDSW Workshop, October 15, 2014,

2013

Babak Behzad, Huong Luu, Joey Huchette, Suren Byna, Prabhat, Ruth Aydt, Quincey Koziol, Marc Snir, "Taming Parallel I/O Complexity with Auto-Tuning", SuperComputing 2013, October 9, 2013,

2012

Prabhat, Suren Byna, Kesheng Wu, Jerry Chou, Mark Howison, Joey Huchette, Wes Bethel, Quincey Koziol, Mohammad Chaarawi, Ruth Aydt, Babak Behzad, Huong Luu, Karen Schuchardt, Bruce Palmer, "Updates from the ExaHDF5 project: Trillion particle run, Auto-Tuning and the Virtual Object Layer", DOE Exascale Research Conference, 2012,

Harinarayan Krishnan

2013

David Camp, Hari Krishnan, David Pugmire, Christoph Garth, Ian Johnson, E. Wes Bethel, Kenneth I. Joy, Hank Childs, "GPU Acceleration of Particle Advection Workloads in a Parallel, Distributed Memory Setting", Proceedings of Eurographics Symposium on Parallel Graphics and Visualization (EGPGV), May 5, 2013,

Dean N. Williams, Timo Bremer, Charles Doutriaux, John Patchett, Galen Shipman, Blake Haugen, Ross Miller, Brian Smith, Chad Steed, E. Wes Bethel, Hank Childs, Harinarayan Krishnan, Prabhat, Michael Wehner, Claudio T. Silva, Emanuele Santos, David Koop, Tommy Ellqvist, Huy T. Vo, Jorge Poco, Berk Geveci, Aashish Chaudhary, Andrew Bauer, Alexander Pletzer, Dave Kindig, Gerald L. Potter, Thomas P. Maxwell, "The Ultra-scale Visualization Climate Data Analysis Tools: Data Analysis and Visualization for Geoscience Data", IEEE Special Issue: Cutting-Edge Research in Visualization, 2013,

2012

E. Wes Bethel, David Camp, Hank Childs, Mark Howison, Hari Krishnan, Burlen Loring, Jörg Meyer, Prabhat, Oliver Rübel, Daniela Ushizima, Gunther Weber, "Towards Exascale: High Performance Visualization and Analytics -- Project Status Report", 2012,

Thorsten Kurth

2016

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, OpenMP Parallelization and Optimization of Graph-based Machine Learning Algorithms, IWOMP 2016, October 6, 2016,

Thorsten Kurth, Balint Joo, Dhiraj Kalamkar, Karthikeyan Vaidyanathan, Aaron Walden, "Optimizing Wilson-Dirac operator and linear solvers for Intel KNL", Springer Lecture Notes in Computer Science, October 6, 2016,

Tareq Malas, Thorsten Kurth, Jack Deslippe, "Optimization of the sparse matrix-vector products of an IDR Krylov iterative solver in EMGeo for the Intel KNL manycore processor", Springer Lecture Notes in Computer Science, October 6, 2016,

Alice Koniges, Brandon Cook, Jack Deslippe, Thorsten Kurth, Hongzhang Shan, MPI usage at NERSC: Present and Future, EuroMPI 2016, September 26, 2016,

Alice Koniges, Brandon Cook, Jack Deslippe, Thorsten Kurth, Hongzhang Shan, "MPI usage at NERSC: Present and Future", EuroMPI 2016, Edinburgh, Scotland, UK, September 26, 2016,

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, "OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms", Lecture Notes in Computer Science, Springer, 2016, 9903:17-31, doi: 10.1007/978-3-319-45550-1_2

Amy Nicholson, Evan Berkowitz, Chia Cheng Chang, Kate Clark, Balint Joo, Thorsten Kurth, Enrico Rinaldi, Brian Tiburzi, Pavlos Vranas, Andre Walker-Loud, "Neutrinoless double beta decay from Lattice QCD", PoSLAT, August 16, 2016,

2015

Amy Nicholson, Evan Berkowitz, Enrico Rinaldi, Pavlos Vranas, Thorsten Kurth, Balint Joo, Mark Strother, Andre Walker-Loud, "Two-nucleon scattering in multiple partial waves", Conference, PoSLAT, November 6, 2015,

Thorsten Kurth, Evan Berkowitz, Enrico Rinaldi, Pavlos Vranas, Amy Nicholson, Mark Strother, Andre Walker-Loud, "Nuclear Parity Violation from Lattice QCD", Conference, PoSLAT, November 6, 2015,

Thorsten Kurth, Andrew Pochinsky, Abhinav Sarje, Sergey Syritsyn, Andre Walker-Loud, "High-Performance I/O: HDF5 for Lattice QCD", Conference, PoSLAT, January 18, 2015,

2014

Thorsten Kurth, Noriyoshi Ishii, Takumi Doi, Sinya Aoki, Tetsuo Hatsuda, "Phase shifts in I=2 ππ-scattering from two lattice approaches", Conference, November 2014,

2013

Stephan Dürr, Zoltán Fodor, Christian Hoelbling, Stefan Krieg, Thorsten Kurth, Laurent Lellouch, Thomas Lippert, Rehan Malak, Thibaut Métivet, Antonin Portelli, Alfonso Sastre, Kálmán Szabó, "Lattice QCD at the physical point meets SU(2) chiral perturbation theory", Journal, October 14, 2013,

Sz. Borsanyi, S. Dürr, Z. Fodor, J. Frison, C. Hoelbling, S.D. Katz, S. Krieg, Th. Kurth, L. Lellouch, Th. Lippert, A. Portelli, A. Ramos, A. Sastre, K. Szabo, "Isospin splittings in the light baryon octet from lattice QCD and QED", Journal, June 10, 2013,

T. Kurth, N. Ishii, T. Doi, S. Aoki, T. Hatsuda, "Phase shifts in I=2ππ-scattering from two lattice approaches", Journal, May 20, 2013,

2012

Szabolcs Borsanyi, Stephan Durr, Zoltan Fodor, Christian Hoelbling, Sandor D. Katz, Stefan Krieg, Thorsten Kurth, Laurent Lellouch, Thomas Lippert, Craig McNeile, Kalman K. Szabo, "High-precision scale setting in lattice QCD", Journal, March 1, 2012,

Antonin Portelli, Stephan Durr, Zoltan Fodor, Julien Frison, Christian Hoelbling, Sandor D. Katz, Stefan Krieg, Thorsten Kurth, Laurent Lellouch, Thomas Lippert, Alberto Ramos, Kalman K. Szabo, "Systematic errors in partially-quenched QCD plus QED lattice simulations", Conference, PoSLAT, January 2012,

2011

S. Durr, Z. Fodor, T. Hemmert, C. Hoelbling, J. Frison, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, A. Portelli, A. Ramos, A. Schafer, K.K. Szabo, "Sigma term and strangeness content of octet baryons", Journal, September 1, 2011,

S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, C. McNeile, A. Portelli, K.K. Szabo, "Kaon bag parameter B(K) at the physical mass point", Conference, PoSLAT, August 2011,

G. Bali, S. Collins, S. Durr, Z. Fodor, R. Horsley, C. Hoelbling, S.D. Katz, I. Kanamori, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, C. McNeile, Y. Nakamura, D. Pleiter, P. Perez-Rubio, P. Rakow, A. Schafer, G. Schierholz, K.K. Szabo, F. Winter, J. Zanotti, "Spectra of heavy-light and heavy-heavy mesons containing charm quarks, including higher spin states for Nf=2+1", Conference, PoSLAT, August 2011,

S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, C. McNeile, A. Portelli, K.K. Szabo, "Precision computation of the kaon bag parameter", Journal, June 1, 2011,

2010

S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, K.K. Szabo, G. Vulvert, "Lattice QCD at the physical point: Simulation and analysis details", Journal, November 2010,

S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, K.K. Szabo, G. Vulvert, "Lattice QCD at the physical point: light quark masses", Journal, November 2010,

A. Ramos, S. Durr, Z. Fodor, J. Frison, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, A. Portelli, K.K. Szabo, "Decay constants and sigma terms from the lattice", Conference, PoSICHEP, November 2010,

A. Portelli, S. Durr, Z. Fodor, J. Frison, C. Holbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, K.K. Szabo, A. Ramos, "Electromagnetic corrections to light hadron masses", Conference, November 2010,

J. Frison, S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, A. Portelli, A. Ramos, K.K. Szabo, "Rho decay width from the lattice", Conference, PoSLAT, November 2010,

J. Frison, S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, A. Portelli, A. Ramos, K.K. Szabo, "Scaling study for 2 HEX smeared fermions: hadron and quark masses", Conference, PoSLAT, November 2010,

S. Durr, Z. Fodor, J. Frison, T. Hemmert, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, A. Portelli, A. Ramos, A. Schafer, K.K. Szabo, "Sigma term and strangeness content of the nucleon", Conference, PoSLAT, September 2010,

S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, A. Ramos, K.K. Szabo, "The ratio FK/Fpi in QCD", Journal, January 2010,

2008

S. Durr, Z. Fodor, J. Frison, C. Hoelbling, R. Hoffmann, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, K.K. Szabo, G. Vulvert, "Ab-Initio Determination of Light Hadron Masses", Journal, November 2008,

S. Dürr, Z. Fodor, C. Hoelbling, R. Hoffmann, S.D. Katz, S. Krieg, T. Kurth, L. Lellouch, T. Lippert, K.K. Szabo, G. Vulvert, "Scaling study of dynamical smeared-link clover fermions", Journal, February 2008,

2007

Stephan Durr, Zoltan Fodor, Christian Holbling, Thorsten Kurth, "Precision study of the SU(3) topological susceptibility in the continuum", Conference, PoSLAT, November 2007,

S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, Th. Kurth, L. Lellouch, Th. Lippert, K.K. Szabo, G. Vulvert, "Mixed action simulations: Approaching physical quark masses", Conference, PoSLAT, October 2007,

S. Durr, Z. Fodor, C. Hoelbling, S.D. Katz, S. Krieg, Th. Kurth, L. Lellouch, Th. Lippert, K.K. Szabo, G. Vulvert, "Chiral behavior of pseudo-Goldstone boson masses and decay constants in 2+1 flavor QCD", Conference, PoSLAT, October 2007,

2006

Stephan Durr, Zoltan Fodor, Christian Hoelbling, Thorsten Kurth, "Precision study of the SU(3) topological susceptibility in the continuum", Journal, December 2006,

Rei Chi Lee

2015

J. Hick, R. Lee, R. Cheema, K. Fagnan, GPFS for Life Sciences at NERSC, GPFS User Group Meeting, May 20, 2015,

A report showing both high- and low-level changes made to our life sciences workloads to support them on GPFS file systems.

2012

Zhengji Zhao, Mike Davis, Katie Antypas, Yushu Yao, Rei Lee and Tina Butler, "Shared Library Performance on Hopper", A paper presented at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

Zhengji Zhao, Mike Davis, Katie Antypas, Yushu Yao, Rei Lee and Tina Butler, Shared Library Performance on Hopper, A talk at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

2011

Zhengji Zhao, Mike Davis, Katie Antypas, Rei Lee and Tina Butler, Shared Library Performance on Hopper, Cray Quarterly Meeting at St. Paul, MN, October 26, 2011,

Paul T. Lin

2015

Paul T. Lin, Michael A. Heroux, Richard F. Barrett, Alan B. Williams, "Assessing a mini-application as a performance proxy for a finite element method engineering application", Concurrency and Computation: Practice and Experience, July 30, 2015, 27:5374–5389, doi: 10.1002/cpe.3587

Richard F. Barrett, Paul S. Crozier, Douglas W. Doerfler, Michael A. Heroux, Paul T. Lin, Heidi K. Thornquist, Timothy G. Trucano, Courtenay T. Vaughan, "Assessing the role of mini-applications in predicting key performance characteristics of scientific and engineering applications", Journal of Parallel and Distributed Computing, January 1, 2015, 75:107-122, doi: 10.1016/j.jpdc.2014.09.006

2014

S.S. Dosanjh, R.F. Barrett, D.W. Doerfler, S.D. Hammond, K.S. Hemmert, M.A. Heroux, P.T. Lin, K.T. Pedretti, A.F. Rodrigues, T.G. Trucano, J.P. Luitjens, "Exascale Design Space Exploration and Co-Design", Future Generation Computer Systems, Volume 30, Pages 46-58, January 2014,

Paul Lin, Matthew Bettencourt, Stefan Domino, Travis Fisher, Mark Hoemmen, Jonathan Hu, Eric Phipps, Andrey Prokopenko, Siva Rajamanickam, Christopher Siefert, Stephen Kennon, "Towards extreme-scale simulations for low Mach fluids with second-generation Trilinos", Parallel Processing Letters, January 1, 2014, 24:1-20, doi: 10.1142/S0129626414420055

2013

Michael A. Heroux, Richard Frederick Barrett, James Michael Willenbring, Daniel W Barnette, David Beckingsale, James F Belak, Mike Boulton, Paul Crozier, Douglas W. Doerfler, Harold C. Edwards, Wayne Gaudin, Timothy C Germann, Simon David Hammond, Andy Herdman, Stephen Jarvis, Paul Lin, Justin Luitjens, Andrew Mallinson, Simon McIntosh-Smith, Susan M Mniszewski, Jamaludin Mohd-Yusof, David F Richards, Christopher Sewell, Sriram Swaminarayan, Heidi K. Thornquist, Christian Robert Trott, Courtenay T. Vaughan, Alan B. Williams, R&D 100 Award, Mantevo Suite 1.0, R&D Magazine, August 2013,

2012

Mahesh Rajan, Douglas W. Doerfler, Paul T. Lin, Simon D. Hammond, Richard F. Barrett, Courtenay T. Vaughan, "Unprecedented Scalability and Performance of the New NNSA Tri-Lab Linux Capacity Cluster 2", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12), November 2012,

Richard Barrett, Paul Crozier, Doug Doerfler, Simon Hammond, Mike Heroux, Paul Lin, Tim Trucano, Courtenay Vaughan, Alan Williams, "Assessing the Predictive Capabilities of Mini-applications", The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), November 2012,

Paul T. Lin, "Improving multigrid performance for unstructured mesh drift-diffusion simulations on 147,000 cores", International Journal for Numerical Methods in Engineering, May 30, 2012, 91:971-989, doi: 10.1002/nme.4315

2010

Paul T. Lin, John N. Shadid, "Towards large-scale multi-socket, multicore parallel simulations: Performance of an MPI-only semiconductor device simulator", Journal of Computational Physics, September 20, 2010, 229:6804-6818, doi: 10.1016/j.jcp.2010.05.023

Paul T. Lin, John N. Shadid, Raymond S. Tuminaro, Marzio Sala, Gary L. Hennigan, Roger P. Pawlowski, "A parallel fully coupled algebraic multilevel preconditioner applied to multiphysics PDE applications: drift-diffusion, flow/transport/reaction, resistive MHD", International Journal for Numerical Methods in Fluids, September 3, 2010, 64:1148-1179, doi: 10.1002/fld.2402

J. Tomkins, R. Brightwell, W. Camp, S. Dosanjh, S. Kelly, P. Lin, C. Vaughan, J. Levesque, V. Tipparaju, "The Red Storm Architecture and Early Experiences with Multi-Core Processors", International Journal of Distributed Systems and Technologies, Vol. 1, Issue 2, pp. 74-93, April 19, 2010, doi: 10.4018/jdst.2010040105

2009

Paul T. Lin, John N. Shadid, Marzio Sala, Raymond S. Tuminaro, Gary L. Hennigan, Robert J. Hoekstra, "Performance of a parallel algebraic multilevel preconditioner for stabilized finite element semiconductor device modeling", Journal of Computational Physics, September 20, 2009, 228:6250-6267, doi: 10.1016/j.jcp.2009.05.024

2006

Paul T. Lin, Marzio Sala, John N. Shadid, Raymond S. Tuminaro, "Performance of fully coupled algebraic multilevel domain decomposition preconditioners for incompressible flow and transport", International Journal for Numerical Methods in Engineering, July 9, 2006, 67:208-225, doi: 10.1002/nme.1624

Paul T. Lin, Timothy J. Baker, Luigi Martinelli, Antony Jameson, "Two-dimensional implicit time-dependent calculations on adaptive unstructured meshes with time evolving boundaries", International Journal for Numerical Methods in Fluids, January 20, 2006, 50:199-218, doi: 10.1002/fld.1050

Jialin Liu

2016

Jialin Liu, Evan Racah, Quincey Koziol, Richard Shane Canon,
Alex Gittens, Lisa Gerhardt, Suren Byna, Mike F. Ringenburg, Prabhat,
"H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems", Cray User Group, May 13, 2016,

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

Mostofa Patwary, Nadathur Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey, "PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures", IPDPS 2016, April 5, 2016,

2015

Jialin Liu, Yu Zhuang, Yong Chen, "Hierarchical Collective I/O Scheduling for High-Performance Computing", Big Data Research, September 1, 2015,

Jialin Liu, Yong Chen, Surendra Byna, "Collective Computing for Scientific Big Data Analysis", 44th International Conference on Parallel Processing Workshops (ICPPW), September 1, 2015,

Jialin Liu, Yong Chen, "Segmented In-Advance Computing for Fast Scientific Discovery", Transactions on Cloud Computing, 2015,

WangYi Liu

2013

Alice Koniges, Wangyi Liu, John Barnard, Alex Friedman, Grant Logan, David Eder, Aaron Fisher, Nathan Masters, and Andrea Bertozzi, "Modeling warm dense matter experiments using the 3D ALE-AMR code and the move toward exascale computing", EPJ Web of Conferences 59, 09006, 2013,

D. Eder, D. Bailey, F. Chambers, I. Darnell, P. Di Nicola, S. Dixit, A. Fisher, G. Gururangan, D. Kalantar, A. Koniges, W. Liu, M. Marinak, N. Masters, V. Mlaker, R. Prasad, S. Sepke, P. Whitman, "Observations and modeling of debris and shrapnel impacts on optics and diagnostics at the National Ignition Facility", EPJ Web of Conferences 59, 08010, 2013,

A. Friedman, R. H. Cohen, D. P. Grote, W. M. Sharp, I. D. Kaganovich, A. E. Koniges, W. Liu, "Heavy Ion Beams and Interactions with Plasmas and Targets (HEDLP and IFE)", NERSC FES Requirements Workshop: Heavy Ion Fusion and Non-Neutral Plasmas, March 2013,

J. Barnard, R. M. More, P. A. Ni, A. Friedman, E. Henestroza, I. Kaganovich, A. Koniges, J. W. Kwan, W. Liu, A. Ng,
B.G. Logan, E. Startsev, M. Terry, A. Yuen,
NDCX-II Experimental Plans and Target Simulations, West Coast High Energy Density Science Cooperative Meeting Berkeley and Palo Alto, California, January 2013,

2012

Wangyi Liu, John Barnard, Alice Koniges, David Eder, Nathan Masters, Aaron Fisher, Alex Friedman, "A numerical scheme for including surface tension effects in hydrodynamic simulation: a full Korteweg type model without parasitic flows", APS DPP 2012, 2012,

Wangyi Liu, Andrea Bertozzi, and Theodore Kolokolnikov, "Diffuse interface surface tension models in an expanding flow", Comm. Math. Sci., 2012, 10(1):387-418,

2011

Wangyi Liu, John Barnard, Alex Friedman, Nathan Masters, Aaron Fisher, Alice Koniges, David Eder, "Modeling droplet breakup effects with applications in the warm dense matter NDCX experiment", APS DPP, 2011,

David Eder, David Bailey, Andrea Bertozzi, Aaron Fisher, Alice Koniges, Wangyi Liu, Nathan Masters, Marty Marinak, Late-Time Numerical Simulations of High-Energy-Density (HED) Targets, Twenty Second International Conference on Numerical Simulations of Plasmas, September 7, 2011,

Wangyi Liu, John Barnard, Alex Friedman, Nathan Masters, Aaron Fisher, Velemir Mlaker,
Alice Koniges, David Eder,
"Modeling droplet breakup effects in warm dense matter experiments with diffuse interface methods in the ALE-AMR code", Proceedings of the 2011 SciDAC Conference, August 4, 2011,

Wangyi Liu, John Barnard, Alex Friedman, Nathan Masters, Aaron Fisher, Velemir Mlaker, Alice Koniges, David Eder, "Modeling droplet breakup effects in warm dense matter experiments with diffuse interface methods in ALE-AMR code", SciDAC conference, 2011,

2010

Wangyi Liu, Martin B. Short, Yasser E. Taima, and Andrea L. Bertozzi, "Multiscale Collaborative Searching Through Swarming", Proceedings of the 7th International Conference on Informatics in Control, Automation, and Robotics (ICINCO), June 2010,

Glenn K. Lockwood

2016

Debbie Bard, Wahid Bhimji, David Paul, Glenn K Lockwood, Nicholas J Wright, Katie Antypas, Prabhat Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, others, "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016, LBNL LBNL-1007120,

C.S. Daley, D. Ghoshal, G.K. Lockwood, S. Dosanjh, L. Ramakrishnan, N.J. Wright, "Performance Characterization of Scientific Workflows for the Optimal Use of Burst Buffers", Workflows in Support of Large-Scale Science (WORKS-2016), CEUR-WS.org, 2016, 1800:69-73,

Shane Snyder, Philip Carns, Kevin Harms, Robert Ross, Glenn K. Lockwood, Nicholas J. Wright, "Modular HPC I/O characterization with Darshan", Proceedings of the 5th Workshop on Extreme-Scale Programming Tools (ESPT'16), Salt Lake City, UT, November 13, 2016, 9-17, doi: 10.1109/ESPT.2016.9

Contemporary high-performance computing (HPC) applications encompass a broad range of distinct I/O strategies and are often executed on a number of different compute platforms in their lifetime. These large-scale HPC platforms employ increasingly complex I/O subsystems to provide a suitable level of I/O performance to applications. Tuning I/O workloads for such a system is nontrivial, and the results generally are not portable to other HPC systems. I/O profiling tools can help to address this challenge, but most existing tools only instrument specific components within the I/O subsystem, providing a limited perspective on I/O performance. The increasing diversity of scientific applications and computing platforms calls for greater flexibility and scope in I/O characterization.

In this work, we consider how the I/O profiling tool Darshan can be improved to allow for more flexible, comprehensive instrumentation of current and future HPC I/O workloads. We evaluate the performance and scalability of our design to ensure that it is lightweight enough for full-time deployment on production HPC systems. We also present two case studies illustrating how a more comprehensive instrumentation of application I/O workloads can enable insights into I/O behavior that were not previously possible. Our results indicate that Darshan's modular instrumentation methods can provide valuable feedback to both users and system administrators, while imposing negligible overheads on user applications.
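
To make the notion of "instrumenting" I/O concrete, the toy below wraps Python's built-in open() to count operations and bytes per file. This is emphatically not Darshan's API or mechanism (Darshan interposes on the I/O calls of compiled applications); it only illustrates the kind of per-file record such a tool accumulates.

    import builtins
    from collections import defaultdict

    stats = defaultdict(lambda: {"reads": 0, "writes": 0, "bytes": 0})
    _open = builtins.open

    class CountingFile:
        def __init__(self, f, name):
            self.f, self.name = f, name
        def read(self, *args):
            data = self.f.read(*args)
            stats[self.name]["reads"] += 1
            stats[self.name]["bytes"] += len(data)
            return data
        def write(self, data):
            stats[self.name]["writes"] += 1
            stats[self.name]["bytes"] += len(data)
            return self.f.write(data)
        def __enter__(self):
            return self
        def __exit__(self, *exc):
            self.f.close()

    def counting_open(name, mode="r", **kwargs):
        return CountingFile(_open(name, mode, **kwargs), name)

    builtins.open = counting_open       # interpose on open()
    with open("/tmp/demo.txt", "w") as f:
        f.write("hello")
    builtins.open = _open               # restore the real open()
    print(dict(stats))                  # per-file operation counts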

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

Grace X. Y. Zheng, Billy T. Lau, Michael Schnall-Levin, Mirna Jarosz, John M. Bell, Christopher M. Hindson, Sofia Kyriazopoulou-Panagiotopoulou, Donald A. Masquelier, Landon Merrill, Jessica M. Terry, Patrice A. Mudivarti, Paul W. Wyatt, Rajiv Bharadwaj, Anthony J. Makarewicz, Yuan Li, Phillip Belgrader, Andrew D. Price, Adam J. Lowe, Patrick Marks, Gerard M. Vurens, Paul Hardenbol, Luz Montesclaros, Melissa Luo, Lawrence Greenfield, Alexander Wong, David E. Birch, Steven W. Short, Keith P. Bjornson, Pranav Patel, Erik S. Hopmans, Christina Wood, Sukhvinder Kaur, Glenn K. Lockwood, David Stafford, Joshua P. Delaney, Indira Wu, Heather S. Ordonez, Susan M. Grimes, Stephanie Greer, Josephine Y. Lee, Kamila Belhocine, Kristina M. Giorda, William H. Heaton, Geoffrey P. McDermott, Zachary W. Bent, Francesca Meschi, Nikola O. Kondov, Ryan Wilson, Jorge A. Bernate, Shawn Gauby, Alex Kindwall, Clara Bermejo, Adrian N. Fehr, Adrian Chan, Serge Saxonov, Kevin D. Ness, Benjamin J. Hindson, Hanlee P. Ji, "Haplotyping germline and cancer genomes with high-throughput linked-read sequencing", Nature Biotechnology, February 1, 2016, 31:303-311, doi: 10.1038/nbt.3432

Haplotyping of human chromosomes is a prerequisite for cataloguing the full repertoire of genetic variation. We present a microfluidics-based, linked-read sequencing technology that can phase and haplotype germline and cancer genomes using nanograms of input DNA. This high-throughput platform prepares barcoded libraries for short-read sequencing and computationally reconstructs long-range haplotype and structural variant information. We generate haplotype blocks in a nuclear trio that are concordant with expected inheritance patterns and phase a set of structural variants. We also resolve the structure of the EML4-ALK gene fusion in the NCI-H2228 cancer cell line using phased exome sequencing. Finally, we assign genetic aberrations to specific megabase-scale haplotypes generated from whole-genome sequencing of a primary colorectal adenocarcinoma. This approach resolves haplotype information using up to 100 times less genomic DNA than some methods and enables the accurate detection of structural variants.

2015

Kristopher A. Standish, Tristan M. Carland, Glenn K. Lockwood, Wayne Pfeiffer, Mahidhar Tatineni, C Chris Huang, Sarah Lamberth, Yauheniya Cherkas, Carrie Brodmerkel, Ed Jaeger, Lance Smith, Gunaretnam Rajagopal, Mark E. Curran, Nicholas J. Schork, "Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies", BMC Bioinformatics, September 2015, 16, doi: 10.1186/s12859-015-0736-4

Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and cheaper than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost.

We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of a large association study.

We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging ‘big data’ problems in biomedical research brought on by the expansion of NGS technologies.

Glenn K. Lockwood, Rick Wagner, Mahidhar Tatineni, "Storage utilization in the long tail of science", Proceedings of the 2015 XSEDE Conference, July 26, 2015, doi: 10.1145/2792745.2792777

The increasing expansion of computation into non-traditional domain sciences has resulted in growing demand for research cyberinfrastructure suited to small- and mid-scale job sizes. The computational aspects of these emerging communities are coming into focus and being addressed through the deployment of several new XSEDE resources that feature easy on-ramps, customizable software environments through virtualization, and interconnects optimized for jobs that use only hundreds or thousands of cores; however, the data storage requirements of these emerging communities remain much less well characterized.

To this end, we examined the distribution of file sizes on two of the Lustre file systems within the Data Oasis storage system at the San Diego Supercomputer Center (SDSC). We found that there is a very strong preference for small files among SDSC's users, with 90% of all files being less than 2 MB in size. Furthermore, 50% of all file system capacity is consumed by files under 2 GB in size, and these distributions are consistent on both scratch and projects storage file systems. Because parallel file systems like Lustre and GPFS are optimized for parallel I/O to large, wide-stripe files, these findings suggest that parallel file systems may not be the most suitable storage solutions when designing cyberinfrastructure to meet the needs of emerging communities.

2014

Glenn K. Lockwood, Stephen H. Garofalini, "Proton dynamics at the water-silica interface via dissociative molecular dynamics", Journal of Physical Chemistry C, December 26, 2014, 118:29750-2975, doi: 10.1021/jp507640y

A robust and accurate dissociative potential that reproduces the structural and dynamic properties of bulk and nanoconfined water, and proton transport similar to ab initio calculations in bulk water, is used for reactive molecular dynamics simulations of the proton dynamics at the silica/water interface. The simulations are used to evaluate the lifetimes of protonated sites at the interfaces of water with planar amorphous silica surfaces and cylindrical pores in amorphous silica with different densities of water confined in the pores. In addition to lifetimes, the donor/acceptor sites are evaluated and discussed in terms of local atomistic structure. The lifetimes of the protonated sites, including H3O+, SiOH, SiOH2+, and Si–(OH+)–Si sites, are considered. The lifetime of the hydronium ion, H3O+, is considerably shorter near the interface than in bulk water, as are the lifetimes of the other protonated sites. The results indicate the beneficial effect of the amorphous silica surface in enhancing proton transport in wet silica, as seen in electrochemical studies, and provide the specific molecular mechanisms.

Dong Ju Choi, Glenn K. Lockwood, Robert S. Sinkovits, Mahidhar Tatineni, "Performance of applications using dual-rail InfiniBand 3D torus network on the Gordon supercomputer", Proceedings of the 2014 XSEDE Conference, July 13, 2014, doi: 10.1145/2616498.2616541

Multi-rail InfiniBand networks provide options to improve bandwidth, increase reliability, and lower latency for multi-core nodes. The Gordon supercomputer at SDSC, with its dual-rail InfiniBand 3-D torus network, is used to evaluate the performance impact of using multiple rails. The study was performed using the OSU micro-benchmarks, the P3FFT application kernel, and scientific applications LAMMPS and AMBER. The micro-benchmarks confirmed the bandwidth and latency performance benefits. At the application level, performance improvements depended on the communication level and profile.

Glenn K. Lockwood, Mahidhar Tatineni, Rick Wagner, "SR-IOV: Performance benefits for virtualized interconnects", Proceedings of the 2014 XSEDE Conference, July 13, 2014, doi: 10.1145/2616498.2616537

The demand for virtualization within high-performance computing is rapidly growing as new communities, driven by both new application stacks and new computing modalities, continue to grow and expand. While virtualization has traditionally come with significant penalties in I/O performance that have precluded its use in mainstream large-scale computing environments, new standards such as Single Root I/O Virtualization (SR-IOV) are emerging that promise to diminish the performance gap and make high-performance virtualization possible. To this end, we have evaluated SR-IOV in the context of both virtualized InfiniBand and virtualized 10 gigabit Ethernet (GbE) using micro-benchmarks and real-world applications. We compare the performance of these interconnects on non-virtualized environments, Amazon's SR-IOV-enabled C3 instances, and our own SR-IOV-enabled InfiniBand cluster and show that SR-IOV significantly reduces the performance losses caused by virtualization. InfiniBand demonstrates less than 2% loss of bandwidth and less than 10% increase in latency when virtualized with SR-IOV. Ethernet also benefits, although less dramatically, when SR-IOV is enabled on Amazon's cloud.

Jeff A. Tracey, James K. Sheppard, Glenn K. Lockwood, Amit Chourasia, Mahidhar Tatineni, Robert N. Fisher, Robert S. Sinkovits, "Efficient 3D movement-based kernel density estimator and application to wildlife ecology", Proceedings of the 2014 XSEDE Conference, San Diego, CA, July 13, 2014, doi: 10.1145/2616498.2616541

We describe an efficient implementation of a 3D movement-based kernel density estimator for determining animal space use from discrete GPS measurements. This new method provides more accurate results, particularly for species that make large excursions in the vertical dimension. The downside of this approach is that it is much more computationally expensive than simpler, lower-dimensional models. Through a combination of code restructuring, parallelization, and performance optimization, we were able to reduce the time to solution by up to a factor of 1000, thereby greatly improving the applicability of the method.

Michael Kagan, Glenn K. Lockwood, Stephen H. Garofalini, "Reactive simulations of the activation barrier to dissolution of amorphous silica in water", Physical Chemistry Chemical Physics, May 28, 2014, 16:9294-9301, doi: 10.1039/c4cp00030g

Molecular dynamics simulations employing reactive potentials were used to determine the activation barriers to the dissolution of the amorphous SiO2 surface in the presence of a 2 nm overlayer of water. The potential of mean force calculations of the reactions of water molecules with 15 different starting Q4 sites (Qi is the Si site with i bridging oxygen neighbors) to eventually form the dissolved Q0 site were used to obtain the barriers. Activation barriers for each step in the dissolution process, from the Q4 to Q3 to Q2 to Q1 to Q0, were obtained. Relaxation runs between each reaction step enabled redistribution of the water above the surface in response to the new Qi site configuration. The rate-limiting step observed in the simulations was in both the Q32 reaction (a Q3 site changing to a Q2 site) and the Q21 reaction, each with an average barrier of ∼14.1 kcal mol⁻¹. However, the barrier for the overall reaction from the Q4 site to a Q0 site, averaged over the maximum barrier for each of the 15 samples, was 15.1 kcal mol⁻¹. This result is within the lower end of the experimental data, which varies from 14-24 kcal mol⁻¹, while ab initio calculations using small cluster models obtain values that vary from 18-39 kcal mol⁻¹. Constraints between the oxygen bridges from the Si site and the connecting silica structure, the presence of pre-reaction strained siloxane bonds, and the location of the reacting Si site within slight concave surface contours all affected the overall activation barriers.

2013

Glenn K. Lockwood, Stephen H. Garofalini, "Lifetimes of excess protons in water using a dissociative water potential", Journal of Physical Chemistry B, April 8, 2013, 117:4089-4097, doi: 10.1021/jp310300x

Molecular dynamics simulations using a dissociative water potential were applied to study transport of excess protons in water and determine the applicability of this potential to describe such behavior. While originally developed for gas-phase molecules and bulk liquid water, the potential is transferable to nanoconfinement and interface scenarios. Applied here, it shows proton behavior consistent with ab initio calculations and empirical models specifically designed to describe proton transport. Both Eigen and Zundel complexes are observed in the simulations, showing the Eigen–Zundel–Eigen-type mechanism. In addition to reproducing the short-time rattling of the excess proton between the two oxygens of Zundel complexes, a picosecond-scale lifetime was also found. These longer-lived H3O+ ions are caused by the rapid conversion of the local solvation structure around the transferring proton from a Zundel-like form to an Eigen-like form following the transfer, effectively severing the path along which the proton can rattle. The migration of H+ over long times (>100 ps) deviates from the conventional short-time multiexponentially decaying lifetime autocorrelation model and follows t^(-3/2) power-law behavior. The potential function employed here matches many of the features of proton transport observed in ab initio molecular dynamics simulations as well as the highly developed empirical valence bond models, yet is computationally very efficient, enabling longer times and larger systems to be studied.

2012

Glenn K. Lockwood, Stephen H. Garofalini, "Reactions between water and vitreous silica during irradiation", Journal of Nuclear Materials, November 30, 2012, 430:239-245, doi: 10.1016/j.jnucmat.2012.07.004

Molecular dynamics simulations were conducted to determine the response of a vitreous silica surface in contact with water to radiation damage. The defects caused by radiation damage create channels that promote high H+ mobility and result in significantly higher concentration and deeper penetration of H+ in the silica subsurface. These subsurface H+ hop between acidic sites such as SiOH2+ and Si–(OH)–Si until subsequent radiation ruptures siloxane bridges and forms subsurface non-bridging oxygens (NBOs); existing excess H+ readily bonds to these NBO sites to form SiOH. The high temperature caused by irradiation also promotes the diffusion of molecular H2O into the subsurface, and although H2O does not penetrate as far as H+, it readily reacts with ruptured bridges to form 2SiOH. These SiOH sites are thermally stable and inhibit the reformation of bridges that would otherwise occur in the absence of water. In addition to this reduction of self-healing, the presence of water during the self-irradiation of silica may cause an increase in the glass’s proton conductivity.

2011

Ying Ma, Glenn K. Lockwood, Stephen H. Garofalini, "Development of a transferable variable charge potential for the study of energy conversion materials FeF2 and FeF3", Journal of Physical Chemistry C, November 18, 2011, 115:24198-2420, doi: 10.1021/jp207181s

A variable charge potential is developed that is suitable for simulations of the energy conversion materials FeF2 and FeF3. Molecular dynamics simulations using this potential show that the calculated structural and elastic properties of both FeF2 and FeF3 are in good agreement with experimental data. The transferability of this potential rests on the fact that the difference in bond character between FeF2 and FeF3 is properly accounted for by the variable charge approach. The calculated equilibrium charges are also in excellent agreement with first-principles Bader charges. Surface energies obtained by the variable charge method are closer to the first-principles data than are those from fixed charge models, indicating the importance of the variable charge method for simulations of the surface. A significant decrease in atomic charges is observed only for the outermost one or two layers, which is also seen in the first-principles calculations.

2010

Glenn K. Lockwood, Stephen H. Garofalini, "Effect of moisture on the self-healing of vitreous silica under irradiation", Journal of Nuclear Materials, February 16, 2010, 400:73-78, doi: 10.1016/j.jnucmat.2010.02.012

Although it is widely understood that water interacts extensively with vitreous silicates, atomistic simulations of the response of these materials to ballistic radiation, such as neutron or ion radiation, have excluded moisture. In this study, molecular dynamics simulations were used to simulate the collision cascades and defect formation that would result from such irradiation of silica in the presence of moisture. Using an interatomic potential that allows for the dissociation of water, it was found that the reaction between molecular water or pre-dissociated water (as OH− and H+) and the ruptured Si–O–Si bonds that result from the collision cascade inhibits a significant amount of the structural recovery that was previously observed in atomistic simulations of irradiation in perfectly dry silica. The presence of moisture not only resulted in a greater accumulation of non-bridging oxygen defects, but reduced the local density of the silica and altered the distribution of ring sizes. The results imply that an initial presence of moisture in the silica during irradiation could increase the propensity for further ingress of moisture via the low density pathways and increased defect concentration.

2009

Glenn K. Lockwood, Stephen H. Garofalini, "Bridging oxygen as a site for proton adsorption on the vitreous silica surface", Journal of Chemical Physics, August 21, 2009, 131:074703, doi: 10.1063/1.3205946

Molecular dynamics computer simulations were used to study the protonation of bridging oxygen (Si-O-Si) sites present on the vitreous silica surface in contact with water using a dissociative water potential. In contrast to first-principles calculations based on unconstrained molecular analogs, such as H7Si2O7+ molecules, the very limited flexibility of neighboring SiO4 tetrahedra when embedded in a solid surface means that there is a relatively minor geometric response to proton adsorption, requiring sites predisposed to adsorption. Simulation results indicate that protonation of bridging oxygen occurs at predisposed sites with bridging angles in the 125°-135° range, well below the bulk silica mean of approximately 150°, consistent with various ab initio calculations, and that a small fraction of such sites are present in all ring sizes. The energy differences between dry and protonated bridges at various angles observed in the simulations coincide completely with quantum calculations over the entire range of bridging angles encountered in the vitreous silica surface. Those sites with bridging angles near 130° support adsorbed protons more stably, resulting in the proton remaining adsorbed for longer periods of time. Vitreous silica has the necessary distribution of angular strain over all ring sizes to allow protons to adsorb onto bridging oxygen at the surface, forming acidic surface groups that serve as ideal intermediate steps in proton transfer near the surface. In addition to hydronium formation and water-assisted proton transfer in the liquid, protons can rapidly move across the water-silica interface via strained bridges that are predisposed to transient proton adsorption. Thus, an excess proton at any given location on a silica surface can move by either water-assisted or strained-bridge-assisted diffusion depending on the local environment. The result would be net migration that is faster than if only one mechanism were possible. These simulation results indicate the importance of performing large size- and time-scale simulations of the structurally heterogeneous vitreous silica exposed to water to describe proton transport at the interface between water and the silica surface.

2008

Glenn K. Lockwood, Shenghong Zhang, Stephen H. Garofalini, "Anisotropic dissolution of α-alumina (0001) and (112̄0) surfaces into adjoining silicates", Journal of the American Ceramic Society, October 24, 2008, 91:3536-3541, doi: 10.1111/j.1551-2916.2008.02715.x

The dissolution of the (0001) and (112̄0) orientations of α-Al2O3 into calcium silicate, aluminosilicate, and calcium aluminosilicate melts was modeled using molecular dynamics simulations. In all cases, it was found that the (112̄0) surface of the crystal destabilizes and melts at a lower temperature than does the (0001) surface. This anisotropy in dissolution counters the anisotropy in grain growth, in which the outward growth of the (112̄0) surface occurs more rapidly than that on the (0001) surface, causing platelets. However, anisotropic dissolution occurred only within a certain temperature range, above which dissolution behavior was isotropic. The presence of calcium in the contacting silicate melt plays an important role in this anisotropic dissolution, similar to its role in the anisotropic grain growth observed previously. However, anisotropic dissolution also occurs in the silicate melts not containing calcium, indicating the importance of the different surface energies. In combination with previous simulations of anisotropic grain growth in alumina, these simulations reveal a complex kinetic competition between preferential adsorption and growth versus preferential dissolution of the (112̄0) orientation in comparison with the (0001) orientation as a function of temperature and local composition. This, in turn, indicates potential processing variations with which to design morphology in alumina.

Colin MacLean

2016

Colin A. MacLean, "Maintaining Large Software Stacks in a Cray Ecosystem with Gentoo Portage", Cray User Group, London, England, 2016,

2015

Colin A. MacLean, Neil C. Hong, James Prendergast, "hapbin: An Efficient Program for Performing Haplotype-Based Scans for Positive Selection in Large Genomic Datasets", Mol Biol Evol., November 2015, 32(11):3027-9, doi: 10.1093/molbev/msv172

Filipe Maia

2010

Filipe Maia, Chao Yang, Stefano Marchesini, "Compressive Auto-Indexing in Femtosecond Nanocrystallography", Lawrence Berkeley National Laboratory technical report, 2010, LBNL 4008E,

Filipe Maia, Alastair MacDowell, Stefano Marchesini, Howard A. Padmore, Dula Y. Parkinson, Jack Pien, Andre Schirotzek, Chao Yang, "Compressive Phase Contrast Tomography", SPIE Optics and Photonics, San Diego, CA, 2010,

Tareq Majed Malas

2016

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24

Tareq Malas, Thorsten Kurth, Jack Deslippe, "Optimization of the sparse matrix-vector products of an IDR Krylov iterative solver in EMGeo for the Intel KNL manycore processor", Springer Lecture Notes in Computer Science, October 6, 2016,

T. Malas, J. Hornich, G. Hager, H. Ltaief, C. Pflaum, D. Keyes, "Optimization of an electromagnetics code with multi-core wavefront diamond blocking and multi-dimensional intra-tile parallelization", International Parallel and Distributed Processing Symposium, 2016,

2015

T. Malas, G. Hager, H. Ltaief, H. Stengel, G. Wellein, D. Keyes, "Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates", SIAM Journal on Scientific Computing, 2015, 37:C439-C464, doi: 10.1137/140991133

T. Malas, G. Hager, H. Ltaief, D. Keyes, Advanced tiling techniques for memory-starved streaming numerical kernels, 2015,

T. Malas, G. Hager, H. Ltaief, D. Keyes, Multi-dimensional intra-tile parallelization for memory-starved stencil computations, arXiv preprint arXiv:1510.04995, 2015,

T. M. Malas, Tiling and Asynchronous Communication Optimizations for Stencil Computations, 2015,

2014

T. Malas, G. Hager, H. Ltaief, D. Keyes, Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking, arXiv preprint arXiv:1410.5561, 2014,

T. Malas, G. Hager, H. Ltaief, H. Stengel, G. Wellein, D. Keyes, Optimizing Stencil Computations: Multicore-optimized wavefront diamond blocking on Shared and Distributed Memory Systems, 2014,

2013

Tareq Malas, Aron J. Ahmadia, Jed Brown, John A. Gunnels, David E. Keyes, "Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor", International Journal of High Performance Computing Applications, 2013, 27:193-209, doi: 10.1177/1094342012444795

2011

T. M. Malas, Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450, 2011,

Krishna Muriki

2010

Keith R. Jackson, Ramakrishnan, Muriki, Canon, Cholia, Shalf, J. Wasserman, Nicholas J. Wright, "Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud", CloudCom, January 1, 2010, 159-168,

Hai Ah Nam

2016

K.S. Hemmert, M. Rajan, R. Hoekstra, M.W. Glass, S.D. Hammond, S. Dawson, M. Vigil, D. Grunau, J. Lujan, D. Morton, H. Nam, P. Peltz Jr., A. Torrez, C. Wright, "Trinity: Architecture and Early Experience", Proceedings of the Cray Users Group Conference, London, UK, May 2016,

J.A. Ang, J. Cook, S.P. Domino, M.W. Glass, S.D. Hammond, K.S. Hemmert, M.A. Heroux, R.J. Hoekstra, J.H. Laros III, P.T. Lin, H. Nam, R. Neely, A.F. Rodrigues, C.R. Trott, "Exascale Co-design Progress and Accomplishments", in the Advances in Parallel Computing series, G. Fox, V. Getov, L. Grandinetti, G. Joubert, T. Sterling (Eds.), Convergence of Big Data and High-Performance Computing, IOS Press, Amsterdam, New York, Tokyo, SAND2017-8788B, (2016)

2015

Stephen Lien Harrell, Hai Ah Nam, Verónica G. Vergara Larrea, Kurt Keville, Dan Kamalic, "Student Cluster Competition: A Multi-Disciplinary Undergraduate HPC Educational Tool", EduHPC '15, New York, NY, USA, Association for Computing Machinery, 2015, doi: 10.1145/2831425.2831428

2014

S. Parete-Koon, B. Caldwell, S. Canon, E. Dart, J. Hick, J. Hill, C. Layton, D. Pelfrey, G. Shipman, D. Skinner, H. Nam, J. Wells, J. Zurawski, "HPC’s Pivot to Data", Proceedings of the Cray Users Group Conference, Lugano, Switzerland, May 2014,

2013

S. Bogner, A. Bulgac, J. Carlson, J. Engel, G. Fann, R.J. Furnstahl, S. Gandolfi, G. Hagen, M. Horoi, C. Johnson, M. Kortelainen, E. Lusk, P. Maris, H. Nam, P. Navratil, W. Nazarewicz, E. Ng, G.P.A. Nobre, E. Ormand, T. Papenbrock, J. Pei, S.C. Pieper, S. Quaglioni, K.J. Roche, J. Sarich, N. Schunck, M. Sosonkina, J. Terasaki, I. Thompson, J.P. Vary, S.M. Wild, "Computational nuclear quantum many-body problem: The UNEDF project", Computer Physics Communications 184, 2235-2250 (2013), October 1, 2013,

H. Nam and D.J. Dean, "Re-entrance in nuclei: competitive phenomena", J. Phys.: Conf. Ser. 445 012029 (2013), July 2013,

M. V. Stoitsov, N. Schunck, M. Kortelainen, N. Michel, H. Nam, E. Olsen, J. Sarich, S. Wild, "Axially deformed solution of the Skyrme-Hartree-Fock-Bogolyubov equations using the transformed harmonic oscillator basis (II): HFBTHO v2.00c, a new version of the program", arXiv:1210.1825 [nucl-th], 2013,

Hai Ah Nam, Jason Hill, Suzanne Parete-Koon, "The Practical Obstacles of Data Transfer: Why Researchers Still Love Scp", NDM '13, New York, NY, USA, Association for Computing Machinery, 2013, doi: 10.1145/2534695.2534703

2012

D. Duke, H. Carr, A. Knoll, N. Schunck, H. Nam, A. Staszczak, "Visualizing Nuclear Scission Through a Multifield Extension of Topological Analysis", IEEE Trans. Visualization and Computer Graphics, vol. 18, no. 12, pp. 2033-2040, December 2012,

D. A. Pigg, G. Hagen, H. Nam, T. Papenbrock, "Time-dependent coupled-cluster method for atomic nuclei", Phys. Rev. C 86, 014308 (2012), July 2012,

H. Nam, M. Stoitsov, W. Nazarewicz, A. Bulgac, G. Hagen, M. Kortelainen, P. Maris, J. C. Pei, K. J. Roche, N. Schunck, I. Thompson, J. P. Vary, S. M. Wild, "UNEDF: Advanced Scientific Computing Collaboration Transforms the Low-Energy Nuclear Many-Body Problem", J. Phys.: Conf. Ser. 402 012033, 2012,

2011

G. Hagen and H. Nam, "Computational aspects of nuclear coupled-cluster theory", Proceedings of the Yukawa International Seminar 2011 (YKIS2011) and Long-Term Workshop on Dynamics and Correlations in Exotic Nuclei (DCEN2011), October 2011,

M. Stoitsov, H. Nam, W. Nazarewicz, A. Bulgac, G. Hagen, M. Kortelainen, J. C. Pei, K. J. Roche, N. Schunck, I. Thompson, J.P. Vary, S. M. Wild, "UNEDF: Advanced Scientific Computing Transforms the Low-Energy Nuclear Many-Body Problem", Proceedings for SciDAC 2011, Denver, CO, July 2011,

Rebecca J. Hartman-Baker, Hai Ah Nam, "Optimizing Nuclear Physics Codes on the XT5", Proceedings of CUG 2011, 2011,

P. Maris, J.P. Vary, P. Navratil, W.E. Ormand, H. Nam, D.J. Dean, "Origin of the anomalous long lifetime of ¹⁴C", Phys. Rev. Lett. 106, 202502, 2011,

2010

D.J. Dean, K. Langanke, H. Nam, W. Nazarewicz, "Reentrance Phenomenon in Heated Rotating Nuclei in the Shell Model Monte Carlo Approach", Phys. Rev. Lett. 105, 212504, November 2010,

Fernando Fuentes, Houssain Kettani, George Ostrouchov, Mario Stoitsov, Hai Ah Nam, "Exploration of High-Dimensional Nuclei Data", ICCSN, pp. 521-524, 2010 Second International Conference on Communication Software and Networks, February 2010,

2009

W. Joubert, D. Kothe, H. Nam, "Preparing for Exascale: ORNL Leadership Computing Facility Application Requirements and Strategy", https://www.olcf.ornl.gov/olcf-media/center-reports/, December 2009,

H. Nam, D. J. Dean, J.P. Vary, and P. Maris, "Computing Atomic Nuclei on the Cray XT5", Proceedings of the Cray Users Group Conference, Atlanta, GA, May 2009,

2006

Calvin W. Johnson and Hai Ah Nam, "Collective behavior in random interaction", Revista Mexicana de Fisica S 52(4) (2006) 44-48, November 2006,

Praveen Narayanan

2013

Alice Koniges, Praveen Narayanan, Robert Preissl, Xuefei Yuan, Proxy Design and Optimization in Fusion and Accelerator Physics, SIAM Conference on Computational Science and Engineering, February 25, 2013,

2011

Sean Farley, Ben Dudson, Praveen Narayanan, Lois Curfman McInnes, Maxim Umansky, Xueqiao Xu, Satish Balay, John Cary, Alice Koniges, Carol Woodward, Hong Zhang, "BOUT++: Performance Characterization and Recent Advances in Design", International Conference on Numerical Simulations of Plasmas, Long Beach, New Jersey, 2011,

Praveen Narayanan, Alice Koniges, Leonid Oliker, Robert Preissl, Samuel Williams, Nicholas J Wright, Maxim Umansky, Xueqiao Xu, Benjamin Dudson, Stephane Ethier, Weixing Wang, Jeff Candy, John R. Cary, "Performance Characterization for Fusion Co-design Applications", Proceedings of CUG, 2011,

Trever Christian Nightingale

2013

Trever Nightingale, Introduction to Google Chromebooks, LBL Tech Day Lightning Talk, October 10, 2013,

An introduction to Google Chromebooks, covering the good parts reviewers often leave out.

2011

T. Nightingale, No Cost ZFS On Low Cost Hardware, NERSC High Performance Computing Seminar, June 14, 2011,

Today's data generation rates and terabyte hard drives have led to a new breed of commodity servers that have very large filesystems. This talk looks at the implications and describes the decision to deploy ZFS under FreeBSD on NERSC servers. Included will be a look at pertinent ZFS features, configuration decisions adopted, experiences with ZFS so far, and how we are using the features ZFS brings with it to gain some new functionality that was not possible with previous filesystems.

Leonid Oliker

2016

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24

2013

Hongzhang Shan, Brian Austin, Wibe De Jong, Leonid Oliker, Nicholas Wright, Edoardo Apra, "Performance Tuning of Fock Matrix and Two-Electron Integral Calculations for NWChem on Leading HPC Platforms", SC'13, November 11, 2013,

2011

J. Dongarra, P. Beckman, T. Moore, P. Aerts, G. Aloisio, J.C. Andre, D. Barkai, J.Y. Berthou, T. Boku, B. Braunschweig, others, "The international exascale software project roadmap", International Journal of High Performance Computing Applications, January 2011, 25:3-60,

Robert Preissl, Wichmann, Long, Shalf, Ethier, Alice E. Koniges, "Multithreaded global address space communication techniques for gyrokinetic fusion applications on ultra-scale platforms", SC, 2011, 15,

Kamesh Madduri, Z. Ibrahim, Williams, Im, Ethier, Shalf, Leonid Oliker, "Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems", SC, January 1, 2011, 23,

Samuel Williams, Oliker, Carter, John Shalf, "Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning", SC, January 1, 2011, 55,

Jens Krueger, Donofrio, Shalf, Mohiyuddin, Williams, Oliker, Franz-Josef Pfreund, "Hardware/software co-design for energy-efficient seismic modeling", SC, January 1, 2011, 73,

2010

Jack Dongarra, John Shalf, David Skinner, Kathy Yelick, "International Exascale Software Project (IESP) Roadmap, version 1.1", October 18, 2010,

Gilbert Hendry, Johnnie Chan, Shoaib Kamil, Lenny Oliker, John Shalf, Luca P. Carloni, and Keren Bergman, "Silicon Nanophotonic Network-On-Chip Using TDM Arbitration", IEEE Symposium on High Performance Interconnects (HOTI) 5.1, August 2010,

S. Ethier, M. Adams, J. Carter, L. Oliker, "Petascale Parallelization of the Gyrokinetic Toroidal Code", VECPAR: High Performance Computing for Computational Science, June 2010,

A. Chandramowlishwaran, S. Williams, L. Oliker, I. Lashuk, G. Biros, R. Vuduc, "Optimizing and Tuning the Fast Multipole Method for State-of-the-Art Multicore Architectures", Proceedings of 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), April 2010,

Shoaib Kamil, Chan, Oliker, Shalf, Samuel Williams, "An auto-tuning framework for parallel multicore stencil computations", IPDPS, January 1, 2010, 1-12,

K. Datta, S. Williams, V. Volkov, J. Carter, L. Oliker, J. Shalf, K. Yelick, "Auto-Tuning Stencil Computations on Diverse Multicore Architectures", Scientific Computing with Multicore and Accelerators, edited by Jakub Kurzak, David A. Bader, Jack Dongarra, 2010,

Shoaib Kamil, Oliker, Pinar, John Shalf, "Communication Requirements and Interconnect Optimization for High-End Scientific Applications", IEEE Trans. Parallel Distrib. Syst., 2010, 21:188-202,

Andrew Uselton, Howison, J. Wright, Skinner, Keen, Shalf, L. Karavanic, Leonid Oliker, "Parallel I/O performance: From events to ensembles", IPDPS, 2010, 1-11,

2009

David Donofrio, Oliker, Shalf, F. Wehner, Rowen, Krueger, Kamil, Marghoob Mohiyuddin, "Energy-Efficient Computing for Extreme-Scale Science", IEEE Computer, January 1, 2009, 42:62-71,

Samuel Williams, Carter, Oliker, Shalf, Katherine A. Yelick, "Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms", J. Parallel Distrib. Comput., January 1, 2009, 69:762-777,

Kaushik Datta, Kamil, Williams, Oliker, Shalf, Katherine A. Yelick, "Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors", SIAM Review, January 1, 2009, 51:129-159,

2008

M. Wehner, L. Oliker, J. Shalf, Ultra-Efficient Exascale Scientific Computing, talk, January 1, 2008,

Samuel Williams, Carter, Oliker, Shalf, Katherine A. Yelick, "Lattice Boltzmann simulation optimization on leading multicore platforms", IPDPS, January 1, 2008, 1-14,

Kaushik Datta, Murphy, Volkov, Williams, Carter, Oliker, A. Patterson, Shalf, Katherine A. Yelick, "Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures", SC, January 1, 2008, 4,

Leonid Oliker, Canning, Carter, Shalf, Stéphane Ethier, "Scientific Application Performance On Leading Scalar and Vector Supercomputing Platforms", IJHPCA, January 1, 2008, 22:5-20,

Michael F. Wehner, Oliker, John Shalf, "Towards Ultra-High Resolution Models of Climate and Weather", IJHPCA, January 1, 2008, 22:149-165,

2007

Leonid Oliker, Canning, Carter, Iancu, Lijewski, Kamil, Shalf, Shan, Strohmaier, Ethier, Tom Goodale, "Scientific Application Performance on Candidate PetaScale Platforms", IPDPS, January 1, 2007, 1-12,

Julian Borrill, Oliker, Shalf, Hongzhang Shan, "Investigation of leading HPC I/O performance using a scientific-application derived benchmark", SC, January 1, 2007, 10,

S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, J. Demmel, Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms, International Conference for High-Performance Computing, Networking, Storage, and Analysis, January 1, 2007,

Shoaib Kamil, Pinar, Gunter, Lijewski, Oliker, John Shalf, "Reconfigurable hybrid interconnection for static and dynamic scientific applications", Conf. Computing Frontiers, January 1, 2007, 183-194,

John Shalf, Shoaib Kamil, David Skinner, Leonid Oliker, Interconnect Requirements for HPC Applications, talk, January 1, 2007,

Samuel Williams, Oliker, W. Vuduc, Shalf, A. Yelick, James Demmel, "Optimization of sparse matrix-vector multiplication on emerging multicore platforms", SC, January 1, 2007, 38,

Samuel Williams, Shalf, Oliker, Kamil, Husbands, Katherine A. Yelick, "Scientific Computing Kernels on the Cell Processor", International Journal of Parallel Programming, January 1, 2007, 35:263-298,

2006

Jonathan Carter, Tony Drummond, Parry Husbands, Paul Hargrove, Bill Kramer, Osni Marques, Esmond Ng, Lenny Oliker, John Shalf, David Skinner, Kathy Yelick, "Software Roadmap to Plug and Play Petaflop/s", Lawrence Berkeley National Laboratory Technical Report, #59999, July 31, 2006,

Samuel Williams, Shalf, Oliker, Kamil, Husbands, Katherine A. Yelick, "The potential of the cell processor for scientific computing", Conf. Computing Frontiers, January 1, 2006, 9-20,

Jonathan Carter, Oliker, John Shalf, "Performance Evaluation of Scientific Applications on Modern Parallel Vector Systems", VECPAR, January 1, 2006, 490-503,

L. Oliker, S. Kamil, A. Canning, J. Carter, C. Iancu, J. Shalf, H. Shan, D. Skinner, E. Strohmaier, T. Goodale, "Application Scalability and Communication Signatures on Leading Supercomputing Platforms", January 1, 2006,

Shoaib Kamil, Datta, Williams, Oliker, Shalf, Katherine A. Yelick, "Implicit and explicit optimizations for stencil computations", Memory System Performance and Correctness, January 1, 2006, 51-60,

2005

John Shalf, Kamil, Oliker, David Skinner, "Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect", SC, January 1, 2005, 17,

Horst Simon, William Kramer, William Saphir, John Shalf, David Bailey, Leonid Oliker, Michael Banda, C. William McCurdy, John Hules, Andrew Canning, Marc Day, Philip Colella, David Serafini, Michael Wehner, Peter Nugent, "Science-Driven System Architecture: A New Process for Leadership Class Computing", Journal of the Earth Simulator, January 1, 2005, 2,

S. Kamil, J. Shalf, L. Oliker, D. Skinner, "Understanding ultra-scale application communication requirements", Proceedings of the 2005 IEEE International Workload Characterization Symposium, January 1, 2005, 178-187,

Shoaib Kamil, Husbands, Oliker, Shalf, Katherine A. Yelick, "Impact of modern memory subsystems on cache optimizations for stencil computations", Memory System Performance, January 1, 2005, 36-43,

Leonid Oliker, Canning, Carter, Shalf, Skinner, Ethier, Biswas, Jahed Djomehri, Rob F. Van der Wijngaart, "Performance evaluation of the SX-6 vector architecture for scientific computations", Concurrency - Practice and Experience, January 1, 2005, 17:69-93,

2004

Gorden Griem, Oliker, Shalf, Katherine A. Yelick, "Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe", IPDPS, January 1, 2004,

Leonid Oliker, Canning, Carter, Shalf, Stéphane Ethier, "Scientific Computations on Modern Parallel Vector Systems", SC, January 1, 2004, 10,

2003

Leonid Oliker, Canning, Carter, Shalf, Skinner, Ethier, Biswas, Jahed Djomehri, Rob F. Van der Wijngaart, "Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations", SC, January 1, 2003, 38,

Andrey Ovsyannikov

2016

A. Ovsyannikov, Performance Advantages of Using a Burst Buffer for Scientific Workflows, Bay Area Scientific Computing Day (BASCD 2016), December 3, 2016,

Debbie Bard, Wahid Bhimji, David Paul, Glenn K Lockwood, Nicholas J Wright, Katie Antypas, Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, others, "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016, LBNL LBNL-1007120,

A. Ovsyannikov, Case study: Chombo-Crunch and VisIt for carbon sequestration, Supercomputing Conference, Birds of a Feather: "Burst Buffers: Early Experiences and Outlook", November 15, 2016,

Andrey Ovsyannikov, Melissa Romanus, Brian Van Straalen, Gunther H Weber, David Trebotich, "Scientific Workflows at DataWarp-Speed: Accelerated Data-Intensive Science Using NERSC's Burst Buffer", 2016 1st Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems (PDSW-DISCS), Salt Lake City, UT, USA, IEEE, November 14, 2016, 1-6, LBNL LBNL-1006680, doi: 10.1109/PDSW-DISCS.2016.005

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL LBNL-1006681, doi: 10.1109/PMBS.2016.010

Andrey Ovsyannikov, Enabling high-performance simulation of subsurface flows with Chombo-Crunch on Intel Xeon Phi, 2016 IXPUG US Annual Meeting, September 21, 2016,

Andrey Ovsyannikov, Science with the Burst Buffer, NERSC Data Day, August 22, 2016,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprising over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use cases for the Burst Buffer at NERSC are therefore correspondingly numerous and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge, this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads, and these lessons will therefore be of considerable benefit in shaping the developing use of Burst Buffers at HPC centers.

Andrey Ovsyannikov, Chombo-Crunch and VisIt for Carbon Sequestration and In-Transit Data Analysis Using Burst Buffers, DOE Centers of Excellence Performance Portability Meeting, April 21, 2016,

2015

Andrey Ovsyannikov, Brian Van Straalen, Daniel Graves, David Trebotich, "Porting Chombo-Crunch to next-generation architecture", Bay Area Scientific Computing Day, December 11, 2015,

A. Ovsyannikov, D. Kim, A. Mani, P. Moin, "Numerical study of turbulent two-phase Couette flow", pp. 41-52, January 1, 2015,

2014

V. Sabelnikov, A. Ovsyannikov, M. Gorokhovski, "Modified level set equation and its numerical assessment", Journal of Computational Physics, December 1, 2014, 278:1-30, doi: 10.1016/j.jcp.2014.08.018

D. Khotyanovsky, A. Kudryavtsev, A. Ovsyannikov, "A comparative study of accuracy of shock capturing schemes for simulation of shock/acoustic wave interactions", International Journal of Aeroacoustics, June 1, 2014, 13:261-274, doi: 10.1260/1475-472X.13.3-4.261

2012

A. Ovsyannikov, V. Sabelnikov, M. Gorokhovski, "A new level set equation and its numerical assessments", Proceedings of the Summer Program, Center for Turbulence Research, Stanford University, November 1, 2012, 315-324,

V. Sabelnikov, A. Ovsyannikov, M. Gorokhovski, "Modified level set equation for gas-liquid interface and its numerical solution", ICLASS 2012, 12th Triennial International Conference on Liquid Atomization and Spray Systems, Heidelberg, Germany, September 2, 2012,

V. Sabelnikov, A. Ovsyannikov, M. Gorokhovski, "Modified level set equation for gas-liquid interface and its numerical solution", ECCOMAS 2012 - European Congress on Computational Methods in Applied Sciences and Engineering, Vienna, Austria, September 1, 2012,

2010

A. Kudryavtsev, A. Ovsyannikov, "Numerical investigation of the interaction of acoustic disturbances with a shock wave", TsAGI Science Journal, January 1, 2010, 41:47-57, doi: 10.1615/TsAGISciJ.v41.i1.50

Jefferson R. Porter

2016

Lisa Gerhardt, Jeff Porter, Nick Balthaser, Lessons Learned from Running an HPSS Globus Endpoint, 2016 HPSS User Forum, September 1, 2016,

The NERSC division of LBNL has been running HPSS in production since 1998. The archive is quite popular, with roughly 100 TB of I/O every day from the ~6000 scientists who use the NERSC facility. We maintain a Globus-HPSS endpoint that transfers over 1 PB/month of data into and out of HPSS. Getting Globus and HPSS to mesh well can be challenging. This talk gives an overview of some of the lessons learned.

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

2008

Shreyas Cholia, R. Jefferson Porter, "Publication and Protection of Sensitive Site Information in a Grid Infrastructure", 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), pp. 639-644; IJGHPC 1(2): 56-73 (2009), May 30, 2008,

Robert Preissl

2013

Alice Koniges, Praveen Narayanan, Robert Preissl, Xuefei Yuan, Proxy Design and Optimization in Fusion and Accelerator Physics, SIAM Conference on Computational Science and Engineering, February 25, 2013,

2012

Alice Koniges, Katherine Yelick, Rolf Rabenseifner, Reinhold Bader, David Eder, Filip Blagojevic, Robert Preissl, Paul Hargrove, Introduction to PGAS (UPC and CAF) and Hybrid for Multicore Programming, SC12 Full Day Tutorial, November 2012,

2011

Praveen Narayanan, Alice Koniges, Leonid Oliker, Robert Preissl, Samuel Williams, Nicholas J Wright, Maxim Umansky, Xueqiao Xu, Benjamin Dudson, Stephane Ethier, Weixing Wang, Jeff Candy, John R. Cary, "Performance Characterization for Fusion Co-design Applications", Proceedings of CUG, 2011,

2010

Robert Preissl, Bronis R. de Supinski, Martin Schulz, Daniel J. Quinlan, Dieter Kranzlmüller, Thomas Panas, "Exploitation of Dynamic Communication Patterns through Static Analysis", Proc. International Conference on Parallel Processing (ICPP), September 13, 2010,

Alice Koniges, Robert Preissl, Jihan Kim, D Eder, A Fisher, N Masters, V Mlaker, S Ethier, W Wang, M Head-Gordon, N Wichmann, "Application acceleration on current and future Cray platforms", Proc. Cray User Group Meeting, Edinburgh, Scotland, May 2010,

Alice Koniges, Robert Preissl, Stephan Ethier, John Shalf, What’s Ahead for Fusion Computing?, International Sherwood Fusion Theory Conference, April 2010,

Robert Preissl, Alice Koniges, Stephan Ethier, Weixing Wang, Nathan Wichmann, "Overlapping communication with computation using OpenMP tasks on the GTS magnetic fusion code", Scientific Programming, 2010, 18:139-151, doi: 10.3233/SPR-2010-0311

Evan Racah

2016

Evan Racah, Seyoon Ko, Peter Sadowski, Wahid Bhimji, Craig Tull, Sang-Yun Oh, Pierre Baldi, Prabhat, "Revealing Fundamental Physics from the Daya Bay Neutrino Experiment using Deep Neural Networks", ICMLA, 2016,

Michael Ringenburg, Shuxia Zhang, Kristyn Maschhoff, Bill Sparks, Evan Racah, Prabhat, "Characterizing the Performance of Analytics Workloads on the Cray XC40", Cray User Group, May 13, 2016,

Jialin Liu, Evan Racah, Quincey Koziol, Richard Shane Canon, Alex Gittens, Lisa Gerhardt, Suren Byna, Mike F. Ringenburg, Prabhat, "H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems", Cray User Group, May 13, 2016,

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016,

Review of current DAS activities for a non-NERSC audience.

Mostofa Patwary, Nadathur Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey, "PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures", IPDPS 2016, April 5, 2016,

Alex Gittens, Jey Kottalam, Jiyan Yang, Michael F Ringenburg, Jatin Chhugani, Evan Racah, Mohitdeep Singh, Yushu Yao, Curt Fischer, Oliver Ruebel, Benjamin Bowen, Norman Lewis, Michael W Mahoney, Venkat Krishnamurthy, Prabhat, "A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark", The 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, IPDPS, February 1, 2016,

2015

Prabhat, Yunjie Liu, Evan Racah, Joaquin Correa, Amir Khosrowshahi, David Lavers, Kenneth Kunkel, Michael Wehner, William D. Collins, "Deep Learning for Climate Pattern Detection", American Geophysical Union Meeting 2015, December 8, 2015,

Prabhat, Kris Bouchard, Wahid Bhimji, Evan Racah, "Deep Learning for Science", NERSC Science Highlight, December 8, 2015,

Evan Racah, Silvia Crivelli, Yushu Yao, "Machine Learning with Spark: Exploring MLlib Random Forests Performance on Edison", BIDS Data Science Faire, May 15, 2015,

Eric Roman

2015

Brian Austin, Eric Roman, Xiaoye Sherry Li, "Resilient Matrix Multiplication of Hierarchical Semi-Separable Matrices", Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, Portland, OR, June 15, 2015,

Massimiliano Albanese, Michael Berry, David Brown, Scott Campbell, Stephen Crago, George Cybenko, Jon DeLapp, Christopher L. DeMarco, Jeff Draper, Manuel Egele, Stephan Eidenbenz, Tina Eliassi-Rad, Vergle Gipson, Ryan Goodfellow, Paul Hovland, Sushil Jajodia, Cliff Joslyn, Alex Kent, Sandy Landsberg, Larry Lanes, Carolyn Lauzon, Steven Lee, Sven Leyffer, Robert Lucas, David Manz, Celeste Matarazzo, Jackson R. Mayo, Anita Nikolich, Masood Parvania, Garrett Payer, Sean Peisert, Ali Pinar, Thomas Potok, Stacy Prowell, Eric Roman, David Sarmanian, Dylan Schmorrow, Chris Strasburg, V.S. Subrahmanian, Vipin Swarup, Brian Tierney, Von Welch, "ASCR Cybersecurity for Scientific Computing Integrity", DOE Workshop Report, January 7, 2015,

At the request of the U.S. Department of Energy’s (DOE) Advanced Scientific Computing Research (ASCR) program, a workshop was held January 7–9, 2015, in Rockville, Md., to examine computer security research gaps and approaches for assuring scientific computing integrity specific to the mission of the DOE Office of Science. Issues included research computation and simulation that takes place on ASCR computing facilities and networks, as well as network-connected scientific instruments, such as those run by other DOE Office of Science programs. Workshop participants included researchers and operational staff from DOE national laboratories, as well as academic researchers and industry experts. Participants were selected based on the prior submission of abstracts relating to the topic. Additional input came from previous DOE workshop reports [DOE08,BB09] relating to security. Several observers from DOE and the National Science Foundation also attended.

2014

Alex Druinsky, Brian Austin, Xiaoye Sherry Li, Osni Marques, Eric Roman, Samuel Williams, "A Roofline Performance Analysis of an Algebraic Multigrid PDE Solver", SC14, November 2014,

2011

Khaled Z. Ibrahim, S. Hofmeyr, Eric Roman, "Optimized Pre-Copy Live Migration for Memory Intensive Applications", The International Conference for High Performance Computing, Networking, Storage, and Analysis, 2011,

Melissa Romanus

2016

Debbie Bard, Wahid Bhimji, David Paul, Glenn K Lockwood, Nicholas J Wright, Katie Antypas, Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, others, "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016, LBNL LBNL-1007120,

Andrey Ovsyannikov, Melissa Romanus, Brian Van Straalen, Gunther H Weber, David Trebotich, "Scientific Workflows at DataWarp-Speed: Accelerated Data-Intensive Science Using NERSC's Burst Buffer", 2016 1st Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems (PDSW-DISCS), Salt Lake City, UT, USA, IEEE, November 14, 2016, 1-6, LBNL LBNL-1006680, doi: 10.1109/PDSW-DISCS.2016.005

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprising over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use cases for the Burst Buffer at NERSC are therefore correspondingly numerous and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge, this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads, and these lessons will therefore be of considerable benefit in shaping the developing use of Burst Buffers at HPC centers.

Hongzhang Shan

2016

Alice Koniges, Brandon Cook, Jack Deslippe, Thorsten Kurth, Hongzhang Shan, "MPI usage at NERSC: Present and Future", EuroMPI 2016, Edinburgh, Scotland, UK, September 26, 2016,

2013

Hongzhang Shan, Brian Austin, Wibe De Jong, Leonid Oliker, Nicholas Wright, Edoardo Apra, "Performance Tuning of Fock Matrix and Two-Electron Integral Calculations for NWChem on Leading HPC Platforms", SC'13, November 11, 2013,

2012

Hongzhang Shan, Brian Austin, Nicholas Wright, Erich Strohmaier, John Shalf, Katherine Yelick, "Accelerating Applications at Scale Using One-Sided Communication", The 6th Conference on Partitioned Global Address Programming Models, Santa Barbara, CA, October 10, 2012,

Hongzhang Shan, J. Wright, Shalf, A. Yelick, Wagner, Nathan Wichmann, "A preliminary evaluation of the hardware acceleration of Cray Gemini interconnect for PGAS languages and comparison with MPI", SIGMETRICS Performance Evaluation Review, 2012, 40:92-98,

2010

Hongzhang Shan, Haoqiang Jin, Karl Fuerlinger, Alice Koniges, Nicholas J Wright, "Analyzing the effect of different programming models upon performance and memory usage on Cray XT5 platforms", CUG2010, Edinburgh, Scotland, 2010,

2009

Zhengji Zhao, Juan Meza, Byounghak Lee, Hongzhang Shan, Erich Strohmaier, David Bailey, "Linearly Scaling 3D Fragment Method for Large-Scale Electronic Structure Calculations", 2009 J. Phys.: Conf. Ser. 180 012079, July 1, 2009,

2008

Lin-Wang Wang, Byounghak Lee, Hongzhang Shan, Zhengji Zhao, Juan Meza, Erich Strohmaier, David Bailey, "Linearly Scaling 3D Fragment Method for Large-Scale Electronic Structure Calculations", an award-winning paper (ACM Gordon Bell Prize for algorithm innovation at SC08), Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, Article No. 65 (2008), November 20, 2008,

H. Shan, K. Antypas, J. Shalf, "Characterizing and Predicting the I/O Performance of HPC Applications Using a Parameterized Synthetic Benchmark", Supercomputing, Reno, NV, November 17, 2008,

John Shalf, Hongzhang Shan, Katie Antypas, I/O Requirements for HPC Applications, talk, January 1, 2008,

2007

Jonathan Carter, Yun (Helen) He, John Shalf, Hongzhang Shan, Erich Strohmaier, and Harvey Wasserman, "The Performance Effect of Multi-Core on Scientific Applications", Cray User Group 2007, May 2007, LBNL 62662,

The historical trend of increasing single CPU performance has given way to a roadmap of increasing core counts. The challenge of effectively utilizing these multi-core chips is just starting to be explored by vendors and application developers alike. In this study, we present performance measurements of several complete scientific applications on single- and dual-core Cray XT3 and XT4 systems with a view to characterizing the effects of switching to multi-core chips. We consider effects within a node by using applications run at low concurrencies, and also effects on node-interconnect interaction using higher concurrency results. Finally, we construct a simple performance model based on the principal on-chip shared resource, memory bandwidth, and use this to predict the performance of the forthcoming quad-core system.
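
The bandwidth-sharing argument in this abstract can be written compactly. As a minimal sketch (the symbols are chosen here for illustration and are not the paper's notation): if each core performs F flops at a peak rate R_peak and moves Q bytes of memory traffic, while c active cores share a chip bandwidth B_chip, the per-core runtime is bounded by

\[ t(c) \approx \max\left( \frac{F}{R_{\mathrm{peak}}}, \; \frac{c\,Q}{B_{\mathrm{chip}}} \right) \]

Once cQ/B_chip exceeds F/R_peak, adding cores on the chip no longer reduces runtime; this is the kind of saturation effect such a model predicts when moving from dual-core to quad-core parts.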

Jonathan Carter, Helen He, John Shalf, Erich Strohmaier, Hongzhang Shan, and Harvey Wasserman, The Performance Effect of Multi-Core on Scientific Applications, Cray User Group 2007, May 2007,

J. Levesque, J. Larkin, M. Foster, J. Glenski, G. Geissler, S. Whalen, B. Waldecker, J. Carter, D. Skinner, H. He, H. Wasserman, J. Shalf, H. Shan, "Understanding and mitigating multicore performance issues on the AMD opteron architecture", March 1, 2007, LBNL 62500,

Over the past 15 years, microprocessor performance has doubled approximately every 18 months through increased clock rates and processing efficiency. In the past few years, clock frequency growth has stalled, and microprocessor manufacturers such as AMD have moved towards doubling the number of cores every 18 months in order to maintain historical growth rates in chip performance. This document investigates the ramifications of multicore processor technology on the new Cray XT4 systems based on AMD processor technology. We begin by walking through the AMD single-core, dual-core, and upcoming quad-core processor architectures. This is followed by a discussion of methods for collecting performance counter data to understand code performance on the Cray XT3 and XT4 systems. We then use the performance counter data to analyze the impact of multicore processors on the performance of microbenchmarks such as STREAM, application kernels such as the NAS Parallel Benchmarks, and full application codes that comprise the NERSC-5 SSP benchmark suite. We explore compiler options and software optimization techniques that can mitigate the memory bandwidth contention that can reduce computing efficiency on multicore processors. The last section provides a case study of applying the dual-core optimizations to the NAS Parallel Benchmarks to dramatically improve their performance.
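
To make the STREAM-style bandwidth measurements mentioned above concrete, here is a minimal sketch of such a microbenchmark (array sizes and the OpenMP setup are illustrative, not the report's exact configuration). Running it with increasing OMP_NUM_THREADS on a multicore node exposes the contention for shared memory bandwidth:

    /* Minimal STREAM-style triad; compile with, e.g., cc -O2 -fopenmp */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N (1L << 25)   /* ~33M doubles per array, large enough to defeat cache */

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        if (!a || !b || !c) return 1;

        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        double t0 = omp_get_wtime();
        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];          /* triad: 2 reads + 1 write */
        double t1 = omp_get_wtime();

        /* three arrays of 8-byte elements are streamed per iteration */
        double gbytes = 3.0 * N * sizeof(double) / 1e9;
        printf("threads=%d  bandwidth=%.2f GB/s\n",
               omp_get_max_threads(), gbytes / (t1 - t0));
        free(a); free(b); free(c);
        return 0;
    }

Measured bandwidth typically stops scaling well before the core count does, which is the contention effect the report analyzes and that its optimizations target.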

Hongzhang Shan, John Shalf, Using IOR to Analyze the I/O Performance for HPC Platforms, CUG.org, January 1, 2007,

Julian Borrill, Leonid Oliker, John Shalf, Hongzhang Shan, "Investigation of leading HPC I/O performance using a scientific-application derived benchmark", SC, January 1, 2007, 10,

John Shalf, Hongzhang Shan, User Perspective on HPC I/O Requirements, talk, January 1, 2007,

2006

Hongzhang Shan, John Shalf, "Analysis of Parallel IO on Modern HPC Platforms", January 1, 2006,

L. Oliker, S. Kamil, A. Canning, J. Carter, C. Iancu, J. Shalf, H. Shan, D. Skinner, E. Strohmaier, T. Goodale, "Application Scalability and Communication Signatures on Leading Supercomputing Platforms", January 1, 2006,

Shahzeb Siddiqui

2014

Shahzeb Siddiqui, "Automatic Performance Tuning of Parallel and Accelerated Seismic Imaging Kernels", EAGE Workshop on High Performance Computing for Upstream, European Association of Geoscientists & Engineers, September 1, 2014, doi: 10.3997/2214-4609.20141941

Stephen C. Simms

2017

Harold E.B. Dennis, Adam S. Ward, Tyler Balson, Yuwei Li, Robert Henschel, Shawn Slavin, Stephen Simms, Holger Brunst, "High Performance Computing Enabled Simulation of the Food-Water-Energy System: Simulation of Intensively Managed Landscapes", PEARC17, New York, NY, USA, Association for Computing Machinery, 2017, 1--10, doi: 10.1145/3093338.3093381

2013

Michael Kluge, Stephen Simms, Thomas William, Robert Henschel, Andy Georgi, Christian Meyer, Matthias S. Mueller, Craig A. Stewart, Wolfgang Wünsch, Wolfgang E. Nagel, "Performance and quality of service of data and video movement over a 100 Gbps testbed", Future Generation Computer Systems, 2013, 29:230--240, doi: 10.1016/j.future.2012.05.028

2012

Robert Henschel, Stephen Simms, David Hancock, Scott Michael, Tom Johnson, Nathan Heald, Thomas William, Donald Berry, Matt Allen, Richard Knepper, Matthew Davy, Matthew Link, Craig A. Stewart, "Demonstrating lustre over a 100Gbps wide area network of 3,500km", SC '12, Washington, DC, USA, IEEE Computer Society Press, 2012, 1--8, doi: 10.1109/SC.2012.43

Scott Michael, Liang Zhen, Robert Henschel, Stephen Simms, Eric Barton, Matthew Link, "A study of lustre networking over a 100 gigabit wide area network with 50 milliseconds of latency", DIDC '12, New York, NY, USA, Association for Computing Machinery, 2012, 43--52, doi: 10.1145/2286996.2287005

2010

Joshua Walgenbach, Stephen C. Simms, Kit Westneat, Justin P. Miller, "Enabling Lustre WAN for production use on the TeraGrid: a lightweight UID mapping scheme", TG '10, New York, NY, USA, Association for Computing Machinery, 2010, 1--6, doi: 10.1145/1838574.1838593

Scott Michael, Stephen Simms, W. B. Breckenridge, Roger Smith, Matthew Link, "A compelling case for a centralized filesystem on the TeraGrid: enhancing an astrophysical workflow with the data capacitor WAN as a test case", TG '10, New York, NY, USA, Association for Computing Machinery, 2010, 1--7, doi: 10.1145/1838574.1838587

2008

Stephen C. Simms, Craig A. Stewart, Scott D. McCaulay, "Cyberinfrastructure resources for U.S. Scholarship: the TeraGrid", SIGUCCS '08, 2008, 341--344, doi: 10.1145/1449956.1450057

2007

Stephen C. Simms, Gregory G. Pike, S. Teige, Bret Hammond, Yu Ma, Larry L. Simms, C. Westneat, Douglas A. Balog, "Empowering distributed workflow with the data capacitor: maximizing lustre performance across the wide area network", SOCP '07, New York, NY, USA, Association for Computing Machinery, 2007, 53--58, doi: 10.1145/1272457.1272465

2006

Stephen C Simms, Matt Davy, Bret Hammond, Matt Link, Craig Stewart, Randall Bramley, Beth Plale, Dennis Gannon, Mu-Hyun Baik, Scott Teige, John Huffman, Rick McMullen, Doug Balog, Greg Pike, "All in a day's work: advancing data-intensive research with the data capacitor", SC '06, 2006, 244--es, doi: 10.1145/1188455.1188711

2004

I. Foster, J. Gieraltowski, S. Gose, N. Maltsev, E. May, A. Rodriguez, D. Sulakhe, A. Vaniachine, J. Shank, S. Youssef, D. Adams, R. Baker, W. Deng, J. Smith, D. Yu, I. Legrand, S. Singh, C. Steenberg, Y. Xia, A. Afaq, E. Berman, J. Annis, L. A. T. Bauerdick, M. Ernst, I. Fisk, L. Giacchetti, G. Graham, A. Heavey, J. Kaiser, N. Kuropatkin, R. Pordes, V. Sekhri, J. Weigand, Y. Wu, K. Baker, L. Sorrillo, J. Huth, M. Allen, L. Grundhoefer, J. Hicks, F. Luehring, S. Peck, R. Quick, S. Simms, G. Fekete, J. vandenBerg, K. Cho, K. Kwon, D. Son, H. Park, S. Canon, K. Jackson, D. E. Konerding, J. Lee, D. Olson, I. Sakrejda, B. Tierney, M. Green, R. Miller, J. Letts, T. Martin, D. Bury, C. Dumitrescu, D. Engh, R. Gardner, M. Mambelli, Y. Smirnov, J. Voeckler, M. Wilde, Y. Zhao, X. Zhao, P. Avery, R. Cavanaugh, B. Kim, C. Prescott, J. Rodriguez, A. Zahn, S. McKee, C. Jordan, J. Prewett, T. Thomas, H. Severini, B. Clifford, E. Deelman, L. Flon, C. Kesselman, G. Mehta, N. Olomu, K. Vahi, K. De, P. McGuigan, M. Sosebee, D. Bradley, P. Couvares, A. De Smet, C. Kireyev, E. Paulson, A. Roy, S. Koranda, B. Moe, B. Brown, P. Sheldon, "The Grid2003 Production Grid: Principles and Practice", IEEE Computer Society, 2004, 236--245, doi: 10.1109/HPDC.2004.36

Peng Wang, George Turner, Daniel A. Lauer, Matthew Allen, Stephen Simms, David Hart, Mary Papakhian, Craig A. Stewart, "LINPACK Performance on a Geographically Distributed Linux Cluster", IPDPS '04, IEEE Computer Society, 2004, 245b--245b, doi: 10.1109/IPDPS.2004.1303301

2001

Craig A. Stewart, Christopher S. Peebles, Mary Papakhian, John Samuel, David Hart, Stephen Simms, "High performance computing: delivering valuable and valued services at colleges and universities", SIGUCCS '01, New York, NY, USA, Association for Computing Machinery, 2001, 266--269, doi: 10.1145/500956.501026

Michael Stewart

2011

P. M. Stewart, Y. He, "Benchmark Performance of Different Compilers on a Cray XE6", Fairbanks, AK, CUG Proceedings, May 23, 2011,

There are four supported compilers on NERSC's recently acquired XE6, Hopper. Our users often request guidance from us in determining which compiler is best for a particular application. In this paper, we describe the comparative performance of the different compilers on several MPI benchmarks with different characteristics. For each compiler and benchmark, we establish the best set of optimization arguments to the compiler.
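
The study's approach generalizes to any Cray system: build the same kernel under each programming environment and sweep flag sets. A minimal sketch of the pattern follows; the module swap commands are the standard Cray PrgEnv mechanism, but the flags in the comments are illustrative, not the paper's tuned argument sets:

    /* Build the same kernel under each compiler, e.g.:
     *   module swap PrgEnv-pgi PrgEnv-cray && cc -O2 kernel.c
     *   module swap PrgEnv-cray PrgEnv-gnu && cc -O3 kernel.c
     */
    #include <stdio.h>

    /* a simple kernel whose generated code differs across compilers */
    double dot(const double *x, const double *y, long n)
    {
        double s = 0.0;
        for (long i = 0; i < n; i++)
            s += x[i] * y[i];
        return s;
    }

    int main(void)
    {
        double x[1024], y[1024];
        for (int i = 0; i < 1024; i++) { x[i] = 1.0; y[i] = 0.5; }
        printf("%f\n", dot(x, y, 1024));
        return 0;
    }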

Tavia Stone Gibbins

2005

W. P. Baird, W. Bethel, J. Carter, C. Siegerist, T. Stone, M. Wehner, "TRI Data Storm", Proceedings of SC'05, November 2005,

Rollin Thomas

2010

Keith Jackson, Lavanya Ramakrishnan, Karl Runge, and Rollin Thomas, "Seeking Supernovae in the Clouds: A Performance Study", HPDC '10: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, ACM, June 2010, 421–429, doi: 10.1145/1851476.1851538

Best Paper, ScienceCloud 2010

Today, our picture of the Universe radically differs from that of just over a decade ago. We now know that the Universe is not only expanding as Hubble discovered in 1929, but that the rate of expansion is accelerating, propelled by mysterious new physics dubbed "Dark Energy." This revolutionary discovery was made by comparing the brightness of nearby Type Ia supernovae (which exploded in the past billion years) to that of much more distant ones (from up to seven billion years ago). The reliability of this comparison hinges upon a very detailed understanding of the physics of the nearby events. As part of its effort to further this understanding, the Nearby Supernova Factory (SNfactory) relies upon a complex pipeline of serial processes that execute various image processing algorithms in parallel on ~10 TB of data.

This pipeline has traditionally been run on a local cluster. Cloud computing offers many features that make it an attractive alternative. The ability to completely control the software environment in a Cloud is appealing when dealing with a community developed science pipeline with many unique library and platform requirements. In this context we study the feasibility of porting the SNfactory pipeline to the Amazon Web Services environment. Specifically we: describe the tool set we developed to manage a virtual cluster on Amazon EC2, explore the various design options available for application data placement, and offer detailed performance results and lessons learned from each of the above design options.

2008

Sarah S. Poon, Rollin C. Thomas, Cecilia R. Aragon, Brian Lee, "Context-linked virtual assistants for distributed teams: an astrophysics case study", CSCW '08: Proceedings of the 2008 ACM conference on Computer supported cooperative work, ACM, November 8, 2008, 361–370, doi: 10.1145/1460563.1460623

Best Paper Honorable Mention, CSCW'08

There is a growing need for distributed teams to analyze complex and dynamic data streams and make critical decisions under time pressure. Via a case study, we discuss potential guidelines for the design of software tools to facilitate such collaborative decision-making. We introduce the term context-linked to characterize systems where both task and context information are included in a shared space. We describe a novel, lightweight, context-linked event notification/virtual assistant system developed to aid a cross-cultural, geographically distributed team of astrophysicists to remotely maneuver a custom-built instrument under challenging operational conditions, where critical decisions must be made in as little as 45 seconds. The system has been in use since 2005 by a major international astrophysics collaboration. We describe the design and implementation of the event notification system and then present a case study, based on event log analysis and user interviews, of its effectiveness in substantially improving user performance during time-critical science tasks. Finally, we discuss the implications of context linking for supporting common ground in distributed teams.

David P. Turner

2014

David Turner, NERSC, Accounts and Allocations, February 3, 2014,

David Turner, NERSC, NERSC Computing Environment, February 3, 2014,

David Turner, NERSC, NERSC File Systems and How to Use Them, February 3, 2014,

Mike L. Welcome

2011

D. Hazen, J. Hick, W. Hurlbert, M. Welcome, Media Information Record (MIR) Analysis, LTUG 2011, April 19, 2011,

Presentation of Storage Systems Group findings from a year-long effort to collect and analyze Media Information Record (MIR) statistics from our in-production Oracle enterprise tape drives at NERSC. We provide information on the data collected and some highlights from our analysis. The presentation is primarily intended to show that the information in the MIR can help users and customers better operate and manage their tape environments.

Cary Whitney

2014

K. Antypas, B.A Austin, T.L. Butler, R.A. Gerber, C.L Whitney, N.J. Wright, W. Yang, Z Zhao, "NERSC Workload Analysis on Hopper", Report, October 17, 2014, LBNL 6804E,

Nicholas Wright

2016

C.S. Daley, D. Ghoshal, G.K. Lockwood, S. Dosanjh, L. Ramakrishnan, N.J. Wright, "Performance Characterization of Scientific Workflows for the Optimal Use of Burst Buffers", Workflows in Support of Large-Scale Science (WORKS-2016), CEUR-WS.org, 2016, 1800:69-73,

Shane Snyder, Philip Carns, Kevin Harms, Robert Ross, Glenn K. Lockwood, Nicholas J. Wright, "Modular HPC I/O characterization with Darshan", Proceedings of the 5th Workshop on Extreme-Scale Programming Tools (ESPT'16), Salt Lake City, UT, November 13, 2016, 9-17, doi: 10.1109/ESPT.2016.9

Contemporary high-performance computing (HPC) applications encompass a broad range of distinct I/O strategies and are often executed on a number of different compute platforms in their lifetime. These large-scale HPC platforms employ increasingly complex I/O subsystems to provide a suitable level of I/O performance to applications. Tuning I/O workloads for such a system is nontrivial, and the results generally are not portable to other HPC systems. I/O profiling tools can help to address this challenge, but most existing tools only instrument specific components within the I/O subsystem that provide a limited perspective on I/O performance. The increasing diversity of scientific applications and computing platforms calls for greater flexibility and scope in I/O characterization.

In this work, we consider how the I/O profiling tool Darshan can be improved to allow for more flexible, comprehensive instrumentation of current and future HPC I/O workloads. We evaluate the performance and scalability of our design to ensure that it is lightweight enough for full-time deployment on production HPC systems. We also present two case studies illustrating how a more comprehensive instrumentation of application I/O workloads can enable insights into I/O behavior that were not previously possible. Our results indicate that Darshan's modular instrumentation methods can provide valuable feedback to both users and system administrators, while imposing negligible overheads on user applications.
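
Because Darshan instruments the I/O layers transparently, no application changes are needed; a plain MPI-IO program like the sketch below (file name and sizes are illustrative) is enough for its POSIX and MPI-IO modules to record counters when the library is linked or preloaded:

    /* Each rank writes a small record to a disjoint region of one shared file. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        char buf[64];
        int len = snprintf(buf, sizeof buf, "rank %d\n", rank);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "darshan_demo.out",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_write_at(fh, (MPI_Offset)rank * (MPI_Offset)sizeof buf,
                          buf, len, MPI_CHAR, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        MPI_Finalize();
        return 0;
    }

On systems where Darshan is deployed, the per-job log this produces can then be summarized with tools such as darshan-parser.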

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, others, "Accelerating science with the NERSC burst buffer early user program", Cray User Group, May 11, 2016, LBNL LBNL-1005736,

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

2015

C.S. Daley, L. Ramakrishnan, S. Dosanjh, N.J. Wright, "Analyses of Scientific Workflows for Effective Use of Future Architectures", The 6th International Workshop on Big Data Analytics: Challenges, and Opportunities (BDAC-15), 2015,

N.J. Wright, S. S. Dosanjh, A. K. Andrews, K. Antypas, B. Draney, R.S. Canon, S. Cholia, C.S. Daley, K. M. Fagnan, R.A. Gerber, L. Gerhardt, L. Pezzaglia, Prabhat, K.H. Schafer, J. Srinivasan, "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82, June 2015, doi: 10.3233/978-1-61499-583-8-82

Extreme data science is becoming increasingly important at the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC). Many petabytes of data are transferred from experimental facilities to NERSC each year. Applications of importance include high-energy physics, materials science, genomics, and climate modeling, with an increasing emphasis on large-scale simulations and data analysis. In response to the emerging data-intensive workloads of its users, NERSC made a number of critical design choices to enhance the usability of its pre-exascale supercomputer, Cori, which is scheduled to be delivered in 2016. These data enhancements include a data partition, a layer of NVRAM for accelerating I/O, user defined images and a customizable gateway for accelerating connections to remote experimental facilities.
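
One concrete way applications reach the NVRAM layer described above is through per-job burst buffer allocations. The sketch below assumes the documented Slurm/DataWarp convention in which a #DW jobdw directive in the batch script requests the space and the DW_JOB_STRIPED environment variable carries the mount point; the directive, capacity, and file name here are illustrative:

    /* Write a checkpoint into a per-job DataWarp allocation, assuming a
     * batch script containing a directive such as:
     *   #DW jobdw capacity=200GB access_mode=striped type=scratch
     */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *bb = getenv("DW_JOB_STRIPED");  /* burst buffer mount point */
        if (!bb) {
            fprintf(stderr, "no burst buffer allocation; using current dir\n");
            bb = ".";
        }

        char path[4096];
        snprintf(path, sizeof path, "%s/checkpoint.dat", bb);

        FILE *f = fopen(path, "wb");
        if (!f) { perror("fopen"); return 1; }
        double state[1024] = {0};                   /* placeholder checkpoint data */
        fwrite(state, sizeof(double), 1024, f);
        fclose(f);
        return 0;
    }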

2014

Yu Jung Lo, Samuel Williams, Brian Van Straalen, Terry J. Ligocki, Matthew J. Cordery, Nicholas J. Wright, Mary W. Hall, Leonid Oliker, "Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis", SC'14, November 16, 2014,

Brian Austin, Nicholas Wright, "Measurement and interpretation of microbenchmark and application energy use on the Cray XC30", Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing, November 2014,

K. Antypas, B.A Austin, T.L. Butler, R.A. Gerber, C.L Whitney, N.J. Wright, W. Yang, Z Zhao, "NERSC Workload Analysis on Hopper", Report, October 17, 2014, LBNL 6804E,

M. J. Cordery, B. Austin, H. J. Wasserman, C. S. Daley, N. J. Wright, S. D. Hammond, D. Doerfler, "Analysis of Cray XC30 Performance using Trinity-NERSC-8 benchmarks and comparison with Cray XE6 and IBM BG/Q", High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation (PMBS 2013). Lecture Notes in Computer Science, Volume 8551, October 1, 2014,

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)

2013

Hongzhang Shan, Brian Austin, Wibe De Jong, Leonid Oliker, Nicholas Wright, Edoardo Apra, "Performance Tuning of Fock Matrix and Two-Electron Integral Calculations for NWChem on Leading HPC Platforms", SC'13, November 11, 2013,

Zhengji Zhao, Katie Antypas, Nicholas J Wright, "Effects of Hyper-Threading on the NERSC workload on Edison", 2013 Cray User Group Meeting, May 9, 2013,

Brian Austin, Matthew Cordery, Harvey Wasserman, Nicholas J. Wright, "Performance Measurements of the NERSC Cray Cascade System", 2013 Cray User Group Meeting, May 9, 2013,

Nick Wright, NERSC Initiative: Preparing Applications for Exascale, February 12, 2013,

Andrew Uselton, Nicholas J. Wright, "A file system utilization metric for I/O characterization", 2013 Cray User Group Conference, Napa, CA, 2013,

Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane Canon, Anping Liu, Devarshi Ghoshal, Krishna Muriki, Nicholas J. Wright, "Magellan - A Testbed to Explore Cloud Computing for Science", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: 2013)

Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane Canon, Anping Liu, Devarshi Ghoshal, Krishna Muriki and Nicholas J. Wright, "CAMP", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: January 1, 2013)

2012

Hongzhang Shan, Brian Austin, Nicholas Wright, Erich Strohmaier, John Shalf, Katherine Yelick, "Accelerating Applications at Scale Using One-Sided Communication", The 6th Conference on Partitioned Global Address Programming Models, Santa Barbara, CA, October 10, 2012,

Hongzhang Shan, Nicholas J. Wright, John Shalf, Katherine Yelick, Marcus Wagner, Nathan Wichmann, "A preliminary evaluation of the hardware acceleration of Cray Gemini interconnect for PGAS languages and comparison with MPI", SIGMETRICS Performance Evaluation Review, 2012, 40:92-98,

Lavanya Ramakrishnan, Richard Shane Canon, Krishna Muriki, Iwona Sakrejda, Nicholas J. Wright, "Evaluating Interconnect and Virtualization Performance for High Performance Computing", SIGMETRICS Performance Evaluation Review, 2012, 40:55-60,

2011

Lavanya Ramakrishnan, Richard Shane Canon, Krishna Muriki, Iwona Sakrejda, and Nicholas J. Wright., "Evaluating Interconnect and Virtualization Performance for High Performance Computing", Proceedings of 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS11), 2011,

In this paper we detail benchmarking results that characterize the virtualization overhead and its impact on performance. We also examine the performance of various interconnect technologies with a view to understanding the performance impacts of the various choices. Our results show that virtualization can have a significant impact upon performance, with at least a 60% performance penalty, and that less capable interconnect technologies can significantly reduce the performance of typical HPC applications. We also evaluate the performance of the Amazon Cluster Compute instance type and show that it performs approximately equivalently to a 10G Ethernet cluster at low core counts.
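
Interconnect comparisons of this kind typically start from a ping-pong microbenchmark; a minimal sketch (iteration count and message size are illustrative) is shown below. Sweeping the message size upward turns the same loop into a bandwidth test, which together with latency exposes the differences between virtualized, Ethernet, and native HPC interconnects:

    /* Measure one-way small-message latency between ranks 0 and 1. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int iters = 1000;
        char byte = 0;
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();
        if (rank == 0)
            printf("one-way latency ~ %.2f us\n",
                   (t1 - t0) / (2.0 * iters) * 1e6);
        MPI_Finalize();
        return 0;
    }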

Zhengji Zhao and Nick Wright, "Performance of Density Functional Theory codes on Cray XE6", A paper presented at the Cray User Group meeting, May 23-26, 2011, Fairbanks, Alaska, May 24, 2011,

Zhengji Zhao and Nick Wright, Performance of Density Functional Theory codes on Cray XE6, A talk at the Cray User Group meeting 2011, May 23-26, 2011, Fairbanks, Alaska, May 23, 2011,

K. Furlinger, N.J. Wright, D. Skinner, "Comprehensive Performance Monitoring for GPU Cluster Systems", Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011 IEEE International Symposium on, 2011, 1377--1386,

Praveen Narayanan, Alice Koniges, Leonid Oliker, Robert Preissl, Samuel Williams, Nicholas J Wright, Maxim Umansky, Xueqiao Xu, Benjamin Dudson, Stephane Ethier, Weixing Wang, Jeff Candy, John R. Cary, "Performance Characterization for Fusion Co-design Applications", Proceedings of CUG, 2011,

2010

Neal Master, Matthew Andrews, Jason Hick, Shane Canon, Nicholas J. Wright, "Performance Analysis of Commodity and Enterprise Class Flash Devices", Petascale Data Storage Workshop (PDSW), November 2010,

Hongzhang Shan, Haoqiang Jin, Karl Fuerlinger, Alice Koniges, Nicholas J Wright, "Analyzing the effect of different programming models upon performance and memory usage on cray xt5 platforms", CUG2010, Edinburgh, Scotland, 2010,

Keith R. Jackson, Lavanya Ramakrishnan, Krishna Muriki, Shane Canon, Shreyas Cholia, John Shalf, Harvey J. Wasserman, Nicholas J. Wright, "Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud", CloudCom, January 1, 2010, 159-168,

Andrew Uselton, Mark Howison, Nicholas J. Wright, David Skinner, Noel Keen, John Shalf, Karen L. Karavanic, Leonid Oliker, "Parallel I/O performance: From events to ensembles", IPDPS, 2010, 1-11,

Karl Fürlinger, Nicholas J. Wright, David Skinner, "Effective Performance Measurement at Petascale Using IPM", ICPADS, January 1, 2010, 373-380,

K. Fuerlinger, N.J. Wright, D. Skinner, "Performance analysis and workload characterization with IPM", Tools for High Performance Computing 2009, January 1, 2010, 31--38,

K. Fuerlinger, N.J. Wright, D. Skinner, C. Klausecker, D. Kranzlmueller, "Effective Holistic Performance Measurement at Petascale Using IPM", Competence in High Performance Computing 2010, January 1, 2010, 15--26,

2009

B. R. de Supinski, S. Alam, D. H. Bailey, L. Carrington, C. Daley, A. Dubey, T. Gamblin, D. Gunter, P. D. Hovland, H. Jagode, K. Karavanic, G. Marin, J. Mellor-Crummey, S. Moore, B. Norris, L. Oliker, C. Olschanowsky, P. C. Roth, M. Schulz, S. Shende, A. Snavely, W. Spear, M. Tikir, J. Vetter, P. Worley, N. Wright, "Modeling the Office of Science ten year facilities plan: The PERI Architecture Tiger Team", Journal of Physics: Conference Series, 2009, 180:012039,

N.J. Wright, S. Smallen, C.M. Olschanowsky, J. Hayes, A. Snavely, "Measuring and Understanding Variation in Benchmark Performance", DoD High Performance Computing Modernization Program Users Group Conference (HPCMP-UGC), 2009, 2009, 438 -443,

2008

Wayne Pfeiffer, Nicholas J. Wright, "Modeling and predicting application performance on parallel computers using HPC challenge benchmarks", IPDPS, 2008, 1-12,

2007

John Michalakes, Josh Hacker, Richard Loft, Michael O. McCracken, Allan Snavely, Nicholas J. Wright, Thomas E. Spelce, Brent C. Gorda, Robert Walkup, "WRF nature run", SC, 2007, 59,

Charlene J. Yang

2016

C. Yang, S. Yazar, G. Gooden, and A. Hewitt, "Data-Driven Workflows on Crays with Hybrid Scheduling: A Case Study of Celera on Magnus", Supercomputing Conference (SC'16), November 2016,

2015

C. J. Yang, C. Harris, S. Young, and G. Morahan, "Adapting Genome-Wide Association Workflows for HPC Processing at Pawsey", Supercomputing Conference (SC'15), November 2015,

C. Yang, Programming for Bioinformatics on the Supercomputer Magnus at Pawsey, 11th GeneMappers Conference, November 2015,

C. Yang, Managing Bioinformatics Software Stack on Magnus, Pawsey Supercomputing Centre Bioinformatics Expo, April 2015,

2014

C. J. Yang, Q. Guo, D. Huang, and S. Nordholm, "Exploiting Cyclic Prefix for Joint Detection, Decoding and Channel Estimation in OFDM via EM Algorithm and Message Passing", IEEE International Conference on Communications (ICC'14), pp. 4674-4679, June 2014,

2013

C. J. Yang, Q. Guo, D. Huang, and S. Nordholm, "Enhanced Data Detection in OFDM Systems using Factor Graph", IEEE International Conference on Wireless Communications and Signal Processing (WCSP'13), pp. 1-5, October 2013,

C. J. Yang, Q. Guo, D. Huang, and S. Nordholm, "A Factor Graph Approach to Exploiting Cyclic Prefix for Equalization in OFDM systems", IEEE Transactions on Communications, vol. 61, no. 12, pp. 4972-4983, June 2013,

C. J. Yang, Q. Guo, D. Huang, and S. Nordholm, "Exploiting Cyclic Prefix in Turbo FDE Systems using Factor Graph", IEEE Wireless Communications and Networking Conference (WCNC'13), pp. 2536-2541, April 2013,

Woo-Sun Yang

2015

Jack Deslippe, Brian Austin, Chris Daley, Woo-Sun Yang, "Lessons learned from optimizing science kernels for Intel's "Knights-Corner" architecture", CISE, April 1, 2015,

2014

K. Antypas, B.A Austin, T.L. Butler, R.A. Gerber, C.L Whitney, N.J. Wright, W. Yang, Z Zhao, "NERSC Workload Analysis on Hopper", Report, October 17, 2014, LBNL 6804E,

Richard A. Gerber, Helen He, Woo-Sun Yang, Debugging and Optimization Tools, Presented at UC Berkeley CS267 class, February 19, 2014,

Woo-Sun Yang, Debugging Tools, February 3, 2014,

2013

Woo-Sun Yang, Debugging and Performance Analysis Tools at NERSC, BOUT++ 2013 Workshop, September 3, 2013,

2010

Wendy Hwa-Chun Lin, Yun (Helen) He, and Woo-Sun Yang, "Franklin Job Completion Analysis", Cray User Group 2010 Proceedings, Edinburgh, UK, May 2010,

The NERSC Cray XT4 machine Franklin has been in production for 3000+ users since October 2007, with about 1,800 jobs running each day. There has been an ongoing effort to better understand how well these jobs run, whether failed jobs are due to application errors or system issues, and to further reduce system-related job failures. In this paper, we describe the progress we have made in tracking job completion status, in identifying the root causes of job failures, and in expediting the resolution of job failures, such as hung jobs, that are caused by system issues. In addition, we present some Cray software design enhancements we requested to help us track application progress and identify errors.

Yun (Helen) He, Wendy Hwa-Chun Lin, and Woo-Sun Yang, Franklin Job Completion Analysis, Cray User Group Meeting 2010, May 2010,

Yushu Yao

2016

Alex Gittens, Jey Kottalam, Jiyan Yang, Michael F Ringenburg, Jatin Chhugani, Evan Racah, Mohitdeep Singh, Yushu Yao, Curt Fischer, Oliver Ruebel, Benjamin Bowen, Norman Lewis, Michael W Mahoney, Venkat Krishnamurthy, Prabhat, "A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark", The 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, IPDPS, February 1, 2016,

2015

Huong Luu, Marianne Winslett, William Gropp, Kevin Harms, Phil Carns, Robert Ross, Yushu Yao, Suren Byna, Prabhat, "A Multi-platform Study of I/O Behavior on Petascale Supercomputers", HPDC 2015, June 9, 2015,

Alice Koniges, Shreyas Cholia, Prabhat, Yushu Yao, "Data and Workflow Solutions for Fusion using NERSC", DOE FES/ASCR Workshop on Integrated Simulations for Magnetic Fusion Energy Sciences, June 2, 2015,

Yushu Yao, SciDB @ NERSC, March 23, 2015,

2014

Yushu Yao, NERSC; Douglas Jacobsen, NERSC, Connecting to NERSC, NUG 2014, February 3, 2014,

2013

A. Koniges, R. Gerber, D. Skinner, Y. Yao, Y. He, D. Grote, J-L Vay, H. Kaiser, and T. Sterling, "Plasma Physics Simulations on Next Generation Platforms", 55th Annual Meeting of the APS Division of Plasma Physics, Volume 58, Number 16, November 11, 2013,

The current high-performance computing revolution provides an opportunity for major increases in computational power over the next several years, if it can be harnessed. This transition from simply increasing single-processor and network performance to different architectural paradigms forces application programmers to rethink the basic models of parallel programming from both the language and problem-division standpoints. One of the major computing facilities available to researchers in fusion energy is the National Energy Research Scientific Computing Center. As the mission computing center for the DOE Office of Science, NERSC is tasked with helping users to overcome the challenges of this revolution, both through the use of new parallel constructs and languages and by enabling a broader user community to take advantage of multi-core performance. We discuss the programming model challenges facing researchers in fusion and plasma physics for a variety of simulations ranging from particle-in-cell to fluid-gyrokinetic and MHD models.

Yushu Yao, NERSC Parallel Database Evaluation, February 12, 2013,

2012

Zhengji Zhao, Mike Davis, Katie Antypas, Yushu Yao, Rei Lee and Tina Butler, "Shared Library Performance on Hopper", A paper presented at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

Zhengji Zhao, Mike Davis, Katie Antypas, Yushu Yao, Rei Lee and Tina Butler, Shared Library Performance on Hopper, A talk at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,

Babak Behzad, Joseph Huchette, Huong Luu, Suren Byna, Yushu Yao, Prabhat, "Auto-Tuning of Parallel I/O Parameters for HDF5 Applications", SC, 2012,

Surendra Byna, Jerry Chou, Oliver Rübel, Prabhat, Homa Karimabadi, William S. Daughton, Vadim Roytershteyn, E. Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, Arie Shoshani, Andrew Uselton, Kesheng Wu, "Parallel Data, Analysis, and Visualization of a Trillion Particles", XLDB, 2012,

Mehmet Balman, Eric Pouyoul, Yushu Yao, Burlen Loring, E. Wes Bethel, Prabhat, John Shalf, Alex Sim, Brian L. Tierney, "Experiences with 100G Network Applications", Proceedings of the 5th International Workshop on Data Intensive and Distributed Computing (DIDC 2012), Delft, Netherlands, 2012,

Katherine A. Yelick

2015

Scott French, Yili Zheng, Barbara Romanowicz, Katherine Yelick, "Parallel Hessian Assembly for Seismic Waveform Inversion Using Global Updates", IEEE International Parallel & Distributed Processing Symposium (IPDPS) 2015, May 25, 2015, doi: 10.1109/IPDPS.2015.58

2013

Richard Gerber, Kathy Yelick, Lawrence Berkeley National Laboratory, "Data Requirements from NERSC Requirements Reviews", January 9, 2013,

2012

Alice Koniges, Katherine Yelick, Rolf Rabenseifner, Reinhold Bader, David Eder, Filip Blagojevic, Robert Preissl, Paul Hargrove, Introduction to PGAS (UPC and CAF) and Hybrid for Multicore Programming, SC12 Full Day Tutorial, November 2012,

Hongzhang Shan, Brian Austin, Nicholas Wright, Erich Strohmaier, John Shalf, Katherine Yelick, "Accelerating Applications at Scale Using One-Sided Communication", The 6th Conference on Partitioned Global Address Programming Models, Santa Barbara, CA, October 10, 2012,

NERSC Accomplishments and Plans, February 3, 2012,

Hongzhang Shan, Nicholas J. Wright, John Shalf, Katherine Yelick, Marcus Wagner, Nathan Wichmann, "A preliminary evaluation of the hardware acceleration of Cray Gemini interconnect for PGAS languages and comparison with MPI", SIGMETRICS Performance Evaluation Review, 2012, 40:92-98,

2011

Katherine Yelick, Susan Coghlan, Brent Draney, Richard Shane Canon, Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Anping Liu, Scott Campbell, Piotr T. Zbiegiel, Tina Declerck, Paul Rich, "The Magellan Report on Cloud Computing for Science", U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), December 2011,

Saman Amarasinghe, Mary Hall, Richard Lethin, Keshav Pingali, Dan Quinlan, Vivek Sarkar, John Shalf, Robert Lucas, Katherine Yelick, Pavan Balaji, Pedro C. Diniz, Alice Koniges, Marc Snir, Sonia R. Sachs, "Exascale Programming Challenges", 2011,

J. Dongarra, P. Beckman, T. Moore, P. Aerts, G. Aloisio, J.C. Andre, D. Barkai, J.Y. Berthou, T. Boku, B. Braunschweig, others, "The international exascale software project roadmap", International Journal of High Performance Computing Applications, January 2011, 25:3--60,

2010

Jack Dongarra, John Shalf, David Skinner, Kathy Yelick, "International Exascale Software Project (IESP) Roadmap, version 1.1", October 18, 2010,

K. Datta, S. Williams, V. Volkov, J. Carter, L. Oliker, J. Shalf, K. Yelick, "Auto-Tuning Stencil Computations on Diverse Multicore Architectures", Scientific Computing with Multicore and Accelerators, edited by Jakub Kurzak, David A. Bader, Jack Dongarra, 2010,

2009

Joseph Gebis, Leonid Oliker, John Shalf, Samuel Williams, Katherine A. Yelick, "Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture", ARCS, January 1, 2009, 146-158,

Kamesh Madduri, Samuel Williams, Stephane Ethier, Leonid Oliker, John Shalf, Erich Strohmaier, Katherine A. Yelick, "Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors", SC, January 1, 2009,

Samuel Williams, Jonathan Carter, Leonid Oliker, John Shalf, Katherine A. Yelick, "Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms", J. Parallel Distrib. Comput., January 1, 2009, 69:762-777,

2008

Sam Williams, Jonathan Carter, Leonid Oliker, John Shalf, Katherine Yelick, Lattice Boltzmann Simulation Optimization on Leading Multicore Architectures, January 1, 2008,

Sam Williams, Kaushik Datta, Jonathan Carter, Leonid Oliker, John Shalf, Katherine Yelick, David Bailey, "PERI - Auto-tuning Memory Intensive Kernels for Multicore", SciDAC: Scientific Discovery Through Advanced Computing, Journal of Physics: Conference Series, January 1, 2008,

Samuel Williams, Jonathan Carter, Leonid Oliker, John Shalf, Katherine A. Yelick, "Lattice Boltzmann simulation optimization on leading multicore platforms", IPDPS, January 1, 2008, 1-14,

Kaushik Datta, Mark Murphy, Vasily Volkov, Samuel Williams, Jonathan Carter, Leonid Oliker, David A. Patterson, John Shalf, Katherine A. Yelick, "Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures", SC, January 1, 2008, 4,

2007

S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, J. Demmel, Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms, International Conference for High-Performance Computing, Networking, Storage, and Analysis, January 1, 2007,

Samuel Williams, Leonid Oliker, Richard W. Vuduc, John Shalf, Katherine A. Yelick, James Demmel, "Optimization of sparse matrix-vector multiplication on emerging multicore platforms", SC, January 1, 2007, 38,

Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine A. Yelick, "Scientific Computing Kernels on the Cell Processor", International Journal of Parallel Programming, January 1, 2007, 35:263-298,

2006

Jonathan Carter, Tony Drummond, Parry Husbands, Paul Hargrove, Bill Kramer, Osni Marques, Esmond Ng, Lenny Oliker, John Shalf, David Skinner, Kathy Yelick, "Software Roadmap to Plug and Play Petaflop/s", Lawrence Berkeley National Laboratory Technical Report, #59999, July 31, 2006,

Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine A. Yelick, "The potential of the cell processor for scientific computing", Conf. Computing Frontiers, January 1, 2006, 9-20,

Shoaib Kamil, Kaushik Datta, Samuel Williams, Leonid Oliker, John Shalf, Katherine A. Yelick, "Implicit and explicit optimizations for stencil computations", Memory System Performance and Correctness, January 1, 2006, 51-60,

2005

John Shalf, John Bell, Andrew Canning, Lin-Wang Wang, Juan Meza, Rob Ryne, Ji Qiang, Kathy Yelick, "Berkeley Petascale Applications", January 1, 2005,

Horst D. Simon, William T. C. Kramer, David H. Bailey, Michael J. Banda, E. Wes Bethel, Jonathon T. Carter, James M. Craw, William J. Fortney, John A. Hules, Nancy L. Meyer, Juan C. Meza, Esmond G. Ng, Lynn E. Rippe, William C. Saphir, Francesca Verdier, Howard A. Walter, Katherine A. Yelick, "Science-Driven Computing: NERSC’s Plan for 2006–2010", LBNL Technical Report 57582, 2005,

Tarek El-Ghazawi, William Carlson, Thomas Sterling, Katherine Yelick, UPC: Distributed Shared-Memory Programming, January 1, 2005,

Shoaib Kamil, Parry Husbands, Leonid Oliker, John Shalf, Katherine A. Yelick, "Impact of modern memory subsystems on cache optimizations for stencil computations", Memory System Performance, January 1, 2005, 36-43,

R. Vuduc, J.W. Demmel, K.A. Yelick, "OSKI: A library of automatically tuned sparse matrix kernels", Journal of Physics: Conference Series, January 1, 2005, 16:521,

2004

Al Trivelpiece, Rupak Biswas, Jack Dongarra, Peter Paul, Katherine Yelick, Assessment of High-End Computing Research and Development in Japan, Available from http://www.wtec.org/reports.htm, January 1, 2004,

Xuefei Yuan

2013

Alice Koniges, Praveen Narayanan, Robert Preissl, Xuefei Yuan, Proxy Design and Optimization in Fusion and Accelerator Physics, SIAM Conference on Computational Science and Engineering, February 25, 2013,

Xuefei Yuan, Xiaoye S Li, Ichitaro Yamazaki, Stephen C Jardin, Alice E Koniges, David E Keyes, "Application of PDSLin to the magnetic reconnection problem", Computational Science & Discovery, 2013, 6:014002,

Zhengji Zhao

2016

Hongzhang Shan, Samuel Williams, Yili Zheng, Weiqun Zhang, Bei Wang, Stephane Ethier, Zhengji Zhao, "Experiences of Applying One-Sided Communication to Nearest-Neighbor Communication", PAW 2016 (http://conferences.computer.org/paw/2016/), Salt Lake City, Utah, November 14, 2016,

Zhengji Zhao and Martijn Marsman, "Estimating the Performance Impact of the MCDRAM on KNL Using Dual-Socket Ivy Bridge nodes on Cray XC30", Cray User Group 2016 (https://cug.org/), London, UK, May 11, 2016,

2015

Luther Martin and Zhengji Zhao, Optimization Strategies for Materials Science Applications on Cori: An Intel Knights Landing, Many Integrated Core Architecture, ACM Student Research Competition at SC15 (Luther Martin won a Silver Medal), http://sc15.supercomputing.org/, November 15, 2015,

Zhengji Zhao, Scott French, Jack Deslippe, Mathias Jacquelin, Brian Friesen, and Helen He, Application Readiness for NERSC Cori, Berkeley Lab CS Strategic Review 2015, September 9, 2015,

Zhengji Zhao, Using VASP at NERSC, NERSC User Training, Oakland, CA, June 5, 2015,

Zhengji Zhao, Intel Tools for Optimizations, A talk given at Cray Quarterly Business Review Meeting, St. Paul, MN, April 8, 2015,

Zhengji Zhao, A Burst Buffer Use Case, Cray Quarterly Business Review Meeting, Oakland, CA, January 28, 2015,

2014

Zhengji Zhao, Cori and NERSC Exascale Application Program (NESAP), NWChem Developers Workshop at Seattle, WA, October 28, 2014,

K. Antypas, B.A Austin, T.L. Butler, R.A. Gerber, C.L Whitney, N.J. Wright, W. Yang, Z Zhao, "NERSC Workload Analysis on Hopper", Report, October 17, 2014, LBNL 6804E,

Zhengji Zhao, Automatic Library Tracking Database at NERSC, ALTD Review Meeting at NERSC, Oakland CA, October 8, 2014,

Zhengji Zhao, Quality and Testing of PE releases, Cray Quarterly Business Meeting, Oakland CA, July 30, 2014,

Zhengji Zhao, Doug Petesch, David Knaak, and Tina Declerck, "I/O Performance on Cray XC30", Cray User Group Meeting, May 4, 2014,

Zhengji Zhao, Doug Petesch, David Knaak, and Tina Declerck, I/O Performance on Cray XC30, Cray User Group Meeting, May 4-8, 2014, Lugano, Switzerland, May 4, 2014,

Zhengji Zhao, NERSC, Best Practices for Best Performance on Edison, February 6, 2014,

Zhengji Zhao, NERSC, Available Software at NERSC, February 3, 2014,

2013

Richard A Gerber, Zhengji Zhao, NERSC Job Data, November 20, 2013,

Zhengji Zhao, Process and Thread Affinity with OpenMP, NERSC User Training on Edison Performance, Oakland, CA, October 10, 2013,

Zhengji Zhao, Edison Phase I Early User Science Results, A brown bag talk at NERSC, Oakland CA; Cray Quarterly Business Review Meeting, St. Paul, MN, July 22, 2013,

Zhengji Zhao, Katie Antypas, Nicholas J Wright, "Effects of Hyper-Threading on the NERSC workload on Edison", 2013 Cray User Group Meeting, May 9, 2013,

Richard A. Gerber, Tina Declerck, Zhengji Zhao, Edison Update, February 12, 2013,

Overview and update on the installation and configuration of Edison, NERSC's new Cray XC30 supercomputer.

Jack Deslippe, Zhengji Zhao, "Comparing Compiler and Library Performance in Material Science Applications on Edison", Paper, Proceedings of the Cray User Group 2013, 2013,

2012

Zhengji Zhao, Megan Bowling, and Jack Deslippe, Using Cray Compilers at NERSC - Usability and Performance, Cray Quarterly Business Review Meeting, Oakland, CA, July 25, 2012,