NERSC: Powering Scientific Discovery Since 1974

DOE High Performance Computing Operational Review (HPCOR)


Enabling Data-Driven Scientific Discovery at HPC Facilities

June 18-19, 2014
California State University, East Bay
Oakland Professional Development and Conference Center
Trans Bay Center
1000 Broadway, Suite 109
Oakland, CA


The DOE High Performance Computing Operational Review (HPCOR) covered processes and practices for delivering facilities and services that enable high performance data-driven scientific discovery at the DOE national laboratories.

The meeting was held in conjunction with the Joint Facilities User Forum on Data-Intensive Computing, a user-focused event held June 16-18, 2014. Experiences and lessons learned from recent data initiatives and systems deployed were reviewed to benefit the deployment of new capabilities. The result of this review is a written report, delivered to NNSA/ASC and SC/ASCR HQ, containing best practices and recommendations for supporting data-intensive science at DOE HPC centers. This meeting is one in a series of HPCOR meetings, the last of which was held in November 2013.

The subject experts attending this meeting were from the six DOE laboratories involved in collaborative procurements for the next generation of HPC systems, NASA, DoD, NOAA, and NSF-funded organizations. Supporting science in the era of Big Data presents new challenges for HPC centers; understanding the needs of scientists and the best practices of other laboratories will be beneficial to all.



High Performance Computing Operational Review: Enabling Data-Driven Scientific Discovery at DOE HPC Facilities


Science Drivers and Best Practices - June 17

(Oakland City Center Conference Center, Paramount Room, 500 12th Street, Suite 105, Oakland, CA)

Application scientists working in data-intensive fields will share their successes, challenges, and lessons learned. 

Breakout Sessions - June 18-19

Discussions are focused around eight topics, four per day running concurrently. Each session has a title topic (in bold) and a set of questions particular to that topic. These should be addressed in conjunction with the Generic Breakout Session Questions (see below). Session co-chairs can choose to emphasize certain questions over others. 

[Session Charge Presentation]

Day 1, June 18 (All 4 sessions run concurrently)

  1. System Configuration (Session D1SA, Room 2): What are the hardware characteristics of a good data analytics system, including compute and storage? What does it look like? Should an HPC system and a data system be the same or different? What percentage of resources should be allocated to compute vs. data/I-O? What storage technologies and tools are being used and which new ones are being considered?
    Participants (12): Andrew Cherry, Cory Lueninghoener, Curt Canada, Robin Goldstone, Jim Silva, Jason Hick, Nick Wright, Clay England, Chris Beggio, Bob Ballance, Glenn Lockwood, Eli Dart. Co-Chairs: Jason Hick, Clay England
  2. Visualization/in-situ analysis (Session D1SB, Room 3): What is needed to support in-situ analysis and visualization from hardware, software, and support perspectives? What visualization facilities and capabilities do you support for both local and remote users?
    Participants (10): Kevin Harms, Laura Monroe, Bob Kares, Ming Jiang, Jeff Long, Wes Bethel, Prabhat, Doug Fuller, David Karelitz, Andy Wilson. Co-Chairs: Prabhat, David Karelitz. Scribe: Laura Monroe.
  3. Data Management Policies (Session D1SC, Room 1): What facilities and policies are in place for data retention and access?  What are the challenges and possible solutions?  Will centers be part of scientists' Data Management Plans? If so, how?  What standards for data repositories and archives are in place and which ones do you plan to support? How is access to the broader community provided? How do you balance storage costs with data retention and access policies? How is data management planned?
    Participants (10): Bill Allcock, Dee Magnoni, Kyle Lamb, Mark Gary, Sasha Ames, Jeff Broughton, Julia White, Ashley Barker, Bill Collins, Joel Stevenson. Co-Chairs: Julia White, Bill Allcock.
  4. Supporting Data-Producing Facilities and Instruments (Session D1SD, Room 4): What is your center doing to support data, and the analysis of data, from light sources, accelerators, satellites, etc.?
    Participants (12):  Yao Zhang, Kaki Kelly, Jeff Cunningham, David Smith, Jack Deslippe, Shreyas Cholia, David Skinner, John Harney, Stuart Campbell, Rudy Garcia, Craig Ulmer, Ilana Stern. Co-Chairs: David Skinner, Stuart Campbell.

Day 2, June 19 (All 4 sessions run concurrently)

  1. Infrastructure (Session D2SA, Room 2): What supporting infrastructure is needed to enable data-driven science? Which of these play primary roles in supporting your center's data-driven science: networking between resources, shared or local disk, archival storage, science data gateways or portals, consulting, and databases?
    Participants (12):  Venkat Vishwanath, Bill Allcock, Kyle Lamb, Robin Goldstone, Jim Silva, Jay Srinivasan, Jason Hick, Clay England, Doug Fuller, Chris Beggio, Craig Ulmer,  Ilana Stern. Co-Chairs: Robin Goldstone, Chris Beggio.
  2. User Training (Session D2SB, Room 3): What are effective methods for training users in data-driven science, covering visualization, tools, algorithms, workflows, and I/O? Who and where are the experts who provide training? Are they in the technical systems groups, the services group, vendors, or users?
    Participants (11):  Richard Coffey, Dee Magnoni, Blaise Barney, Tim Fahey, Kjiersten Fagnan, Richard Gerber, Fernanda Foertter, Ashley Barker, Karen Haskell, Bob Ballance, Glenn Lockwood. Co-Chairs: Fernanda Foertter, Tim Fahey
  3. Workflows (Session D2SC, Room 1): What is being used? What works well and what is missing? What infrastructure and support are required?
    Participants (12):  Kevin Harms, Yao Zhang, Laura Monroe, Kaki Kelly, Sasha Ames, Jeff Long, Shreyas Cholia, Prabhat, Norbert Podhorszki, John Harney, Andy Wilson, Rudy Garcia. Co-Chairs: Shreyas Cholia, Kaki Kelly. 
  4. Data Transfer (Session D2SD, Room 4): What WAN access is in place and what is needed? How do you handle data transfers in/out of your facility today (e.g. do facility staff conduct transfers or do users, what hardware/software do users use)?  What drives the need for networking and what will HPC centers need to do to accommodate that need?
    Participants (10): Andrew Cherry, Cory Lueninghoener, Curt Canada, Mark Gary, David Smith, Brent Draney, Chris Fuson, Bill Collins, Joel Stevenson, Eli Dart. Co-Chairs: Eli Dart, Andrew Cherry.

Generic Breakout Session Questions

Please address these generic questions in addition to, or integrated with, the session-specific questions listed above.

    • What are your major strategies and initiatives over the next 5-10 years? How do they affect staffing levels?
    • What are your current efforts and/or site configuration in this area?
    • What are your mandates and constraints?
    • How do you forecast future needs and requirements?
    • What are the biggest challenges and biggest gaps between what you can do today and what will be required in 5-10 years?
    • What opportunities exist for productive collaborations among DOE HPC centers?
    • Describe some best practices that you think are effective, as well as lessons learned that would be helpful to other centers.


Wednesday, June 18

 8:00  Registration & Refreshments
 8:20  Welcome  Sudip Dosanjh, NERSC Director  Room 2
 8:30  The View from DOE  Barbara Helland, DOE ASCR Facilities Division Director  Room 2
 9:00  Data Ingest and Export from DOE HPC Facilities: What is Reasonable and Expected  Eli Dart, ESnet  Room 2
 9:30  An Elephant Sat on my HPC Cluster!  Robin Goldstone, LLNL  Room 2
10:00  Schedule and Charge to Breakout Sessions  Richard Gerber, NERSC  Room 2
10:10  BREAK
10:30  A.M. Breakout Sessions  D1SA - Room 2, D1SB - Room 3, D1SC - Room 1, D1SD - Room 4
12:00  Lunch on your own
 1:30  P.M. Breakout Sessions  D1SA - Room 2, D1SB - Room 3, D1SC - Room 1, D1SD - Room 4
 2:30  BREAK
 3:00  Report from Breakout Sessions  Room 2
 5:00  Adjourn
 6:00  NERSC Machine Room Tour

Thursday, June 19

 8:00  Registration & Refreshments
 8:30  Q & A  Laura Biven, Senior Science and Technology Advisor, Office of the Deputy Director for Science Programs, DOE Office of Science  Room 2
 9:00  A.M. Breakout Sessions  D2SA - Room 2, D2SB - Room 3, D2SC - Room 1, D2SD - Room 4
10:00  BREAK
10:30  A.M. Breakout Sessions Continue  D2SA - Room 2, D2SB - Room 3, D2SC - Room 1, D2SD - Room 4
12:00  Lunch on your own
 1:30  Report from Breakout Sessions  Room 2
 3:30  BREAK
 3:45  Meeting Wrapup and Report Instructions  Room 2
 4:00  Adjourn


Organizing Committee

  • Ashley Barker, Oak Ridge Leadership Computing Facility
  • Rob Cunningham, Los Alamos National Laboratory
  • Kjiersten Fagnan, NERSC
  • Fernanda Foertter, Oak Ridge Leadership Computing Facility
  • Richard Gerber, NERSC
  • Kevin Harms, Argonne Leadership Computing Facility
  • Jerry Shoopman, Lawrence Livermore National Laboratory
  • Karen Haskell, Sandia National Laboratories

Accommodations and Logistics

The meeting will be located in downtown Oakland, CA, approximately 10 blocks from NERSC's Oakland Scientific Facility.

Accommodations (Please make your arrangements ASAP!)

There is no registration fee for the HPCOR meeting.

Confirmed Attendees

DOE Office of Advanced Scientific Computing Research

Name                Notes
Laura Biven         Senior Science and Technology Advisor, Office of the Deputy Director for Science Programs, DOE Office of Science
Barbara Helland     DOE ASCR Facilities Division Director; Floater
Lucy Nowell         DOE ASCR Data and Visualization; Floater
David Goodwin       DOE ASCR, NERSC Program Manager; Floater

Argonne Leadership Computing Facility

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Kevin Harms         D1SB                       D2SC
Susan Coghlan                                                            Floater
Andrew Cherry       D1SA                       D2SD
Bill Allcock        D1SC                       D2SA
Richard Coffey                                 D2SB
Venkat Vishwanath                              D2SA
Michael Papka                                                            Floater
Yao Zhang           D1SD                       D2SC

Los Alamos Advanced Simulation and Computing

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Cory Lueninghoener  D1SA                       D2SD
Kyle Lamb           D1SC                       D2SA
Laura Monroe        D1SB                       D2SC
Curt Canada         D1SA                       D2SD
Bob Kares           D1SB
Kaki Kelly          D1SD                       D2SC
Dee Magnoni         D1SC                       D2SB

Lawrence Livermore Advanced Simulation and Computing

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Mark Gary           D1SC                       D2SD
Robin Goldstone     D1SA                       D2SA
Jeff Cunningham     D1SD
David Smith         D1SD                       D2SD
Sasha Ames          D1SC                       D2SC
Ming Jiang          D1SB
Blaise Barney                                  D2SB
Jeff Long           D1SB                       D2SC
Tim Fahey                                      D2SB
Jim Silva           D1SA                       D2SA


NERSC

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Sudip Dosanjh       Floater                    Floater
Richard Gerber                                 D2SB
Kjiersten Fagnan                               D2SB
David Skinner       D1SD
Jason Hick          D1SA                       D2SA
Nick Wright         D1SA
Chris Daly
Wes Bethel          D1SB
Jeff Broughton      D1SC
Jay Srinivasan                                 D2SA
Shreyas Cholia      D1SD                       D2SC
Katie Antypas                                                            Floater
Prabhat             D1SB                       D2SC
Brent Draney                                   D2SD
Jack Deslippe       D1SD

Oak Ridge Leadership Computing Facility

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Ashley Barker       D1SC                       D2SB
Fernanda Foertter                              D2SB
Chris Fuson                                    D2SD
Norbert Podhorszki                             D2SC
Doug Fuller         D1SB                       D2SA
John Harney         D1SD                       D2SC
Clay England        D1SA                       D2SA
Stuart Campbell     D1SD
Julia White         D1SC

Sandia Advanced Simulation and Computing

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Karen Haskell                                  D2SB
Bill Collins        D1SC                       D2SD
Joel Stevenson      D1SC                       D2SD
Bob Ballance        D1SA                       D2SB
Chris Beggio        D1SA                       D2SA
David Karelitz      D1SB
Andy Wilson         D1SB                       D2SC
Rudy Garcia         D1SD                       D2SC
Craig Ulmer         D1SD                       D2SA
Dino Pavlakos       Floater                    Floater
John Noe            Floater                    Floater

Energy Sciences Network

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Eli Dart            D1SA                       D2SD

San Diego Supercomputer Center

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Glenn Lockwood      D1SA                       D2SB

National Center for Atmospheric Research

Name                Day 1: Wednesday, June 18  Day 2: Thursday, June 19  Notes
Ilana Stern         D1SD                       D2SA


  • HPCOR-Data-2014.pdf | Adobe Acrobat PDF file
    DOE High Performance Computing Operational Review: Enabling Data-Driven Discovery at DOE High Performance Computing Facilities