NERSC: Powering Scientific Discovery Since 1974

Joint Facilities User Forum on Data-Intensive Computing


June 16-18, 2014

Oakland City Center Conference Center
500 12th Street, Suite 105
Oakland, CA

Directions and Site Brochure

Held in conjunction with DOE HPC Operational Review (HPCOR) June 17-19, 2014

The Joint Facilities User Forum on Data-Intensive Computing will bring together users and HPC center staff to discuss successes, failures, lessons learned, and the future of data-driven scientific discovery. There will also be a day of practical user training.

The meeting is being organized by NERSC at Lawrence Berkeley National Lab (LBNL), the DOE Leadership Computing Facilities at Argonne (ALCF) and Oak Ridge National Labs (OLCF), Sandia National Lab, Lawrence Livermore National Lab (LLNL), and Los Alamos National Lab (LANL). Other participating facilities include Pacific Northwest National Laboratory (PNNL), the University Corporation for Atmospheric Research and the National Center for Atmospheric Research (NCAR), the Energy Sciences Network (ESnet), and the San Diego Supercomputer Center (SDSC).

The meeting is organized into three days:

Monday, June 16: Advances in Managing, Analyzing, and Visualizing Data
Attendees will have an opportunity to learn about the latest research, techniques, and software related to data management and analysis. Researchers from the national labs, universities, and industry will share the latest efforts and ideas.
Tuesday, June 17: Successes, Failures, Best Practices, and Lessons Learned (Joint with HPCOR)
Scientists who have implemented data-intensive workflows will talk about what worked, what didn't work, and what is needed to support their science into the future.
Wednesday, June 18: Working With Data: Practical Ways to Get Things Done 
HPC Center staff will provide a day of training aimed at the user who wants to know how to best deal with data today: I/O best practices, moving and transferring data, using visualization tools, etc.

The DOE HPC Operational Review is an invitation-only event that meets jointly with this forum on June 17 and runs separately on June 18-19, 2014.

Monday, June 16

Advances in Managing, Analyzing, and Visualizing Data


Talk ID

8:15   Registration and Refreshments  
    Session Chair Richard Gerber, NERSC
8:50   Welcome Richard Gerber, NERSC Senior Advisor
9:00 M01 Welcome and Meeting Overview Laura Biven, Senior Science and Technology Advisor, DOE Office of the Deputy Director for Science Programs
9:30 M02 The Future of Data and Scientific Computing Workflows  Mike Wilde, ANL
10:00  M03 The Future of Large Scale Visual Data Analysis  Wes Bethel, LBNL
10:30   BREAK  
11:00 M04 Machine learning for data-driven discovery  Sreenivas Sukumar, ORNL
11:30 M05 In situ Visualization with the Sierra Simulation Framework Using ParaView Catalyst  Tom Otahal, Sandia National Lab

Lunch: on your own, or join a topical lunch group:

Scientific Computing Workflows
Large Scale Visualization
Data Storage and Data Movement
Structuring Data for High Performance I/O

    Session Chair Fernanda Foertter, OLCF
1:30 M06 ESnet plans for supporting WAN data movement  Eli Dart, ESnet
2:00 M07  The HPC data center of the future: What hardware and software technologies are on the horizon?  Jason Hick, NERSC
2:30 M08 Using databases for analysis of scientific data (e.g. SciDB)   Yushu Yao, NERSC
3:00   BREAK  
3:30 M09 Globus for Data Management Rachana Ananthakrishnan, ANL
4:00 M10 Panel: 20 Minutes Into Our Future: Near-term technology panel discussion between facility operations, applications developer, and users  Bill Allcock, ANL; Kerstin Kleese van Dam, PNNL; Jason Hick, NERSC; Dula Parkinson, Beamline Scientist, ALS/LBNL
5:00   Adjourn  


  Tour of the NERSC facility  Jeff Broughton


Tuesday, June 17

Successes, Failures, Best Practices, and Lessons Learned

Time  Talk ID  Topic  Speaker


  Registration and Refreshments  


  Welcome  Katie Antypas, NERSC Services Department Head


  Session Chair Richard Coffey, ANL


T01 Best Practices for Scientific Data Management  Kerstin Kleese van Dam, PNNL


T02 Design Patterns in Web Gateways for Scientific Data  Shreyas Cholia, NERSC



Data Management for Climate Science: Lessons Learned and Future Needs  Ilana Stern, NCAR



Open Panel Discussion Kerstin Kleese van Dam, Shreyas Cholia, Ilana Stern




  Session Chair David Skinner, NERSC


T04 Changing the Way Light Sources Manage Data: Challenges, Lessons Learned, and Future Directions Jack Deslippe, NERSC


T05 The Materials Project: computing and sharing a searchable database of materials properties using the Fireworks job management system Anubhav Jain, LBNL


T06 Accelerating Scientific Discovery at the Spallation Neutron Source Stuart Campbell, ORNL SNS
11:40   Open Panel Discussion Jack Deslippe, Anubhav Jain, Stuart Campbell


  Lunch: on your own  


  Session Chair Ashley Barker, OLCF


T07 Tactical High Throughput Computing at NERSC and Beyond: The qdo workflow tool LDRD at Berkeley Lab Stephen Bailey, LBNL


T08 ACME - Accelerated Climate Modeling for Energy John Harney, OLCF


T09 Lessons Learned and Future Needs for JGI and NERSC Kjiersten Fagnan, NERSC/JGI


  Open Panel Discussion Stephen Bailey, John Harney, Kjiersten Fagnan




  Session Chair Karen Haskell, Sandia


T10 Big Data at San Diego Supercomputer Center Glenn Lockwood, SDSC


T11 Trillion Particles, 120,000 cores, and 350 TBs: Lessons Learned from a Hero I/O Run  Prabhat, LBNL


T12 Experiences with Data Parallel Frameworks and plans for the future Sreenivas R. Sukumar, ORNL


T13 Data Needs for LCLS-II Amedeo Perazzo, SLAC


  Open Panel Discussions  




  Tour of the NERSC Facility  Jeff Broughton, Dave Paul

Wednesday, June 18

Working With Data - Practical Ways to Get Things Done


Talk ID  Topic  Speaker


   Registration and Refreshments  


  Session Chair Jini Ramprakash, ANL


W01  Parallel File Systems at HPC Centers: Usage, Experiences, and Recommendations  Richard Hedges, LLNL; Bill Allcock, ALCF; David Turner, NERSC


W02  Best practices for performing large-scale I/O  Venkat Vishwanath, ANL




W03  Transferring large data sets over the WAN. (Includes using data transfer nodes and tools at the national labs)  Chris Fuson, ORNL; Bill Collins, Sandia; Eli Dart, ESnet


W04  Using Globus for Data Transfer and Sharing Rachana Ananthakrishnan, ANL


   Lunch on your own  


  Session Chair Blaise Barney, LLNL


W05  Archival storage and best practices for using the HPSS archival storage system  Nick Balthaser, NERSC and Lisa Gerhardt, NERSC


W06  Using visualization tools - VisIt, ParaView, EnSight  Sean Ahern, ORNL; Ken Moreland, Sandia; Bob Kares, LANL


W07  How to share data with your collaborators and the public (including portals)  Shreyas Cholia, NERSC




W08 I/O libraries and APIs: HDF5, netCDF, MPI I/O, POSIX - which to use?  Rob Latham, ANL and Quincey Koziol, The HDF Group


W09 Using ADIOS  Norbert Podhorszki, ORNL




  Tour of the NERSC Facility  Jack Deslippe


ALCF - Argonne Leadership Computing Facility
ALS - Advanced Light Source
ANL - Argonne National Laboratory
JGI - Joint Genome Institute
LANL - Los Alamos National Laboratory
LBNL - Lawrence Berkeley National Laboratory
LCLS - Linac Coherent Light Source
LLNL - Lawrence Livermore National Laboratory
NERSC - National Energy Research Scientific Computing Center
OLCF - Oak Ridge Leadership Computing Facility
ORNL - Oak Ridge National Laboratory
PNNL - Pacific Northwest National Laboratory
SLAC - SLAC National Accelerator Laboratory
SNL - Sandia National Laboratories
SNS - Spallation Neutron Source