NERSC: Powering Scientific Discovery Since 1974

Joint Facilities User Forum on Data-Intensive Computing


June 16-18, 2014

Oakland City Center Conference Center
500 12th Street, Suite 105
Oakland, CA

Directions and Site Brochure

Held in conjunction with DOE HPC Operational Review (HPCOR) June 17-19, 2014

The Joint Facilities User Forum on Data-Intensive Computing will bring together users and HPC center staff to discuss successes, failures, lessons learned, and the future of data-driven scientific discovery. There will also be a day of practical user training.

The meeting is being organized by NERSC at Lawrence Berkeley National Lab (LBNL), the DOE Leadership Computing Facilities at Argonne (ALCF) and Oak Ridge National Labs (OLCF), Sandia National Lab, Lawrence Livermore National Lab (LLNL), and Los Alamos National Lab (LANL). Other participating facilities include Pacific Northwest National Laboratory (PNNL), the University Corporation for Atmospheric Research and the National Center for Atmospheric Research (NCAR), the Energy Sciences Network (ESnet), and the San Diego Supercomputer Center (SDSC).

The meeting is organized into three days:

Monday, June 16: Advances in Managing, Analyzing, and Visualizing Data
Attendees will have an opportunity to learn about the latest research, techniques, and software related to data management and analysis. Researchers from the national labs, universities, and industry will share the latest efforts and ideas.
Tuesday, June 17: Successes, Failures, Best Practices, and Lessons Learned (Joint with HPCOR)
Scientists who have implemented data-intensive workflows will talk about what worked, what didn't work, and what is needed to support their science into the future.
Wednesday, June 18: Working With Data: Practical Ways to Get Things Done 
HPC Center staff will provide a day of training aimed at the user who wants to know how to best deal with data today: I/O best practices, moving and transferring data, using visualization tools, etc.

The DOE HPC Operational Review is a by-invitation-only event that is held jointly with this meeting on June 17 and runs separately on June 18-19, 2014.

Monday, June 16

Advances in Managing, Analyzing, and Visualizing Data

Time  Talk ID  Topic  Speaker
8:15   Registration and Refreshments  
    Session Chair Richard Gerber, NERSC
8:50   Welcome Richard Gerber, NERSC Senior Advisor
9:00 M01 Welcome and Meeting Overview Laura Biven, Senior Science and Technology Advisor, DOE Office of the Deputy Director for Science Programs
9:30 M02 The Future of Data and Scientific Computing Workflows  Mike Wilde, ANL
10:00  M03 The Future of Large Scale Visual Data Analysis  Wes Bethel, LBNL
10:30   BREAK  
11:00 M04 Machine learning for data-driven discovery  Sreenivas Sukumar, ORNL
11:30 M05 In situ Visualization with the Sierra Simulation Framework Using ParaView Catalyst  Tom Otahal, Sandia National Lab
12:00   Lunch: on your own, or join a topical lunch group:

Scientific Computing Workflows
Large Scale Visualization
Data Storage and Data Movement
Structuring Data for High Performance I/O

 
    Session Chair Fernanda Foertter, OLCF
1:30 M06 ESnet plans for supporting WAN data movement  Eli Dart, ESnet
2:00 M07 The HPC data center of the future: What hardware and software technologies are on the horizon?  Jason Hick, NERSC
2:30 M08 Using databases for analysis of scientific data (e.g. SciDB)   Yushu Yao, NERSC
3:00   BREAK  
3:30 M09 Globus for Data Management Rachana Ananthakrishnan, ANL
4:00 M10 Panel: 20 Minutes Into Our Future: Near-term technology panel discussion between facility operations, application developers, and users  Bill Allcock, ANL; Kerstin Kleese van Dam, PNNL; Jason Hick, NERSC; Dula Parkinson, Beamline Scientist, ALS/LBNL
5:00   Adjourn  

5:00-6:00   Tour of the NERSC facility  Jeff Broughton

 

Tuesday, June 17

Successes, Failures, Best Practices, and Lessons Learned

Time  Talk ID  Topic  Speaker
8:15   Registration and Refreshments
8:50   Welcome  Katie Antypas, NERSC Services Department Head
    Session Chair Richard Coffey, ANL
9:00 T01 Best Practices for Scientific Data Management  Kerstin Kleese van Dam, PNNL
9:20 T02 Design Patterns in Web Gateways for Scientific Data  Shreyas Cholia, NERSC
9:40 T03 Data Management for Climate Science: Lessons Learned and Future Needs  Ilana Stern, NCAR
10:00   Open Panel Discussion  Kerstin Kleese van Dam, Shreyas Cholia, Ilana Stern
10:25   BREAK
    Session Chair David Skinner, NERSC
10:40 T04 Changing the Way Light Sources Manage Data: Challenges, Lessons Learned, and Future Directions  Jack Deslippe, NERSC
11:00 T05 The Materials Project: computing and sharing a searchable database of materials properties using the Fireworks job management system  Anubhav Jain, LBNL
11:20 T06 Accelerating Scientific Discovery at the Spallation Neutron Source  Stuart Campbell, ORNL SNS
11:40   Open Panel Discussion  Jack Deslippe, Anubhav Jain, Stuart Campbell
12:00   Lunch: on your own
    Session Chair Ashley Barker, OLCF
1:30 T07 Tactical High Throughput Computing at NERSC and Beyond: The qdo workflow tool LDRD at Berkeley Lab  Stephen Bailey, LBNL
1:50 T08 ACME - Accelerated Climate Modeling for Energy  John Harney, OLCF
2:10 T09 Lessons Learned and Future Needs for JGI and NERSC  Kjiersten Fagnan, NERSC/JGI
2:30   Open Panel Discussion  Stephen Bailey, John Harney, Kjiersten Fagnan
2:50   BREAK
    Session Chair Karen Haskell, Sandia
3:10 T10 Big Data at San Diego Supercomputer Center  Glenn Lockwood, SDSC
3:30 T11 Trillion Particles, 120,000 cores, and 350 TBs: Lessons Learned from a Hero I/O Run  Prabhat, LBNL
3:50 T12 Experiences with Data Parallel Frameworks and plans for the future  Sreenivas R. Sukumar, ORNL
4:10 T13 Data Needs for LCLS-II  Amedeo Perazzo, SLAC
4:30   Open Panel Discussions
5:00   Adjourn
5:00-6:00   Tour of the NERSC Facility  Jeff Broughton, Dave Paul

Wednesday, June 18

Working With Data - Practical Ways to Get Things Done

Time  Talk ID  Topic  Speaker
8:15   Registration and Refreshments
    Session Chair Jini Ramprakash, ANL
9:00 W01 Parallel File Systems at HPC Centers: Usage, Experiences, and Recommendations  Richard Hedges, LLNL; Bill Allcock, ALCF; David Turner, NERSC
10:00 W02 Best practices for performing large-scale I/O  Venkat Vishwanath, ANL
10:30   BREAK
10:45 W03 Transferring large data sets over the WAN. (Includes using data transfer nodes and tools at the national labs)  Chris Fuson, ORNL; Bill Collins, Sandia; Eli Dart, ESnet
11:30 W04 Using Globus for Data Transfer and Sharing  Rachana Ananthakrishnan, ANL
12:00   Lunch: on your own
    Session Chair Blaise Barney, LLNL
1:30 W05 Archival storage and best practices for using the HPSS archival storage system  Nick Balthaser, NERSC and Lisa Gerhardt, NERSC
2:00 W06 Using visualization tools - VisIt, ParaView, EnSight  Sean Ahern, ORNL; Ken Moreland, Sandia; Bob Kares, LANL
3:00 W07 How to share data with your collaborators and the public (including portals)  Shreyas Cholia, NERSC
3:30   BREAK
3:45 W08 I/O libraries and APIs: HDF5, netCDF, MPI I/O, POSIX - which to use?  Rob Latham, ANL and Quincey Koziol, The HDF Group
4:30 W09 Using ADIOS  Norbert Podhorszki, ORNL
5:00   Adjourn
5:00-6:00   Tour of the NERSC Facility  Jack Deslippe

Affiliations

ALCF - Argonne Leadership Computing Facility
ALS - Advanced Light Source
ANL - Argonne National Laboratory
JGI - Joint Genome Institute
LANL - Los Alamos National Laboratory
LBNL - Lawrence Berkeley National Laboratory
LCLS - Linac Coherent Light Source
LLNL - Lawrence Livermore National Laboratory
NERSC - National Energy Research Scientific Computing Center
OLCF - Oak Ridge Leadership Computing Facility
ORNL - Oak Ridge National Laboratory
PNNL - Pacific Northwest National Laboratory
SLAC - SLAC National Accelerator Laboratory
SNL - Sandia National Laboratories
SNS - Spallation Neutron Source