NERSC: Powering Scientific Discovery Since 1974

Big Data Center

Mission

The Big Data Center (BDC) within the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL) is focused on developing a production-level big data software stack that can be used to solve leading scientific challenges at the full scale of NERSC’s largest supercomputer, Cori. The BDC will bring together existing open source big data analytics and scientific data management software into a single software distribution. Researchers will consider all levels of the stack: real capability science applications, algorithms, key computational and structural motifs, runtimes, and optimized libraries.

The BDC software distribution will fill gaps in the performance and functionality of its component packages in order to support running exemplar scientific applications that process ~100 TB datasets on ~100,000 cores on Cori, with a few applications targeting 1 PB datasets at the full scale of the system. Development, testing, and packaging of the BDC software distribution take place both at NERSC and at collaborating institutions, such as the University of California, Berkeley; Oxford University; the University of Montreal; and the HDF Group. The BDC will operate for 3 years, producing an improved software distribution at the end of each year, and will reach final production-ready status at the end of the center’s lifetime.

For more information about Big Data Center activities, please reach out to Prabhat (prabhat@lbl.gov), Quincey (koziol@lbl.gov), or Karthik (kkashinath@lbl.gov).

Leadership Team

Prabhat, NERSC

Quincey Koziol, NERSC

Victor Lee, Intel

Mike Ringenburg and Ted Slater, Cray

Research Staff

NERSC: Karthik Kashinath, Deborah Bard, Wahid Bhimji, Lisa Gerhardt, Thorsten Kurth, Jialin Liu

Intel: Nalini Kumar, Amrita Mathuriya, Lei Shao

Cray: Kristyn Maschhoff, Peter Mendrygal, Aaron Vose

IPCCs (Intel Parallel Computing Centers)

Frank Wood, Gunes Baydin (University of Oxford)

Jeffrey Regier, Jon McAuliffe (University of California, Berkeley)

Adam Rupe, Ryan James, Jim Crutchfield (University of California, Davis)

Nick Choma, Joan Bruna (New York University)

Grzegorz Muszynski, Vitaliy Kurlin (University of Liverpool)

Upcoming Events 

Past Events and Publications

Big Data Summit 2018

Ongoing Projects

ClimateNet

Software

TensorFlow

Caffe

HDF5