NERSC Launches Data-intensive Science Pilot Program

DOE Researchers Eligible to Apply for Resources, Expertise

April 12, 2012


NERSC's new data-intensive science pilot program is aimed at helping scientists capture, analyze and store the increasing stream of scientific data coming out of experiments, simulations and instruments, such as the Advanced Light Source (domed building in photo) at Berkeley Lab.

The Department of Energy’s (DOE) National Energy Research Scientific Computing Center (NERSC) is launching a new initiative to support DOE-relevant, data-intensive science pilot projects for up to 18 months.

“NERSC has long understood the importance of data-intensive science and has supported the analysis of data streams from telescopes, detectors, and sequencers in addition to data coming from simulations run at NERSC,” said Kathy Yelick, associate laboratory director for Computing Sciences at Berkeley Lab. “DOE has unique data challenges arising from their large experimental facilities.”

While all applications will be considered, NERSC is particularly interested in supporting experiments that are generating data at rates beyond their current analysis capabilities. “Many of these fields are generating data at increasing rates and struggling to marshal sufficient resources to realize the full potential from new instruments and detectors,” said Shane Canon, who heads NERSC’s Technology Integration Group. “This effort is aimed at addressing that gap.”

Those selected for the pilot program will get access to large data stores (up to 1 petabyte of disk and tape storage), priority access to a 6-terabyte flash-based file system (with 15 gigabits per second transfer speeds), and priority access to Hadoop-style computing resources on NERSC’s Carver InfiniBand cluster (with access to a 1 TB memory node). They may also use NERSC’s Science Gateways for web access. (Full details are available in the call for applications.)

“We’ve seen overall science data traffic growing at a rate of 70 percent per year since 1990. We expect the trend to continue, and even accelerate, in coming years,” said Greg Bell, acting director of the DOE’s Energy Sciences Network (ESnet). “The challenge of getting that data from instruments to analysis—or even from scientist to scientist—shouldn’t be underestimated.”
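To give a sense of what sustained 70 percent annual growth implies, the short Python sketch below compounds that rate over time. It is an illustration only: it assumes the rate held constant every year from 1990 through 2012, which is an idealization, and the function names are hypothetical.

```python
import math


def growth_factor(annual_rate: float, years: int) -> float:
    """Total multiplication in traffic after compounding annual_rate for `years` years."""
    return (1.0 + annual_rate) ** years


def doubling_time_years(annual_rate: float) -> float:
    """Years needed for traffic to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


if __name__ == "__main__":
    rate = 0.70          # 70 percent per year, as quoted in the article
    span = 2012 - 1990   # assumed span: 1990 to the article's publication year
    print(f"Growth over {span} years: about {growth_factor(rate, span):,.0f}x")
    print(f"Doubling time: about {doubling_time_years(rate):.2f} years")
```

Under these assumptions, traffic doubles roughly every 16 months and grows by a factor of more than 100,000 over the 22-year span.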

Data transfers to and from NERSC can take advantage of the center’s high-speed connectivity to ESnet. Currently running at speeds of up to 20 gigabits per second (Gbps), NERSC-to-ESnet connectivity will be upgraded to 100 Gbps by the end of 2012. In addition to an array of storage, computing and data-transfer resources, each awarded project will be assigned a staff member to support and advocate for it within NERSC.
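As a rough illustration of what the link upgrade means in practice, the Python sketch below estimates how long it would take to move a full 1-petabyte allocation at each link speed. It is a best-case estimate under stated assumptions: 1 petabyte is taken as 10^15 bytes, the link is assumed to be fully dedicated to the transfer, and protocol overhead and storage bottlenecks are ignored. The function name is hypothetical.

```python
def transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Ideal transfer time in seconds at the full line rate (no overhead assumed)."""
    bits = size_bytes * 8            # convert bytes to bits
    bits_per_second = link_gbps * 1e9
    return bits / bits_per_second


if __name__ == "__main__":
    one_petabyte = 1e15              # assumption: decimal petabyte (10^15 bytes)
    for gbps in (20, 100):           # current and planned link speeds from the article
        days = transfer_seconds(one_petabyte, gbps) / 86400
        print(f"1 PB at {gbps} Gbps: about {days:.2f} days")
```

Under these idealized assumptions, moving 1 petabyte takes about 4.6 days at 20 Gbps and a little under one day at 100 Gbps. Real transfers would take longer.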

About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is the primary high-performance computing facility for scientific research sponsored by the U.S. Department of Energy's Office of Science. Located at Lawrence Berkeley National Laboratory, the NERSC Center serves more than 4,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a U.S. Department of Energy national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. DOE Office of Science.