Data Intensive Computing Pilot Program
In 2014, NERSC is conducting its second and last round of allocations to projects in data intensive science. This pilot aims to support and enable scientists to tackle their most demanding data intensive challenges. Selected projects will pilot new methods and technologies targeting data management, analysis, and dissemination. Projects should demonstrate one or more of these characteristics:
- Novel uses of the HPSS archival storage system in data analysis workflows, including techniques that leverage both tape and disk.
- Projects with high-throughput computing challenges.
- Projects that generate data products that are then shared with a larger community through science gateways.
- Projects that propose new ways to synthesize multiple data sets or observational and simulation data to glean new insights.
- Software development that enhances the data management strategies available to the NERSC user community, which are listed at Data Management Strategies and Policies.
Successful applicants will have access to one or more of the following resources:
- Up to 1 PB of storage (up to 200 TB of Project Disk Storage and 800 TB of Archival Storage)
- Up to 10 million core hours on the Carver InfiniBand Cluster (equivalent to 15 million NERSC MPP hours)
- Priority access to a Hadoop cluster
- Reserved access to dedicated data transfer resources that are well connected to the 100 Gb network
- Priority access to a 1 TB memory node on Carver
- Science Gateways for providing access to data and results through the web
- Access to Relational and Schema-less databases
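For applicants budgeting a request, the figures above imply a simple ratio: 10 million Carver core hours correspond to 15 million NERSC MPP hours, and the 200 TB disk and 800 TB archive limits sum to 1 PB. The sketch below only restates that arithmetic. It assumes the ratio scales linearly, and the function and constant names are illustrative, not part of any NERSC tool.

```python
# Ratio implied by the allocation above:
# 15 million NERSC MPP hours per 10 million Carver core hours.
# Assumption: the conversion is linear for smaller requests.
MPP_HOURS_PER_CARVER_CORE_HOUR = 15_000_000 / 10_000_000


def carver_core_hours_to_mpp_hours(core_hours: float) -> float:
    """Convert Carver core hours to NERSC MPP hours using the implied ratio."""
    return core_hours * MPP_HOURS_PER_CARVER_CORE_HOUR


def total_storage_tb(project_disk_tb: float, archival_tb: float) -> float:
    """Sum project disk and archival storage, both in terabytes."""
    return project_disk_tb + archival_tb


if __name__ == "__main__":
    # A hypothetical request for 2 million Carver core hours.
    print(carver_core_hours_to_mpp_hours(2_000_000))
    # The maximum storage award: 200 TB disk plus 800 TB archive.
    print(total_storage_tb(200, 800))
```

With the maximum values from the list, the storage total is 1000 TB, which is 1 PB in decimal units.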
If applicants need assistance with tuning their local network infrastructure to more effectively move data in and out of NERSC, ESnet maintains a comprehensive knowledge base (http://fasterdata.es.net/) that includes user-friendly resources for improving data transfer performance.
The deadline to apply was December 10, 2013.
The pace of data-driven scientific discovery is accelerating in genomics, astronomy, nuclear physics, and many other areas of science. New instruments and detectors are generating data at staggering rates, and scientists are struggling to marshal sufficient resources to analyze the output. This new NERSC effort aims to address that gap.
While all applications will be considered, NERSC is particularly interested in supporting experimentally driven projects that are generating data at a rate that outpaces their ability to analyze it. Applicants should describe how they will make use of these unique resources and how access to these capabilities will advance their scientific progress.
NERSC will assign a staff member to support each approved project. This staff member will assist in porting applications to the NERSC environment, provide expert advice, and act as an advocate for the project within NERSC.
Successful applicants will be required to provide quarterly reports detailing their progress and any scientific advancements made as a result of the award. An underlying objective of this pilot program is to demonstrate the level of impact and scientific breakthroughs that are made possible when scientists have access to these types of innovative systems and services.