Storage & File Systems
We recognize the importance of being able to manage your files and data. Scientific datasets are growing rapidly, and there is an increasing need for large-scale data storage and high-performance data transfers. As science becomes more distributed, organizing and sharing data becomes increasingly important.
This section covers several topics on how to successfully manage your data at NERSC.
- NERSC Data Management Policies This page provides some of the information that Principal Investigators can use when writing the Data Management section of their research proposals. NERSC provides its users with the means to store, manage, and share their research data products. We provide a variety of storage resources optimized for different phases of the data lifecycle; tools to enable users to manage, protect, and control their data; high-speed networks for intra-site and inter-site (ESnet) data transfer; gateways and portals for publishing data for broad consumption; and consulting services to help users craft efficient data management processes for their projects.
- I/O Formats I/O continues to be one of the main bottlenecks for scientific applications. This page describes the HDF5 and NetCDF software.
- NERSC Data Storage Resources This page compares the various file systems at NERSC in terms of availability per machine, purging, quota limits, and other key characteristics.
- Data Transfer Nodes The data transfer nodes are NERSC servers dedicated to performing transfers between NERSC data storage resources such as HPSS and the NERSC Global Filesystem (NGF), and storage resources at other sites.
- HPSS Data Archive HPSS, the High Performance Storage System, is the NERSC system you should use to back up your files to prevent data loss from accidental deletion and file purging.
- I/O Resources for Scientific Applications at NERSC NERSC provides a range of online resources to assist users developing, deploying, understanding, and tuning their scientific I/O workloads, supplemented by direct support from the NERSC Consultants and the Data Analytics Group. Here, we provide a consolidated summary of these resources, along with pointers to relevant online documentation.
- Using the Burst Buffer, a layer of SSD storage within Cori
- Science Database Services NERSC supports the provisioning of databases to hold large scientific datasets. Currently we support MySQL, PostgreSQL, MongoDB, and SciDB (experimental).
- Sharing Data Information on how to share data across NERSC systems, with other users within NERSC, or between NERSC and systems elsewhere.
- Transferring Data Data can be transferred to and from NERSC using Globus Online, GridFTP, scp, sftp, bbcp, and HPSS tools. NERSC also provides an easy way for research teams to share data via the web from their project directories.
- Unix File Permissions and Groups Overview of Unix file permissions
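
As a small illustration of the Unix file permissions topic listed above, the following Python sketch sets a file so that its owner can read and write it, members of its group can only read it, and all other users have no access (mode 640). It uses only the Python standard library; the function names `describe_mode` and `share_with_group` are hypothetical helpers chosen for this example and are not NERSC tools. Treat it as a sketch of the general Unix mechanism, not as NERSC's recommended procedure.

```python
import os
import stat
import tempfile


def describe_mode(path):
    """Return the permission string for path, in the style of `ls -l`."""
    return stat.filemode(os.stat(path).st_mode)


def share_with_group(path):
    """Set mode 640: owner read/write, group read-only, no access for others."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)


if __name__ == "__main__":
    # Demonstrate on a throwaway file so that no real data is touched.
    with tempfile.NamedTemporaryFile() as tmp:
        share_with_group(tmp.name)
        print(tmp.name, describe_mode(tmp.name))
```

Whether group members can actually reach the file also depends on the permissions of every directory along its path, so the containing directories need at least group execute permission.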