Advice and Overview
NERSC provides many facilities for storing data and performing analysis. However, transferring data, whether over the wide area network or within NERSC, can be expensive and time consuming. This page explains the mechanisms NERSC provides to move your data from one place to another. A good strategy, once your data is resident at NERSC, is to perform your analysis in situ rather than transferring the data elsewhere for analysis. NERSC consultants can help you formulate plans for efficient data management.
Data can be transferred to and from NERSC using Globus Online, GridFTP, scp, sftp, bbcp, and HPSS tools. NERSC also provides an easy way for research teams to share data via the web from their project directories.
To facilitate WAN data transfers, NERSC provides dedicated Data Transfer Nodes, which are optimized for bandwidth and have direct access to most of the NERSC file systems. File transfer bandwidth is also optimized for transferring files between ORNL and NERSC.
NERSC File Systems
NERSC has a number of shared file systems that are available from all computers: Project, Home, and Global Scratch. These file systems are ideal for sharing data among different platforms. In addition, Hopper and Edison have large, high-speed local scratch file systems. Please refer to NERSC file systems for details.
Data Transfer Nodes
The Data Transfer Nodes (DTNs) are servers dedicated to data transfer. DTNs have access to most of the NERSC file systems and are tuned to transfer data efficiently. They are also tuned for transferring large data files between NERSC and Oak Ridge or Argonne National Laboratories.
External Data Transfer
There are a number of ways to transfer data to and from NERSC.
- SCP/SFTP: for smaller files (<1GB).
- Globus Online: for large files, with auto-tuning and automatic fault recovery; no client install is required
- BaBar Copy (bbcp): for large files
- GridFTP: for large files
- HSI: can be an efficient way to transfer files already in the HPSS system
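The list above amounts to a simple rule of thumb based on file size. A minimal sketch of that rule follows. The function name `suggest_external_tools`, its parameters, and the choice of decimal gigabytes are assumptions made for illustration; they are not part of any NERSC tool.

```python
# Hypothetical helper, not a NERSC utility: encodes the rule of thumb above.
# Assumption: "1GB" means 10**9 bytes; the page does not say which definition it uses.
ONE_GB = 10**9


def suggest_external_tools(size_bytes, already_in_hpss=False):
    """Return candidate tools for moving a file to or from NERSC."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")

    if size_bytes < ONE_GB:
        # Smaller files: SCP/SFTP is sufficient.
        tools = ["SCP/SFTP"]
    else:
        # Large files: any of the high-throughput options.
        tools = ["Globus Online", "bbcp", "GridFTP"]

    if already_in_hpss:
        # HSI can be efficient when the file already lives in HPSS.
        tools.insert(0, "HSI")
    return tools
```

For example, `suggest_external_tools(500 * 10**6)` suggests only SCP/SFTP, while a 5 GB file yields the three large-file tools.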
Transferring Data Within NERSC
- Do you need to transfer at all? If your data is on NERSC Global File Systems, it's available at high performance center-wide. No data transfer is necessary if files are in project, global homes or global scratch because these file systems are mounted on almost all NERSC systems.
- Use the Unix command "cp" to copy files within the same computational system.
- To transfer files between computational systems (e.g. Edison local scratch to Hopper local scratch), use SCP/SFTP to transfer smaller files (<10GB), and BaBar Copy (bbcp) or GridFTP for bigger files.
- HPSS can also be used as an intermediary to transfer files within NERSC. For example, you can upload data to HPSS from Hopper $SCRATCH and retrieve it from HPSS on Edison $GSCRATCH. For details about HPSS data transfer, see Storing and Retrieving HPSS Data.
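The guidance above can be read as a short decision procedure. The sketch below is one way to write it down. The function name `suggest_internal_method`, the file system labels, and the use of decimal gigabytes are all assumptions for illustration. It also treats the global file systems as mounted everywhere, whereas the text says only "almost all" systems.

```python
# Hypothetical helper, not a NERSC utility: encodes the guidance above.
# Assumption: these labels stand for the project, global homes and global scratch file systems.
GLOBAL_FILE_SYSTEMS = {"project", "global homes", "global scratch"}
# Assumption: "10GB" means 10 * 10**9 bytes.
TEN_GB = 10 * 10**9


def suggest_internal_method(src_system, dst_system, file_system, size_bytes):
    """Suggest how to move a file between two NERSC systems."""
    if file_system in GLOBAL_FILE_SYSTEMS:
        # Global file systems are already visible center-wide (on almost all systems).
        return "no transfer needed"
    if src_system == dst_system:
        # Same computational system: a plain copy is enough.
        return "cp"
    if size_bytes < TEN_GB:
        return "SCP/SFTP"
    # Larger files between systems; HPSS as an intermediary is a further option.
    return "bbcp or GridFTP"
```

For instance, `suggest_internal_method("edison", "hopper", "local scratch", 2 * 10**9)` returns `"SCP/SFTP"`, while the same call with a 50 GB file returns `"bbcp or GridFTP"`.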