Data Transfer Nodes
The data transfer nodes are NERSC servers dedicated to performing transfers between NERSC data storage resources, such as HPSS and the NERSC Global Filesystem (NGF), and storage resources at other sites, including the Leadership Computing Facility at ORNL (Oak Ridge National Laboratory). These nodes are managed and monitored for performance as part of a collaborative effort between ESnet, NERSC, and ORNL to enable high-performance data movement over the high-bandwidth 10 Gb ESnet wide-area network (WAN).
To keep the data transfer nodes performing optimally, we request that users restrict interactive use of these systems to tasks that prepare data for transfer or are directly related to data transfer. Examples of intended usage include running Python scripts to download data from a remote source, running client software to load data from a file system into a remote database, and compressing (gzip) or bundling (tar) files in preparation for data transfer. Examples of what should not be done include running a database on the server to do data processing, or running tests that saturate the nodes' resources. The goal is to maintain data transfer systems that have adequate memory and CPU available for interactive user data transfers.
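As an illustration of the bundling step, here is a minimal Python sketch that packs a directory into a single gzip-compressed tar archive before transfer. The function name bundle_directory and any paths passed to it are hypothetical; the sketch relies only on the standard-library tarfile module.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tar archive at archive_path."""
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")
    # "w:gz" writes a tar stream compressed with gzip (tar + gzip in one step).
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores only the directory's own name inside the archive,
        # rather than its full path on this system.
        archive.add(source, arcname=source.name)
    return Path(archive_path)
```

A call such as bundle_directory("run_output", "run_output.tar.gz") (placeholder names) would produce one file to transfer instead of many small ones, which generally suits wide-area transfers better.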
There are four data transfer nodes deployed at NERSC:
These machines each have four 10-gigabit Ethernet links for transfers over the network and two FDR InfiniBand links for the filesystem. Third-party software is managed via modules. As with other NERSC systems, shell configuration files ("dot files") are under the control of NERSC; users should only modify ".ext" files.
Oak Ridge has deployed similar data transfer nodes named dtn01.ccs.ornl.gov through dtn04.ccs.ornl.gov. See Data Transfer Nodes at Oak Ridge.
All NERSC users are automatically given access to the data transfer nodes. The nodes support both interactive use via SSH (direct login) and data transfer using GridFTP services.
Available File Systems
The NERSC data transfer nodes use global home and the project directories (note that currently scratch is not mounted on the DTNs). See NERSC File Systems. Please note that /tmp is very small. Although certain common tools (e.g., vi) use /tmp for temporary storage, users should never explicitly use /tmp for data.
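Because /tmp is small, scripts that need working space for staging can point their temporary files at a larger file system instead. The following is a minimal Python sketch, assuming a base directory of your own choosing (for example, somewhere under a project directory); the function name make_staging_dir and the prefix are hypothetical. It uses the dir argument of the standard-library tempfile.mkdtemp.

```python
import tempfile
from pathlib import Path


def make_staging_dir(base_dir, prefix="dtn_staging_"):
    """Create a uniquely named staging directory under base_dir, not /tmp."""
    base = Path(base_dir)
    # Create the base directory if it does not exist yet.
    base.mkdir(parents=True, exist_ok=True)
    # dir= overrides the default temporary location (normally /tmp).
    return Path(tempfile.mkdtemp(prefix=prefix, dir=str(base)))
```

The caller is responsible for removing the staging directory once the transfer has finished, for instance with shutil.rmtree.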
File Transfer Software
For smaller files, you can use Secure Copy (SCP) or Secure FTP (SFTP) to transfer files between two hosts. See Using SCP and SFTP at NERSC. For larger files, we recommend using Globus Online. It makes GridFTP transfers trivial, so users do not have to learn command-line options for manual performance tuning. Globus Online also does automatic performance tuning and has been shown to perform comparably to, and in some cases better than, expert-tuned GridFTP. See Grid Data Transfer for details. You can also use the bbcp package to transfer large files; see Using BBCP at NERSC.
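For the small-file case, an SCP transfer can also be driven from a script. The sketch below is one possible approach, not a NERSC-provided tool: it builds the scp argument list and runs it with the standard-library subprocess module. The function names, user name, host, and paths are placeholders, and it assumes the OpenSSH scp client is available on the PATH.

```python
import subprocess


def build_scp_command(local_path, user, host, remote_path):
    """Return the argument list for copying one local file to a remote host."""
    # scp addresses a remote target as user@host:path.
    return ["scp", local_path, f"{user}@{host}:{remote_path}"]


def copy_file(local_path, user, host, remote_path):
    """Run scp and raise CalledProcessError if the transfer fails."""
    command = build_scp_command(local_path, user, host, remote_path)
    subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems with the local file name.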
Special commands are used for data transfers to or from the HPSS mass storage system. See Storing and Retrieving HPSS Data.