National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory

NERSC Data Transfer Nodes

The data transfer nodes are NERSC servers dedicated to transferring data between NERSC storage resources, such as HPSS and the NERSC Global Filesystem (NGF), and the storage resources of other sites, including the Leadership Computing Facility at Oak Ridge National Laboratory (ORNL). The nodes are managed and monitored for performance as part of a collaborative effort between ESnet, NERSC, and ORNL to enable high-performance data movement over ESnet's high-bandwidth 10 Gb/s wide-area network (WAN).

Configuration

There are two data transfer nodes deployed at NERSC. Each node has two dual-core AMD Opteron processors running at 3.0 GHz, 8 GB of shared memory, two Fibre Channel interfaces (4 Gb/s each), and two Ethernet interfaces (10 Gb/s each). The nodes run CentOS 5.2, a Red Hat Enterprise Linux derivative. Third-party software is managed via modules. As with other NERSC systems, shell configuration files ("dot files") are under the control of NERSC; users should modify only the ".ext" files.

The NERSC nodes are named dtn01.nersc.gov and dtn02.nersc.gov. ORNL has deployed similar data transfer nodes named dtn01.ccs.ornl.gov and dtn02.ccs.ornl.gov.

Accounts

The data transfer nodes are available to all NERSC users; if you are already a NERSC user, you do not need to request an account. The nodes support both interactive use via SSH (direct login) and data transfer via GridFTP services (third-party transfers).

File Storage

The NERSC data transfer nodes use two primary file systems: /global/homes and /project/projectdirs. Both file systems are components of NGF. Note that there is no /scratch file system, and /tmp is very small. Although certain common tools (e.g., vi) use /tmp for temporary storage, users should never explicitly use /tmp for data.
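One way to avoid /tmp is to create a private staging directory under a project directory. The following is a minimal sketch, not a NERSC-provided tool: `make_staging_dir` is a hypothetical helper name, and the project path in the comment is only an illustration.

```shell
#!/bin/sh
# Create a private staging directory under a caller-supplied base directory
# (for example, a project directory) instead of writing data to /tmp.
# make_staging_dir is a hypothetical helper, not part of any NERSC software.
make_staging_dir() {
    base=$1
    # mktemp -d creates the directory with owner-only permissions
    # and prints its path on standard output.
    mktemp -d "$base/staging.XXXXXX"
}

# Illustrative use (the path is an assumption):
#   staging=$(make_staging_dir /project/projectdirs/bigsci)
```

The helper only creates the directory; removing it when the transfer is finished is left to the caller.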

Recommended File Transfer Software

The following transfer software is available on the data transfer nodes. See the Examples below for recommended tuning options.

scp

Secure copy (scp) provides encrypted data copy capability. It is easy to use and provides good status information during the transfer, but does not provide any transfer parallelism. It does not provide tuning options for WAN transfers. Because of these issues, it is most suitable for transferring a small number of relatively small files.

When transferring data between NERSC and ORNL, you will get your best performance using the data transfer nodes at each site. However, due to firewall rules at ORNL, such transfers must be initiated from the ORNL side.

Within NERSC, scp typically achieves transfer rates of about 30 MB/s; between NERSC and ORNL, this drops to around 5 MB/s.
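As a rough illustration of the point about initiating transfers from the ORNL side, the sketch below builds an scp command that pulls a single file from a NERSC data transfer node. It prints the command for review rather than running it. The helper name `scp_pull_cmd`, the user name, and the file path are assumptions made for this example.

```shell
#!/bin/sh
# Print (do not run) an scp command that pulls one file from dtn01.nersc.gov.
# Intended to be run on the ORNL side, since such transfers must start there.
# scp_pull_cmd, the user name, and the paths are illustrative assumptions.
scp_pull_cmd() {
    user=$1
    remote_path=$2
    local_dir=$3
    printf 'scp %s@dtn01.nersc.gov:%s %s\n' "$user" "$remote_path" "$local_dir"
}

scp_pull_cmd elvis /project/projectdirs/bigsci/small_file.dat .
```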

bbcp

BaBar copy (bbcp) is a parallel transfer tool with easy (encrypted) authentication via SSH. It provides parallel (unencrypted) data streams and offers many options for performance tuning. It uses a "peer-to-peer" model: when you execute the command on a local machine, it must start a copy of itself on the remote machine. This can make the command line rather messy.

Within NERSC, bbcp will typically achieve transfer rates of about 200 MB/s; similar rates can be achieved between NERSC and ORNL.

globus-url-copy

Globus-url-copy is a GridFTP client that provides good transfer performance between NERSC filesystems, NERSC HPSS, and ORNL. It requires tuning options to achieve the best performance. It supports third-party transfers and is easy to script; this is the client of choice for grid-based workflows. As a Grid tool, it uses certificates for authentication. The command line can be rather messy.

Within NERSC, globus-url-copy will typically achieve transfer rates of about 200 MB/s; similar rates can be achieved between NERSC and ORNL.

hsi

Hierarchical Storage Interface (hsi) provides the best transfer rates to or from NERSC HPSS and is pretuned for optimal performance. Authentication is via HPSS tokens.

On the data transfer nodes, hsi can achieve transfer rates of 400 MB/s between /project and NERSC HPSS. Due to current limitations in hsi, performance between NERSC and ORNL HPSS is limited to about 10 MB/s.
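A minimal sketch of storing a file from /project in HPSS with hsi, using hsi's `put local : remote` form. The helper prints the command instead of running it; `hsi_put_cmd` is a hypothetical name, and the file names are borrowed from the other examples on this page purely for illustration.

```shell
#!/bin/sh
# Print (do not run) an hsi command that stores a local file in HPSS.
# hsi_put_cmd is a hypothetical helper; the file names are illustrative.
hsi_put_cmd() {
    local_file=$1
    hpss_file=$2
    printf 'hsi "put %s : %s"\n' "$local_file" "$hpss_file"
}

hsi_put_cmd /project/projectdirs/bigsci/file_from_project.tar file_to_hpss.tar
```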

Summary of file transfer software
Client Name      Notes                                                     NERSC (LAN)  ORNL (WAN)
scp              easy to authenticate; best transfer status                30 MB/s      5 MB/s
bbcp             easy to authenticate; parallel transfers; tuning options  200 MB/s     200 MB/s
globus-url-copy  best for scripting; parallel transfers; tuning options    200 MB/s     200 MB/s
hsi              best HPSS transfers; parallel transfers                   400 MB/s     10 MB/s
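
To put the rates in this summary in perspective, the sketch below estimates how long a transfer would take at a given sustained rate. It is simple arithmetic only: `estimate_seconds` is a hypothetical helper, the 100 GB size is an arbitrary example, and real transfers will vary with file count, file system load, and network conditions.

```shell
#!/bin/sh
# Estimate transfer time in whole seconds (rounded up) for a data size
# given in MB and a sustained rate given in MB/s.
# estimate_seconds is a hypothetical helper for illustration only.
estimate_seconds() {
    size_mb=$1
    rate_mb_s=$2
    echo $(( (size_mb + rate_mb_s - 1) / rate_mb_s ))
}

# 100 GB (102400 MB) at two of the typical WAN rates listed in the summary
estimate_seconds 102400 5     # scp: prints 20480 (about 5.7 hours)
estimate_seconds 102400 200   # bbcp or globus-url-copy: prints 512
```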

Examples of Recommended Options

Between NERSC resources

/project and HPSS

% globus-url-copy -p 4 -tcp-bs 4MB gsiftp://dtn01.nersc.gov/project/projectdirs/bigsci/file_from_project.tar gsiftp://garchive.nersc.gov/home/e/elvis/file_to_hpss.tar

/project and /scratch

% bbcp -T "ssh -x -a -oFallBackToRsh=no %I -l %U %H /usr/common/usg/bin/bbcp" /project/projectdirs/bigsci/file_from_project.tar franklingrid.nersc.gov:/scratch/scratchdirs/elvis/file_to_scratch.tar

% globus-url-copy -p 4 -tcp-bs 4MB gsiftp://dtn01.nersc.gov/project/projectdirs/bigsci/file_from_project.tar gsiftp://franklingrid.nersc.gov/scratch/scratchdirs/elvis/file_to_scratch.tar

Between NERSC and ORNL

Globus-url-copy provides good parallel transfer performance between the NERSC and ORNL data transfer nodes. Each data transfer node can handle a separate grid transfer, so users can stripe transfers across the available network bandwidth of both data transfer nodes at each site.

% globus-url-copy -p 4 -tcp-bs 12MB gsiftp://dtn01.nersc.gov/project/projectdirs/bigsci/file_from_project.tar gsiftp://dtn01.ccs.ornl.gov/lustre/wolf-ddn/scratch/elvis/file_to_global_scratch.tar
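
One possible way to spread several files over both node pairs is sketched below. It prints one globus-url-copy command per file, alternating between dtn01 and dtn02, so the plan can be reviewed before anything runs. Pairing dtn01 with dtn01 and dtn02 with dtn02 is an assumption of this sketch, as are the helper name `plan_striped_transfers` and the file names; the tuning options are copied from the command above.

```shell
#!/bin/sh
# Print (do not run) one globus-url-copy command per file, alternating
# between the dtn01 and dtn02 node pairs so transfers use both nodes.
# plan_striped_transfers is a hypothetical helper; the node pairing
# (dtn01 to dtn01, dtn02 to dtn02) is an assumption of this sketch.
plan_striped_transfers() {
    src_dir=$1
    dst_dir=$2
    shift 2
    i=0
    for f in "$@"; do
        n=$(( i % 2 + 1 ))   # 1, 2, 1, 2, ...
        printf '%s %s %s\n' \
            "globus-url-copy -p 4 -tcp-bs 12MB" \
            "gsiftp://dtn0${n}.nersc.gov${src_dir}/${f}" \
            "gsiftp://dtn0${n}.ccs.ornl.gov${dst_dir}/${f}"
        i=$(( i + 1 ))
    done
}

plan_striped_transfers /project/projectdirs/bigsci \
    /lustre/wolf-ddn/scratch/elvis a.tar b.tar c.tar
```

The printed commands for the two node pairs can then be run at the same time, for example from two separate sessions.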

The following command is executed on a system at ORNL:

% bbcp -w 100M -T "ssh -x -a -oFallBackToRsh=no %I -l %U %H /usr/common/usg/bin/bbcp" /lustre/wolf-ddn/scratch/elvis/file_from_global_scratch.tar franklingrid.nersc.gov:/scratch/scratchdirs/elvis/file_to_scratch.tar

Current Transfer Rates Between NERSC and ORNL

The graphs in this section show the results of regular transfer tests (four per day) between NERSC and ORNL. They indicate the performance you should expect when transferring between NERSC /project and ORNL's global file system using the data transfer nodes.

Help

For assistance with the data transfer nodes, please contact NERSC Consulting.

Page last modified: Mon, 11 Jan 2010 21:46:03 GMT
Page URL: http://www.nersc.gov/nusers/systems/datatran/
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov