New Users Guide

HPSS stands for High Performance Storage System and is the general term for software that can be used to store data on robotic tape libraries. At NERSC the primary HPSS system for data storage is called "HPSS User" or just "HPSS" and is accessed at archive.nersc.gov. By default every NERSC user has an account on the HPSS User system. There is also another, smaller HPSS system at NERSC called "HPSS Backup" (or "regent") that can be accessed at hpss.nersc.gov. This is used primarily by NERSC staff for system backups and occasionally by users in special cases.

Accessing HPSS

You can access HPSS from any NERSC system. HPSS uses NIM and the NERSC LDAP server to create an "hpss token" for user authentication. On a NERSC system, typing "hsi" or "htar" will usually be enough to create this token. However, some more complicated use cases may require you to manually generate a token. Please see the HPSS Passwords page for more details.

Files can be archived to HPSS individually with the "hsi" command or in groups with the "htar" command (similar to the way "tar" works). HPSS is also accessible via ftp, pftp, gridFTP, and Globus. Please see the Accessing HPSS page for a list of all possible ways to access HPSS and details on their use.
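As a rough illustration of the two commands mentioned above, the following Python sketch only builds the command lines and does not run them. The helper names (`htar_create_command`, `hsi_put_command`, `render`) and the file names in the usage note are hypothetical; the sketch assumes the commonly documented forms `htar -cvf <archive> <paths>` and `hsi "put <local> : <remote>"`, and that paths contain no spaces. Check the Accessing HPSS page for the authoritative syntax.

```python
import shlex


def htar_create_command(archive_path, members):
    """Return the argv list for bundling `members` into one HPSS archive with htar.

    Assumes the tar-like form `htar -cvf ARCHIVE PATH...`.
    """
    if not members:
        raise ValueError("htar needs at least one file or directory to archive")
    return ["htar", "-cvf", archive_path, *members]


def hsi_put_command(local_path, hpss_path):
    """Return the argv list for storing a single file with hsi.

    Assumes the form `hsi "put LOCAL : REMOTE"`, with the whole hsi
    command passed as one argument.
    """
    return ["hsi", f"put {local_path} : {hpss_path}"]


def render(argv):
    """Format an argv list as a shell command line for display."""
    return shlex.join(argv)
```

For example, `render(htar_create_command("run42.tar", ["run42"]))` gives a line that could be pasted into a shell on a NERSC system; to actually run it, the argv list could be passed to `subprocess.run`.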

HPSS can also be accessed from non-NERSC systems after installing NERSC-supported software. See this page for details.

Best Usage of HPSS

Group Small Files Together

HPSS is optimized for file sizes in the hundreds of GB. If you need to store many small files, please use htar or bundle them together with tar before storing. Storing many small files in HPSS using hsi or Globus will result in extremely long retrieval times for these files and will slow down the HPSS system for all users.

Please see this section for more details on how to deal with small files and the Htar Usage page for more details on how to use htar.
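One way to bundle small files before storing them is sketched below, using Python's standard `tarfile` module in place of the `tar` command. The function name `bundle_directory` and the directory and archive names in the usage note are hypothetical; this is a minimal sketch of local bundling only, and the resulting archive would still need to be stored in HPSS (for example with hsi). When htar is available it does the bundling and storing in one step.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack every regular file under `source_dir` into one uncompressed tar file.

    Member names are stored relative to the parent of `source_dir`, so the
    archive unpacks into a single top-level directory. Returns the number
    of files added.
    """
    source = Path(source_dir).resolve()
    # Collect the file list before creating the archive, so the archive
    # itself is never picked up if it is written inside source_dir.
    files = sorted(p for p in source.rglob("*") if p.is_file())
    with tarfile.open(archive_path, "w") as tar:
        for path in files:
            tar.add(path, arcname=str(path.relative_to(source.parent)))
    return len(files)
```

For example, `bundle_directory("run42", "run42.tar")` would produce a single `run42.tar` containing all the files under `run42`, which is then stored as one HPSS file instead of many.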

Very Large Files

File sizes greater than 1 TB can be difficult for HPSS to work with and can lead to longer transfer times, increasing the possibility of transfer interruptions. Generally it's best to aim for file sizes in the hundreds of GB. Please see this section for more details.
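To aim for archives in the hundreds-of-GB range, a dataset can be split into several bundles rather than one very large file. The sketch below shows one simple greedy grouping, assuming the file sizes are already known; the function name `plan_bundles` and the 500 GB target in the usage note are illustrative choices, not NERSC recommendations.

```python
def plan_bundles(file_sizes, target_bytes):
    """Greedily group files into bundles of at most `target_bytes` each.

    `file_sizes` is an iterable of (name, size_in_bytes) pairs, taken in
    order. A single file larger than the target still gets a bundle of its
    own, since this sketch does not split individual files.
    Returns a list of bundles, each a list of names.
    """
    bundles = []
    current, current_size = [], 0
    for name, size in file_sizes:
        # Close the current bundle if adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            bundles.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        bundles.append(current)
    return bundles
```

With a target of `500 * 10**9` bytes, four files of 200, 200, 200, and 50 GB would be grouped into two bundles of 400 GB and 250 GB, each of which could then be archived separately.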

Firewalls and Accessing HPSS Remotely

When storing and retrieving files from outside of NERSC, it is not uncommon to encounter problems due to firewalls at the client site. Often you will have to configure your client firewall to allow connections to HPSS. See the HPSS firewall page for more details.

HPSS Usage Charging

In order to provide a balanced computing environment with appropriate amounts of storage and adequate bandwidth to keep the compute engines fed with data, HPSS usage is tracked using Storage Resource Units (SRUs). SRUs are reported and managed through the NERSC Information Management (NIM) system. For details on how the SRUs are calculated and managed please see the HPSS Charging page.

Troubleshooting and Further Questions

Some of the more common issues encountered by users accessing HPSS are described here. If you run into any issues or have any questions, please don't hesitate to ask for help.