HSI is a flexible and powerful command-line utility for accessing the NERSC HPSS storage systems. As with FTP, you can use it to store and retrieve files, but it has a much larger set of commands for listing your files and directories, creating directories, changing file permissions, etc. The command set has a UNIX look and feel (e.g. mv, mkdir, rm, cp, cd) so that moving through your HPSS directory tree is almost identical to working on a UNIX file system. HSI can be used either interactively or in batch scripts.
The HSI utility is available on all NERSC production computer systems and it has been configured on these systems to use high-bandwidth parallel transfers. If you want to use HSI to access NERSC storage from a remote site, you will have to install a compatible version of the software on that machine. See the Software Downloads page for instructions and access to pre-compiled binaries.
Using HSI from a NERSC Production System
All of the NERSC computational systems available to users have the hsi client already installed. To access the Archive storage system, type hsi with no arguments:
% hsi
That is, the utility is set up to connect to the Archive system by default. This is equivalent to typing:
% hsi -h archive.nersc.gov
To access Hpss (the NERSC Backup system), you must specify the name:
% hsi -h hpss.nersc.gov
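The choice between the two systems can be wrapped in a small shell function. The following is a minimal sketch, not part of hsi itself: the function name nersc_hsi and the short names "archive" and "backup" are hypothetical, and only the -h option and the two host names come from the text above.

```shell
# Hypothetical wrapper (not part of hsi): choose the storage system by a short name.
nersc_hsi() {
    _system="$1"
    shift
    case "$_system" in
        archive) _host="archive.nersc.gov" ;;   # the Archive system (the default)
        backup)  _host="hpss.nersc.gov" ;;      # the Backup system
        *)
            echo "nersc_hsi: unknown system '$_system'" >&2
            return 2
            ;;
    esac
    # Pass any remaining arguments (for example a quoted command list) on to hsi.
    hsi -h "$_host" "$@"
}
```

With this sketch, nersc_hsi backup "ls" would run hsi -h hpss.nersc.gov "ls".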
Using HSI from a remote system
If you or your system administrator installed one of the pre-compiled hsi binaries distributed by NERSC on your workstation or cluster, hsi will, by default, connect to the Archive system. That is, access is the same as if using it from one of the NERSC computational systems.
If you are attempting to access the NERSC storage systems with hsi from another computer center that has installed it to access its own HPSS storage, the connection will likely fail.
You can run hsi commands in several different ways:
|From a command line:||% hsi|
|Single-line execution:||% hsi "mkdir run123; cd run123; put bigdata.0311"|
|Read commands from a file:||% hsi "in command_file"|
|Read commands from standard input:||% hsi < command_file|
|Read commands from a pipe:||% cat command_file | hsi|
Just typing hsi will enter an interactive command shell, placing you in your home directory on the Archive system. From this shell, you can run the ls command to see your files, cd into storage system subdirectories, put files into the storage system and get files from it.
If using the single-line execution method, you must quote the argument with commands separated by semicolons.
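When a script assembles the command list, the quoting and semicolons are easy to get wrong. One possible approach, sketched here with a hypothetical helper named join_hsi_commands, is to build the semicolon-separated string first and pass it to hsi as a single quoted argument.

```shell
# Hypothetical helper: join its arguments into one semicolon-separated
# string suitable for single-line hsi execution.
join_hsi_commands() {
    _joined=""
    for _cmd in "$@"; do
        if [ -z "$_joined" ]; then
            _joined="$_cmd"
        else
            _joined="$_joined; $_cmd"
        fi
    done
    printf '%s\n' "$_joined"
}

# Example use (requires hsi to be installed):
#   hsi "$(join_hsi_commands "mkdir run123" "cd run123" "put bigdata.0311")"
```

The double quotes around the command substitution keep the whole list as one argument, which is what the single-line form requires.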
Finally, you can place the hsi commands you want to execute in a text file and have hsi read and execute them. This can be done in several ways: by having hsi read the commands from standard input or a pipe, or by using the 'in' command. See the hsi man page for details.
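As an illustration of the command-file approach, the sketch below writes a command file and then runs it with the 'in' command. The helper names write_hsi_commands and run_hsi_commands are hypothetical, and the sketch assumes one hsi command per line and a file name without spaces.

```shell
# Hypothetical helper: write one hsi command per line to a command file.
write_hsi_commands() {
    _file="$1"
    shift
    : > "$_file"                      # create the file, or truncate an existing one
    for _cmd in "$@"; do
        printf '%s\n' "$_cmd" >> "$_file"
    done
}

# Hypothetical helper: have hsi read and execute the file via the 'in' command.
run_hsi_commands() {
    hsi "in $1"
}

# Example use (requires hsi to be installed):
#   write_hsi_commands command_file "cd run123" "put myfile"
#   run_hsi_commands command_file
```

The same file could instead be fed to hsi on standard input, as in hsi < command_file.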
Specifying local and HPSS file names when storing or retrieving files
The HSI put command stores files from your local file system into HPSS and the get command retrieves them. The command:
% put myfile
will store the file named "myfile" from your current local directory in a file of the same name in your current HPSS directory. So, to store "myfile" in the "run123" subdirectory of your HPSS home directory, you can type the following in an interactive session:
A:/home/j/joeuser-> cd run123
A:/home/j/joeuser-> put myfile
Or, as a single-line command:
% hsi "cd run123; put myfile"
The hsi utility uses a special syntax to specify local and HPSS file names when using the put and get commands:
- The local file name is always on the left and the HPSS file name is always on the right.
- Use a ":" (colon character) to separate the names.
% put local_file : hpss_file
% get local_file : hpss_file
This format is convenient if you want to store a file named "foo" in the local directory as "foo_2010_09_21" in HPSS:
% hsi "put foo : foo_2010_09_21"
You can also use this method to specify the full or relative pathnames of files in both the local and HPSS file systems:
% hsi "get /scratch2/scratchdirs/joeuser/analysis/data : bigcalc/hopper/run123/datafile.0211"
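The renaming example can be automated so that the date suffix is generated rather than typed. This is a minimal sketch: the helper name dated_put_command is hypothetical, and it only builds the command string using the "local : HPSS" syntax described above, leaving the call to hsi to the caller.

```shell
# Hypothetical helper: build a put command that stores a local file under
# its own base name plus a date suffix (YYYY_MM_DD) in the current HPSS directory.
dated_put_command() {
    _local_file="$1"
    # Use the date given as the second argument, or today's date by default.
    _stamp="${2:-$(date +%Y_%m_%d)}"
    # Local name on the left of the colon, HPSS name on the right.
    printf 'put %s : %s_%s\n' "$_local_file" "$(basename "$_local_file")" "$_stamp"
}

# Example use (requires hsi to be installed):
#   hsi "$(dated_put_command foo)"
```

Called as dated_put_command foo 2010_09_21, the helper prints put foo : foo_2010_09_21, matching the example above.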
Frequently used commands
HSI has a rich command set, but most users will be able to get by with knowing a small subset. Here are some frequently used commands categorized by function:
HPSS File and Directory Commands
|cd||Change current directory|
|get, mget||Copy one or more HPSS-resident files to local files|
|cget||Conditional get - get the file only if it doesn't already exist locally|
|cp||Copy a file within HPSS|
|rm, mdelete||Remove one or more files from HPSS|
|ls||List a directory|
|put, mput||Copy one or more local files to HPSS|
|cput||Conditional put - copy the file into HPSS unless it is already there|
|pwd||Print current directory|
|mv||Rename an HPSS file|
|mkdir||Create an HPSS directory|
|rmdir||Delete an HPSS directory|
Local File and Directory Commands
|lcd||Change local directory|
|lls||List local directory|
|lmkdir||Make a local directory|
|lpwd||Print current local directory|
|command||Issue shell command|
File and Directory Administration Commands
|chmod||Change permissions of file or directory|
|chgrp||Change group ownership for a file or directory|
Miscellaneous HSI Commands
|help||Display help information|
|quit, exit, end||Terminate HSI|
|in||Read commands from a local file|
|out||Write HSI output to a local file|
|log||Write all HSI commands and responses to a local log file|
|prompt||Toggle HSI prompting for mget, mput, and mdelete|
For full HSI documentation see http://www.mgleicher.us/GEL/hsi/. Note that not all options are implemented at NERSC.
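Several of these commands are often combined in one session. The sketch below pipes a short command list to hsi, using lcd, cd, cput, and ls from the tables above. The function name store_new_files is hypothetical, and the sketch assumes the HPSS directory already exists.

```shell
# Hypothetical helper: store files from a local directory into an existing
# HPSS directory, using cput so files already in HPSS are not copied again.
store_new_files() {
    _local_dir="$1"
    _hpss_dir="$2"
    shift 2
    {
        printf 'lcd %s\n' "$_local_dir"    # change the local directory
        printf 'cd %s\n' "$_hpss_dir"      # change the HPSS directory
        for _f in "$@"; do
            printf 'cput %s\n' "$_f"       # conditional put of each file
        done
        printf 'ls\n'                      # list the HPSS directory afterwards
    } | hsi                                # hsi reads the commands from the pipe
}

# Example use (requires hsi to be installed):
#   store_new_files /scratch2/scratchdirs/joeuser/analysis run123 out.dat log.txt
```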
Example script for automated transfers
Batch script examples
In the following example, the batch job performs three actions:
- retrieve the application and input files from HPSS to a local scratch file system
- run the application
- save the output files to HPSS.
Note that when saving the output files, the script uses a shell "here document" to specify the hsi commands to run.
# move to the local scratch directory where the job will be run
# extract the application binary and input file from HPSS
hsi "cd my_app_dir; get bigcode; get infile"
# run the application
aprun -n 1000 -N 2 ./bigcode infile
# define a shell variable with the current date in YYYY-MM-DD format
# save the input file and the output files to a new HPSS directory.
hsi -h archive.nersc.gov <<EOF