This section contains information about the following topics:
- How elements that make up an HPSS tree fit together.
- How directories can be organized to track tasks and projects.
- How to use paths to access specific nodes in the structure.
- How to use working directories to make it simpler and more convenient to access specific nodes.
- How to use different names for HPSS files on a local computer.
- How to use wild cards.
- What information each type of node contains.
HPSS supports
a hierarchical storage structure called a tree. Before you can store any files
in an HPSS tree, you must first have write access to your home directory. If
you need to save only a few files, you will not need to learn much about tree
structures. You can use a simple tree with your home directory as the root
as shown in the following example.
You can establish this simple type of root directory (that is, your
home directory) by entering
Note that this is normally not necessary, because the HPSS administrator
usually creates your home directory when your account is established.
Once the root (i.e., your home directory) is established, you can store files on HPSS by entering
You can retrieve
files by entering
If you want
to use the HPSS tree structure for organizing your files, then you should
have a working knowledge of the following:
- A tree
is made from nodes connected by branches that indicate the hierarchy.
- The nodes
may be directories, subdirectories, or file descriptors.
- The origin
of the tree is the root node or root directory "/".
- Branching
out from the root are descendant nodes, which may be subdirectories or
file descriptors.
- The term
directory node refers to a subdirectory. Only directory nodes may have
descendants.
- A directory
contains a list of all its immediate descendant nodes.
- A subtree
is any node in a tree and all of its descendants.
- The parent
of a node is at the next higher level in the tree. The root node "/" has
no parent.
- A home
directory is the starting point within the tree at which a particular
user's subtree begins.
- A file
descriptor contains information about a file, such as the actual location
of the file itself in storage. A file descriptor has no descendants.
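The terms above can be made concrete with a small model. The following Python sketch is illustrative only: the `Node` class and its methods are hypothetical names invented for this example and are not part of HSI or HPSS. It shows directory nodes holding their immediate descendants, file descriptors refusing descendants, and a subtree being a node plus everything below it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node in the tree: a directory or a file descriptor (hypothetical model)."""
    name: str
    is_dir: bool
    parent: Optional["Node"] = None
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add(self, name: str, is_dir: bool) -> "Node":
        # Only directory nodes may have descendants.
        if not self.is_dir:
            raise ValueError(f"{self.name!r} is a file descriptor and cannot have descendants")
        child = Node(name, is_dir, parent=self)
        self.children[name] = child
        return child

    def path(self) -> str:
        # Walk up through the parents to the root to build the complete path.
        if self.parent is None:
            return "/"
        prefix = self.parent.path()
        return prefix + self.name if prefix == "/" else prefix + "/" + self.name

    def subtree(self) -> List["Node"]:
        # A subtree is a node together with all of its descendants.
        nodes = [self]
        for child in self.children.values():
            nodes.extend(child.subtree())
        return nodes


root = Node("", is_dir=True)  # the root node "/" has no parent
home = root.add("u", True).add("foo", True)
run = home.add("flowcode", True).add("version1", True).add("run", True)
t396 = run.add("t396", is_dir=False)

print(t396.path())                      # /u/foo/flowcode/version1/run/t396
print([n.name for n in run.subtree()])  # ['run', 't396']
```

The home directory `/u/foo` is simply the node at which one user's subtree begins; nothing else distinguishes it in this model.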
To organize multiple
files and projects, you might use a complex tree with a home directory and
several subdirectories with files branching from them. The following example
shows you how these HSI commands are used to build a portion of a complex
tree structure:
- Use the
MD command (or one of its
Aliases, ADD or MKDIR)
to add subdirectory nodes to the home directory and to other subdirectories.
- Finally,
use the SAVE and STORE commands to add file descriptor nodes as files
are transferred to HPSS.
Create the
subdirectory "flowcode" within the home directory (/u/foo in this example,
for a user named `foo'):
Change to the
"flowcode" subdirectory and add the subdirectories "version1" and "version2"
cd flowcode; MD version1; MD version2
Change to the
subdirectory "version1" and add subdirectory "run"
Save the files
"t396" and "t425"
Change to the
subdirectory "version2" and add the subdirectory "run"
cd /u/foo/flowcode/version2; MD run
Save the file
T318 under the "version2" subdirectory
In the example,
- The directory
node "version1" has "flowcode" as a parent and "run" as an immediate descendant;
- The subtree
of "version1" consists of the directory node "run", which in turn contains
the file nodes "t396" and "t425";
- The directory
node "version2" has "flowcode" as a parent and "run" as an immediate descendant;
- The subtree
of "version2" consists of the directory node "run" and the file node "t318".
The previous
example shows you how multiple software development projects might be organized,
with multiple versions of source code, output, etc. Under the directory
node, FLOWCODE, separate versions of the source code and executable code
are kept along with input files. Note that nodes under different subdirectories
may have the same names because it is only the complete path that must be
unique.
It is also
possible to create directory nodes in other than the home directory, usually
as a means of sharing files among a group of persons working on a common
project. See Section 6 - Security and File Sharing for more information.
Should your
projects change in size or scope, you can reorganize tree structures:
- The MV
command is a powerful tool for changing your HPSS tree structures from
one form to another. This command allows you to shift subtrees both within
a tree and between trees in an almost unlimited way. See Section 7, Commands,
Keywords, and Parameters, for a detailed description of MV.
- The LN
command allows you to indirectly reference files and directories via symbolic
links.
A path is a sequence of node names, each preceded by the forward slash (/) character:
/subdirectory/subdirectory/filename
To refer to
a file node or directory node in an HPSS tree, you must specify the name
of the complete path starting from the root node down.
- If the
first character of a path is a slash (/) the path is considered the complete
path to the HPSS file or directory.
- If the
first character of a path is not a slash, HSI constructs a complete path
by prefixing the path with the contents of one of the working directories
(see Working Directories in this section).
The maximum
length of each node name is 255 characters. The maximum length of a path,
including slashes, is 1023 characters. See Path Parameter in Section 7 for
further information.
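The two length limits can be checked before a request is issued. The sketch below is a minimal illustration, assuming that lengths are counted in characters as stated above; `check_path` is a hypothetical helper, not an HSI function.

```python
from typing import List

MAX_NODE_NAME = 255  # maximum length of each node name
MAX_PATH = 1023      # maximum length of a path, including slashes


def check_path(path: str) -> List[str]:
    """Return a list of limit violations for an HPSS-style path (empty if none)."""
    problems = []
    if len(path) > MAX_PATH:
        problems.append(f"path is {len(path)} characters; the limit is {MAX_PATH}")
    for name in path.split("/"):
        if len(name) > MAX_NODE_NAME:
            problems.append(f"node name starting {name[:20]!r} exceeds {MAX_NODE_NAME} characters")
    return problems


print(check_path("/u/foo/flowcode/version2/run"))  # []
```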
The path may
point to a root directory, subdirectory, or file node, depending on the
function of the command. If the path includes a file, it will be the rightmost
node name in the path. In the preceding example, the complete path to subdirectory
RUN is
/u/foo/flowcode/version2/run
(in this
example, user foo's home directory is /u/foo)
The complete
path to the file T318 is
/u/foo/flowcode/version2/run/t318
5.3 UNIX-style Pathname Notation
The following UNIX-style prefix notation may be used in any localfile or HPSS pathname:
| . | A single period refers to the current directory. |
| .. | A double period refers to the parent of the current working directory. Multiple levels in the tree may be traversed by repeated specification of "..", e.g., "ls ../../mydir". |
| ~ | Refers to the home directory for the user. |
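As an illustration of how these prefixes combine, the following Python sketch resolves ".", "..", and "~" lexically with the standard `posixpath` module. It is an approximation under stated assumptions: `expand` is a hypothetical helper, the current directory and home directory are passed in explicitly, and symbolic links are ignored. HSI's own handling may differ in details.

```python
import posixpath


def expand(path: str, cwd: str, home: str) -> str:
    """Resolve '.', '..' and a leading '~' against cwd and home (lexical only)."""
    if path == "~" or path.startswith("~/"):
        path = home + path[1:]
    # join() leaves an absolute path unchanged; normpath() collapses '.' and '..'.
    return posixpath.normpath(posixpath.join(cwd, path))


cwd = "/u/foo/flowcode/version1"
print(expand(".", cwd, "/u/foo"))            # /u/foo/flowcode/version1
print(expand("../../mydir", cwd, "/u/foo"))  # /u/foo/mydir
print(expand("~/notes", cwd, "/u/foo"))      # /u/foo/notes
```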
HSI provides working
directories to eliminate the need to type the complete path every time you
refer to an HPSS file or directory. Working directories allow you to store
all or part of a path to be used with later requests using a shorthand notation
in place of the complete path. Paths can be stored in each of ten working
directories (DIRn)
- for the
duration of a single request,
- for the
duration of your current HSI session, or
- as part
of a keyset for long-term use, using a KEEP request.
The contents
of a working directory are set using the DIRn keyword, where n may
be 0 through 9 (the ten working directories being named DIR0, DIR1, ...DIR9).
These may be abbreviated as D0, D1, ...D9.
- DIR0 is
the primary, or default, working directory and may be referred to as DIR
or D0. The primary directory's use is very important in HPSS. See the
Working Directory Rules of this section for the benefits of using the
primary working directory.
- DIR0 is
set to your home directory as a root path (for example, "/u/foo") when
you start HSI.
- The other
nine working directories are empty when you start HSI.
How you
set DIRn is important.
- If the
value of a working directory (DIRn) is set with a SET command, it will
be in effect for the duration of your session with HSI or until changed
by another SET or by an ADOPT command.
- If DIRn
is used within any other type of HSI request, it will be set for the duration
of that request only.
The n prime
notation (n') uses a DIRn path value, where n is the number of
the working directory in which the path was stored. For example, you can
set DIR2 by entering
dir2 = /u/dobbs/dirname     (a SET request)
You can then
use DIR2 in a request as follows:
This request
will retrieve the file /u/dobbs/dirname/fileit from HPSS.
Though a working
directory setting is in effect until reset or until you end HSI, one can
be temporarily overridden by setting it in a request. In the following example
DIR2 is set to a different subdirectory for the duration of the GET request
only.
get 2'test1 2'test9 dir2=/u/foo/output
Note that
after this GET request, DIR2 is still set to /u/dobbs/dirname.
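The difference between a session-level setting and a per-request override can be modelled as follows. This Python sketch is only a model of the behaviour described above: the `Session` class is hypothetical, and it handles only names written in the n'name form.

```python
from typing import Dict, List


class Session:
    """Toy model of the ten working directories during one HSI session."""

    def __init__(self, home: str) -> None:
        self.dirs: Dict[int, str] = {0: home}  # DIR0 starts as the home directory

    def set(self, n: int, path: str) -> None:
        # A SET stays in effect for the rest of the session.
        self.dirs[n] = path

    def request(self, names: List[str], **overrides: str) -> List[str]:
        # A DIRn given inside a request applies to that request only,
        # so work on a copy and leave the session values untouched.
        effective = dict(self.dirs)
        for key, value in overrides.items():
            effective[int(key[-1])] = value
        return [effective[int(name[0])] + "/" + name[2:] for name in names]


s = Session("/u/foo")
s.set(2, "/u/dobbs/dirname")
print(s.request(["2'fileit"]))
# ['/u/dobbs/dirname/fileit']
print(s.request(["2'test1", "2'test9"], dir2="/u/foo/output"))
# ['/u/foo/output/test1', '/u/foo/output/test9']
print(s.dirs[2])
# /u/dobbs/dirname
```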
5.5 Working Directory Rules
Important rules for using working directories and interpreting paths are summarized in the
following table. Details about each of the rules follow the table.
| Path format | Description |
| /path | Complete path (starts with a slash) of the form: /sub1/sub2/...filename |
| Using Primary Working Directory (DIR) |
| path | Consists of node names to be added to the primary working directory DIR to form the complete path. |
| - or blank | Uses complete path taken entirely from primary working directory DIR. Blank is valid only for LIST. |
| -k | Forms the complete path using the primary working directory DIR, removing the rightmost k nodes. |
| Using Any Working Directory (DIR0-DIR9) and the prime notation (n'), for n = 0 to 9 |
| n'path | Consists of node names to be added to the working directory DIRn, to form the complete path. |
| n' | Uses the complete path taken entirely from DIRn. |
| n'-k | Forms complete path using DIRn and removes the rightmost k nodes. |
5.6 Pathname Resolution Rules
1. If the first character of an input path is a slash (/), the path is taken to
be the complete path to the file or directory.
2. If the first character of an input path is not a slash, then a path will be formed
with a working directory, using one of several shorthand notations defined
in the following rules to form the complete path.
3. If the first characters are 0', 1',...,9' (the n' notation), then HSI uses that
working directory value for the beginning part of the path. For example,
set d6 = /first/second then save 6'third becomes save /first/second/third
4. If the first digits follow the -k format (-1, -2, etc.), then the primary working
directory (DIR0) is used with the rightmost k nodenames removed. For example,
set dir0 = /abcd/subd/subsubd then list -2 becomes list /abcd
5. Paths of the form n'-k combine the effects of rules
3 and 4, where the rightmost k nodenames are removed from the path in
DIRn. To repeat the preceding example using DIR3: set dir3 = /abcd/subd/subsubd
then list 3'-2 becomes list /abcd
6. If the path is just a minus sign (-), then the complete path is the path currently
in the primary working directory (DIR0). An exception to this rule occurs
when you assign a value to a working directory (DIRn=-). In this case,
the minus sign means that the system default value is assigned. For DIR1-9,
this clears any keyword value and for DIR0 resets it to your home directory.
A blank in a working directory assignment is not allowed. If no path is
specified with certain commands, the contents of the primary working directory
is used as the path. However, certain commands require that a path be
explicitly stated.
7. When none of the previous conditions is met, the primary working directory DIR0
is used, which is the simplest and most common case and is what makes
the primary working directory so useful. You can GET and SAVE files by
using filenames alone, because by default, you use the contents of the
primary working directory, and, by default, it contains your home directory.
You can set DIR0 yourself to other paths, and HSI will use it automatically
when another working directory is not explicitly specified. For example,
set d0 = /greenroot/projectc then save fortx will cause the local file
`fortx' to be stored on HPSS in /greenroot/projectc/fortx
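The resolution rules above can be summarized in a short function. The following Python sketch is a model written from the rules as stated in this section, not HSI's actual implementation: `resolve` is a hypothetical name, working directories are passed in as a dictionary, and the behaviour for an empty working directory or for assignments (DIRn=-) is deliberately not modelled.

```python
import re
from typing import Dict


def resolve(path: str, dirs: Dict[int, str]) -> str:
    """Form a complete path from an input path and the working directories."""
    # A leading slash means the path is already complete.
    if path.startswith("/"):
        return path

    # An n' prefix selects DIRn; otherwise the primary directory DIR0 is used.
    prime = re.match(r"^([0-9])'(.*)$", path)
    if prime:
        base, rest = dirs.get(int(prime.group(1)), ""), prime.group(2)
    else:
        base, rest = dirs.get(0, ""), path

    # A lone minus sign (or nothing at all) means "the working directory itself".
    if rest in ("", "-"):
        return base

    # -k removes the rightmost k node names from the working directory.
    minus_k = re.match(r"^-([0-9]+)$", rest)
    if minus_k:
        nodes = base.strip("/").split("/")
        kept = nodes[: max(len(nodes) - int(minus_k.group(1)), 0)]
        return "/" + "/".join(kept)

    # Otherwise the node names are appended to the working directory.
    return base.rstrip("/") + "/" + rest


print(resolve("6'third", {6: "/first/second"}))    # /first/second/third
print(resolve("-2", {0: "/abcd/subd/subsubd"}))    # /abcd
print(resolve("3'-2", {3: "/abcd/subd/subsubd"}))  # /abcd
print(resolve("fortx", {0: "/greenroot/projectc"}))  # /greenroot/projectc/fortx
```

The inputs in the usage lines are the examples given in the rules themselves, so the printed values can be compared directly with the text.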
5.7 Assigning Values to Working Directories
When assigning values to working directories, "DIRn = path", the path on the right-hand side
can take any of the valid forms described in the preceding paragraphs (note
the exceptions for the minus sign and blank). The path contained in a working
directory can be partial or complete, but it cannot be of the form
localfile : path.
Following are some valid examples where:
dir0 = /redroot/redpgm/redexec
and
When the command DIR2 = 1'next_level is issued, working directory 2' becomes the value of working directory
1' plus the node `next_level'. Thus, DIR2 is:
When the command DIR3 = 2'-2 is given, working directory 3' is set to the value of working directory 2',
minus its last two nodes, or:
When the command DIR4 = -1 is issued, working directory 4' is set to the value of the default working directory
(0'), minus the last node, thus:
If D0 = - is issued,
the primary working directory is reset to the default value, your HPSS home
directory.
Working directories
simplify the use of HPSS trees. For example, if you are working with the
FLOWCODE subtree in the example from section 5.1 (Complex Tree), you might
first define the primary working directory by
d0 = /u/foo/flowcode/version1/run
You can access
nodes within that subdirectory using the following series of requests:
To traverse
the subtree, you enter the following request:
This form
temporarily sets DIR back one level to the `version1' subdirectory, then
appends `src', resulting in the following complete path for this request
only:
/u/foo/flowcode/version1/src
If more than one
subdirectory or path is being referenced at a time, you may want to use multiple
working directories in a request. For example, if you also need to work in
the "version2" subtree, you could define
d1 = /u/foo/flowcode/version2/run
Use the working
directory n' notation to specify the request
to copy file
/u/foo/flowcode/version1/run/t318
to form the new
file
/u/foo/flowcode/version2/run/t318a
The following table summarizes the rules for using the working directories
and also highlights the differences between DIR0 and the other working directories
(DIR1-DIR9).
| Description | DIR or D0 | DIR1 or D1* |
| Path is originally defaulted as | userid | Empty |
| Working directory is prefixed to node A by writing | A or 0'A | 1'A |
| Working directory is used as a complete path by writing | blank, -, or 0' | 1' |
| Can have the rightmost k node names removed by writing | -k or 0'-k | 1'-k |
| Can be set to its original (system) default by writing | DIR=- | DIR1=- |
| At which point, its value will be | HPSS home dir | Empty |
| *DIR2 through 9 have the same definition |
5.8 Defining Local Files to HSI
In requests such
as GET, SAVE, and STORE, both the name of the local file on your computer
and the name of the file on HPSS must be known by HSI. If these names are
the same, only the HPSS filename is required in the request because the last
node name in the HPSS path for these commands is considered to be the name
of the local file. However, if the names differ, both names must appear in
the request as follows:
Note that as of HSI version 2.4, the colon character must be separated
from both the local and HPSS filenames by whitespace. Whitespace was optional
in previous versions of HSI. However, this led to ambiguity and confusion
when trying to use filenames which contained colon characters (e.g., filenames
with timestamps as part of the filename), and was exacerbated by the multi-HPSS
logical drive letter syntax, which consists of a drive letter followed by
a colon, e.g.,
a:myfile
You might want
the names to differ
- if you
already have a local file with the same name as the HPSS file being retrieved,
- if you
want to give an HPSS file a more descriptive (longer) name than the local
file, or
- because
the local file name is longer than 255 characters (the HPSS maximum).
Using "-" as the
localfile name indicates that input for the file transfer will come from standard
input (STDIN) or that output from the file transfer will go to standard output
(STDOUT).
Note: Input/output redirection via STDIN/STDOUT
can be used to connect HSI to TAR or other archiving utilities, via
Unix pipes. In situations where entire directories are to be
stored or retrieved, this enables the aggregation of many small files
into one larger one that is more efficient to store and transfer. See
Chapter 9, Tricks, Tips, and Helpful Hints for more on such
matters.
To illustrate,
if a localfile has the name TEST, it can be saved in root directory `/projectj'
using the command shown in the following example:
Example: Defining Local Files to HSI
save test : /projectj/testdata851014x
Note that because
HPSS names may have up to 255 characters, more descriptive information was
added, including a date code, "851014," and a version designator, "X."
To retrieve
the file and restore the original localfile named `test', you could use
the command
get test : /projectj/testdata851014x
dir6 = /projectj/testdata851014x
As noted above,
the space on either side of the colon is required by HSI version 2.4 and
beyond (use the version
command to see which version of HSI you are using). Note: localfile
always appears to the left of the colon.
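The reason for requiring whitespace around the colon can be seen in a small parser. This Python sketch is an illustration only: `split_transfer` is a hypothetical helper, and it assumes that file names contain no spaces. Because only a free-standing ":" separates the two names, a colon inside a name (a timestamp or a drive letter) is left alone.

```python
from typing import Tuple


def split_transfer(spec: str) -> Tuple[str, str]:
    """Split 'localfile : hpssfile' into (localfile, hpssfile).

    With a single name, the last node name of the HPSS path doubles
    as the local file name.
    """
    tokens = spec.split()
    if len(tokens) == 3 and tokens[1] == ":":
        return tokens[0], tokens[2]
    if len(tokens) == 1:
        hpss_path = tokens[0]
        return hpss_path.rsplit("/", 1)[-1], hpss_path
    raise ValueError(f"cannot interpret transfer specification: {spec!r}")


print(split_transfer("test : /projectj/testdata851014x"))
# ('test', '/projectj/testdata851014x')
print(split_transfer("/projectj/report"))
# ('report', '/projectj/report')
print(split_transfer("a:myfile"))
# ('a:myfile', 'a:myfile')
```

A local name of "-" passes through unchanged in this model; interpreting it as standard input or standard output is left to the caller.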
Wild cards provide
a "shorthand" path notation for selecting one or several file or directory
names in HSI requests. They ease the job of fully specifying particular files
or groups of files. The presence of the special wild card characters in a
path causes node names to be selected that "match" the given pattern. HSI
expands a "wild path" (a path containing wild card characters) into a set
of paths, then applies your command to each path.
For example, it is possible to GET
all the files beginning with `abc' and ending with 'xyz' by specifying the
name as abc*xyz.
The special wild card characters are *, ?, [ ], {}, and ^. They have
the following meanings:
| Character | Function |
| * | will match any string of any length, including an empty (null) string. |
| ? | will match any single character, excluding the null character. |
| [ ] | enclosing a sequence of characters in brackets matches the set of single characters enclosed. Enclosing two characters separated by a minus matches any character within the range of the two. |
| { } | Encloses lists of comma-separated patterns. These may be nested. |
| ^ | functions as a negative or "not" operator. |
To illustrate
the use of wild cards by examples, a directory containing the following
files will be used:
| a | ab | AB | ABC |
| f | foobar1 | foobar2 | foobar3 | foobar4 |
Note: because
there are so many examples in this section, the "example: " prefix will
not be used.
5.9.1 * (Asterisk) Character
The wild card
character * matches any string of any length, including zero-length strings.
If you want to list just your
foobar files, the request
will return, in
alphabetical order, the listings for
foobar1, foobar2,
foobar3, and foobar4
5.9.2 ? (Question Mark) Character
The wild card
character ? matches only single, non-null characters.
The request
returns a listing for the file
ab but the request
returns listings
for all names beginning with
a, regardless of the length:
Note that case is significant, and these requests do
NOT return listings for files AB or ABC.
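The behaviour of * and ? against the sample directory can be tried out with Python's standard `fnmatch.fnmatchcase`, which treats these two characters the same way as described here and is case-sensitive. This is a stand-in for experimentation, not HSI's matcher; the patterns `foobar*`, `a?`, and `a*` are assumed examples chosen to fit the surrounding text.

```python
from fnmatch import fnmatchcase
from typing import List

# The sample directory used throughout this section.
FILES = ["a", "ab", "AB", "ABC", "f", "foobar1", "foobar2", "foobar3", "foobar4"]


def select(pattern: str, names: List[str] = FILES) -> List[str]:
    """Return the names that match the pattern, in directory order."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(select("foobar*"))  # ['foobar1', 'foobar2', 'foobar3', 'foobar4']
print(select("a?"))       # ['ab']
print(select("a*"))       # ['a', 'ab']
print(select("A*"))       # ['AB', 'ABC']
```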
5.9.3 [ ] (Square Brackets) Characters
The wild card
characters [ ] are used to delimit a range of characters or a set of single
characters that are to be matched.
To match any
character from `X' through `Z' in a certain position in a path, use
The expression
will match 1 or
3 or 5.
Note that brackets
cannot be nested as in
but you may use
multiple sets of sequential brackets as in
The request
returns listings
for
foobar2, foobar3, and foobar4
but the request
returns listings
for only
A range that is
not valid defaults to a set of single characters that may be matched. The
invalid range
maps to the characters
-, 2, and 4.
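The range and set rules, including the fallback for an invalid range, can be expressed as a function that returns the set of characters a bracket expression matches. The sketch below is a model of the rules in this section: `bracket_set` is a hypothetical helper that takes the text between the brackets, and it assumes ranges are ordered by character code, which may not agree with the collating order HSI uses on a given system.

```python
from typing import Set


def bracket_set(spec: str) -> Set[str]:
    """Characters matched by the text found between '[' and ']'."""
    chars: Set[str] = set()
    i = 0
    while i < len(spec):
        is_range = i + 2 < len(spec) and spec[i + 1] == "-"
        if is_range and spec[i] <= spec[i + 2]:
            # Valid range: every character from the low end to the high end.
            chars.update(chr(c) for c in range(ord(spec[i]), ord(spec[i + 2]) + 1))
            i += 3
        else:
            # Single character, or part of an invalid range taken literally.
            chars.add(spec[i])
            i += 1
    return chars


print(sorted(bracket_set("X-Z")))  # ['X', 'Y', 'Z']
print(sorted(bracket_set("135")))  # ['1', '3', '5']
print(sorted(bracket_set("4-2")))  # ['-', '2', '4']
```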
5.9.4 { } (Curly Brace) Characters
Curly braces are
not actually wild card characters; instead, they are used to contain lists
of patterns, separated by commas. They may also be nested, as long as there
is a matching "}" for each opening "{".
The pattern
would match
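Because braces hold lists of alternatives rather than matching characters themselves, one way to think about them is as an expansion step that runs before any matching. The following Python sketch expands nested brace lists into plain patterns; `expand_braces` is a hypothetical helper and the patterns in the usage lines are invented for illustration.

```python
from typing import List


def expand_braces(pattern: str) -> List[str]:
    """Expand comma-separated brace lists, including nested ones."""
    start = pattern.find("{")
    if start == -1:
        return [pattern]

    # Find the "}" that closes this "{", keeping track of nesting depth.
    depth = 0
    for end in range(start, len(pattern)):
        if pattern[end] == "{":
            depth += 1
        elif pattern[end] == "}":
            depth -= 1
            if depth == 0:
                break
    else:
        raise ValueError("no matching '}' for an opening '{'")

    # Split the brace body on commas that are not inside a nested brace.
    parts, current, depth = [], "", 0
    for ch in pattern[start + 1:end]:
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            depth += ch == "{"
            depth -= ch == "}"
            current += ch
    parts.append(current)

    # Substitute each alternative and expand whatever braces remain.
    results: List[str] = []
    for part in parts:
        results.extend(expand_braces(pattern[:start] + part + pattern[end + 1:]))
    return results


print(expand_braces("foobar{1,3}"))   # ['foobar1', 'foobar3']
print(expand_braces("a{b,c{d,e}}f"))  # ['abf', 'acdf', 'acef']
```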
5.9.5 ^ (Caret) Character
The wild card
character ^ provides a complement or "not" operation. It can be used only
in the first position in a wild path or immediately following a left bracket.
The request
returns listings
for everything except `f':
a, ab, AB, ABC, foobar1, foobar2, foobar3, and foobar4
The request
returns listings
for all the foobars except `foobar2':
foobar1, foobar3,
and foobar4
The request
returns listings
for everything that wasn't returned by the request list `foobar[^2]':
a, ab, AB, ABC, f, and foobar2
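The two positions in which the caret is allowed can be modelled on top of Python's `fnmatch.fnmatchcase`. This is a sketch for experimentation, not HSI's matcher: `select` is a hypothetical helper, and the patterns `^f`, `foobar[^2]`, and `^foobar[^2]` are assumed from the descriptions above. Note that `fnmatch` writes a bracket complement as `[!...]`, so `[^` is rewritten before matching.

```python
from fnmatch import fnmatchcase
from typing import List

# The sample directory used throughout this section.
FILES = ["a", "ab", "AB", "ABC", "f", "foobar1", "foobar2", "foobar3", "foobar4"]


def select(pattern: str, names: List[str] = FILES) -> List[str]:
    """Match names against a pattern that may use the caret 'not' operator."""
    # A caret in the first position complements the whole pattern.
    negate = pattern.startswith("^")
    if negate:
        pattern = pattern[1:]
    # A caret right after a left bracket complements that character set.
    pattern = pattern.replace("[^", "[!")
    return [name for name in names if fnmatchcase(name, pattern) != negate]


print(select("^f"))
# ['a', 'ab', 'AB', 'ABC', 'foobar1', 'foobar2', 'foobar3', 'foobar4']
print(select("foobar[^2]"))
# ['foobar1', 'foobar3', 'foobar4']
print(select("^foobar[^2]"))
# ['a', 'ab', 'AB', 'ABC', 'f', 'foobar2']
```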
5.9.6 Commands That Work With Wild Cards
Wild cards can
be used anywhere that a local or HPSS pathname is referenced. In some cases,
the pathname must resolve to a single node name (e.g., set dir=~/myp*). In
these cases, using wild cards may be thought of as simply a shorthand notation
for the nodename.
There are special
considerations when using wild cards with GET. Use of the localfile option
in a GET request is restricted for wild cards.
The request
(note: spaces surrounding the ":" are required as of HSI version 2.4)
will work if the
evaluation of the wild card expression results in only one matching path.
If multiple paths match, the request is aborted because the files would all
be written to the same local file ABC.
GET requests
for files with HPSS names longer than what is permitted on some UNIX-type
systems result in the names being truncated without an error message.
Note: Such name truncation can cause files to be
overwritten if truncated names collide with local names in a "get"
request.
See Operating
System Considerations for One-Liners in Section 7 for related information.
5.9.7 Multilevel Wild Cards
Wild cards may
be used in any and all levels in a wild path, although the previous examples
showed wild cards in only one node of the wild path.
Following is
a legal request
list /usr/local/bin/*ABC/??def*/xyz[234]hi*
Wild card expressions
cannot span nodes in the path. Each directory level must be represented. The
request
cannot be shortened
to
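The rule that a wild card expression cannot span nodes can be modelled by matching one directory level at a time. The Python sketch below is illustrative only: `path_matches` is a hypothetical helper, the candidate path is invented, and `fnmatch.fnmatchcase` stands in for HSI's matcher at each level.

```python
from fnmatch import fnmatchcase


def path_matches(path: str, wild_path: str) -> bool:
    """True if every level of path matches the same level of wild_path."""
    path_nodes = path.strip("/").split("/")
    wild_nodes = wild_path.strip("/").split("/")
    # Every directory level must be represented, so the depths must agree.
    if len(path_nodes) != len(wild_nodes):
        return False
    return all(fnmatchcase(node, wild) for node, wild in zip(path_nodes, wild_nodes))


candidate = "/usr/local/bin/xABC/aadef1/xyz2hi"
print(path_matches(candidate, "/usr/local/bin/*ABC/??def*/xyz[234]hi*"))  # True
print(path_matches(candidate, "/usr/*/xyz2hi"))                           # False
```

Splitting on "/" first is the design point: applied to the whole string, `fnmatch`'s `*` would happily match across slashes, which is exactly what the rule forbids.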
5.9.8 Wild Cards in Working Directories
The DIRn keyword
can be set to a wild path, but it must resolve to a single HPSS nodename.
In this context, use of wild cards may be thought of as a shorthand notation
(perhaps to save typing) for specifying the node name.
The maximum length
of a path, including wild card characters, is 1023 characters.
A wild path used in a one-liner on a UNIX-based system should be enclosed
in single or double quotes or the wild card characters should be preceded
by the escape character (\). See Operating System Considerations for One-liners
in Section 7.
The alphabetical
or collating order used for the range tests and for the node output sequence
when using wild cards is
. + $ * - % _ # a-z 0-9
Note: The ordering of characters differs among operating systems,
and does not necessarily follow the ASCII collating sequence. In general, HSI will try to
resolve the meanings of "special" characters, before it assembles paths. This means that
- wild card characters, such as "*" and "?"
- characters that have reserved meanings within HSI, such as " ' " (single-quote)
and "-" (hyphen)
- bracket pairs "[ ]" and "{ }" (often denoting character sets and intervals)
will be handled before the alphanumeric and other legal path characters. Therefore,
the results of commands containing wild card characters may differ across operating
systems.
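To see how the collating sequence listed above differs from plain ASCII ordering, it can be turned into a sort key. The Python sketch below is a model only: `collating_key` is a hypothetical helper, and placing characters that are not in the listed sequence (uppercase letters, for example) after all listed characters is an assumption, since the text does not say where they fall.

```python
import string
from typing import List

# ". + $ * - % _ # a-z 0-9", as listed above.
COLLATING_ORDER = ".+$*-%_#" + string.ascii_lowercase + string.digits


def collating_key(name: str) -> List[int]:
    """Sort key that ranks each character by its position in the sequence."""
    return [
        COLLATING_ORDER.index(ch) if ch in COLLATING_ORDER
        else len(COLLATING_ORDER) + ord(ch)  # unlisted characters sort last (assumption)
        for ch in name
    ]


names = ["foobar2", "a", "_x", "1a", ".hid"]
print(sorted(names, key=collating_key))
# ['.hid', '_x', 'a', 'foobar2', '1a']
print(sorted(names))
# ['.hid', '1a', '_x', 'a', 'foobar2']
```

The second line shows Python's default code-point ordering for comparison: digits come before letters there, but after them in the sequence given in this section.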
The path name will not be output in an error message when you are trying to use HPSS
directories and files to which you have not been given access. If you enter the request
it will return only the nodes to which you have access. You will see nothing for the
nodes you cannot access. If you do not have access to any nodes, you will see only
the prompt when the request is finished.
Use of the
RM
command can be dangerously destructive; wild cards can make it even more so. You may wish
to use the
MDELETE
command instead, which prompts for approval of each deletion prior to issuing
the HPSS command to remove the file.
Noticeable delays
may occur while HSI searches for the paths that match your wild card expression,
depending upon how precisely the path is specified and when the nodes that
satisfy the expression are encountered in the search.
For example,
the request
might have to
read most of the nodes in your directory. However, the request
would read fewer nodes because the path is more precisely specified. In addition,
delays may occur during wild card operations when an internal buffer needs
to be refilled with more paths that match your wild card expression, or
when memory needs to be expanded.
Note: HSI is more efficient at "meta-data"
operations - those which change only the state of files and directories
- than other utilities, such as FTP. This is because FTP must, for
instance read, rewrite, and delete a file in order to relocate it
within a directory hierarchy; HSI, on the other hand, simply modifies
certain database entries, without actually needing to access the
affected file. Other meta-data operations involving file and directory
ownerships and access permissions are also efficiently done in HSI.