National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory

NERSC Global Filesystem

1. Overview

The NERSC Global Filesystem (NGF) is a large, shared filesystem that can be accessed from any of the major compute platforms. This facilitates file sharing between platforms, as well as file sharing among NERSC users working on a common project. NGF is based on IBM's General Parallel File System (GPFS). It currently contains over 450 TB of user-accessible storage.

NGF provides home directories on certain NERSC systems and project directories accessible from all NERSC computational systems. It also provides common storage for NERSC-provided software on some systems.

2. Global Home Directories

Global home directories (or "global homes") provide a convenient means for a user to have access to source files, input files, configuration files, etc., regardless of the platform the user is logged in to. Wherever possible, you should refer to your home directory using the environment variable $HOME. The absolute path to your home directory (e.g., /u4/elvis/) may change, but the value of $HOME will always be correct. For security reasons, you should never allow "world write" access to your $HOME directory or your $HOME/.ssh directory. NERSC scans for such security weaknesses and, if any are detected, will change the permissions on your directories.
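
The permission rule above can be checked by hand. The following is a minimal sketch in Bourne-shell syntax, not a NERSC-provided tool; the function name is_world_writable is hypothetical. It only reports a problem and leaves the permissions unchanged.

```shell
# Return success (0) if the given path is writable by "other" (world write).
# Uses only POSIX find; -prune stops find from descending into the directory.
is_world_writable() {
    [ -n "$(find "$1" -prune -perm -0002 -print 2>/dev/null)" ]
}

# Report (but do not change) world-write permission on the two
# directories named in the text above.
for dir in "$HOME" "$HOME/.ssh"; do
    if [ -d "$dir" ] && is_world_writable "$dir"; then
        echo "WARNING: $dir is world-writable; fix with: chmod o-w $dir"
    fi
done
```

If a warning is printed, running chmod o-w on the named directory removes the world-write bit.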

2.1 Platforms Utilizing Global Homes

Global homes are available on Franklin, Hopper, Carver, DaVinci, and on the data transfer nodes. In addition, several internal (staff-only) development and test systems utilize global homes. It is expected that all future NERSC systems will use global homes.

2.2 Quotas and Performance

Default global home quotas are 40 GB and 500,000 inodes. If you need more than that, fill out the Disk Quota Change Request Form.

Note: the myquota command is currently unable to report quota and usage information about global homes. NERSC will provide a replacement tool soon.
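
Until a replacement tool is available, usage can be roughly estimated with standard Unix utilities. The sketch below is an approximation only, not the forthcoming NERSC tool. The function names are hypothetical, and treating 40 GB as 40 * 1024 * 1024 KB is an assumption about how the quota is counted. Walking a large directory tree can take a while, so run it sparingly.

```shell
# Count filesystem entries (files, directories, links) under a directory.
# This approximates inode usage; a hard-linked file is counted once per name.
count_entries() {
    find "$1" | wc -l | tr -d ' '
}

# Report disk usage of a directory tree in kilobytes.
usage_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Default limits from the text; the KB conversion is an assumption.
QUOTA_KB=$((40 * 1024 * 1024))
QUOTA_INODES=500000

# Print estimated usage next to the default quotas.
report_usage() {
    echo "space:  $(usage_kb "$1") KB of $QUOTA_KB KB"
    echo "inodes: $(count_entries "$1") of $QUOTA_INODES"
}
```

For example, report_usage "$HOME" prints two lines comparing the estimates with the default limits.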

Performance of global homes is optimized for small files. This is suitable for compiling and linking executables, for example. Home directories are not intended for large, streaming I/O. User applications that depend on high bandwidth for streaming large files should run in /scratch or /project.

2.3 Backups

Files in global homes are not subject to purging. However, they are also not backed up, except for disaster recovery of the entire filesystem. That is, individual files and/or directories cannot be restored from the disaster recovery backups.

IMPORTANT: It is the responsibility of all NERSC users to back up their files to HPSS or some other archival resource.
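
As one possible way to follow this advice, the sketch below builds (but does not run) a command for archiving a directory to HPSS. It assumes the HPSS client htar is available on the system you are logged in to; this page does not describe the HPSS tools, so check the HPSS documentation before relying on it. The function name and the archive naming scheme are hypothetical.

```shell
# Print (do not run) an htar command that would archive a directory to HPSS.
# $1 = directory to archive, $2 = date stamp used in the archive name.
# The archive name "<directory>_<stamp>.tar" is an illustrative convention only.
print_backup_command() (
    name=$(basename "$1")
    echo "htar -cvf ${name}_$2.tar $1"
)
```

For example, print_backup_command "$HOME/astro_application" "$(date +%Y%m%d)" prints a command line that can be reviewed and then run by hand.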

2.4 Dot-files

Global home directories are pre-populated with startup files (dot-files) for all supported shells. The "standard" dot-files are symbolic links to read-only files that NERSC controls. For each standard dot-file, there is a user-writable ".ext" file. For example, C-shell users are generally concerned with the files .login and .cshrc, which are read-only at NERSC. These users should put their customizations in .login.ext and .cshrc.ext.

Users may have certain customizations that are appropriate for one NERSC platform, but not for others. The .ext files all have examples of how to do this, by testing the value of the environment variable $NERSC_HOST. For example, on DaVinci the Intel Fortran compiler is called ifort. A C-shell user might include the following in their .cshrc.ext file:

if ($NERSC_HOST == "davinci") then
  setenv FC ifort
endif
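
Users of Bourne-style shells can make the same test. The sketch below assumes that the corresponding customization file is named .bashrc.ext, following the ".ext" rule described above; the function name is hypothetical. Only the DaVinci case from the text is handled, and on any other host FC is left as it was.

```shell
# Set the Fortran compiler variable according to the NERSC host name.
# $1 = host name, normally the value of $NERSC_HOST.
set_compiler_for_host() {
    case "$1" in
        davinci)
            FC=ifort
            export FC
            ;;
        *)
            # Other hosts: leave FC untouched.
            ;;
    esac
}

# Apply the setting for the current host (empty if NERSC_HOST is unset).
set_compiler_for_host "${NERSC_HOST:-}"
```

Wrapping the test in a function keeps the per-host logic in one place as more platforms are added.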

Occasionally, a user will accidentally delete the symbolic links to the standard dot-files, or otherwise damage the dot-files to the point that it becomes difficult to do anything. In this case, the user should run the command fixdots. This command will recreate the original dot-file configuration, after first saving the current configuration in the directory $HOME/KeepDots.timestamp, where timestamp is a string that includes the current date and time. After running fixdots, the user should carefully incorporate the saved customizations into the newly created .ext files.
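
To see which saved customizations still need to be merged, the saved files can be compared with the new ones. The sketch below assumes that the KeepDots directory holds copies of the old .ext files under their original names, which this page does not state; the function name is hypothetical.

```shell
# List the .ext files in a saved directory that differ from, or are missing in,
# the current home directory.
# $1 = saved directory (e.g. a KeepDots.timestamp directory), $2 = home directory.
list_changed_ext_files() (
    for saved in "$1"/.*.ext; do
        [ -f "$saved" ] || continue      # skip if the glob matched nothing
        name=$(basename "$saved")
        if ! cmp -s "$saved" "$2/$name" 2>/dev/null; then
            echo "$name"
        fi
    done
)
```

Each name printed is a file worth inspecting, for example with diff, before copying customizations across.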

2.5 Usage Across Multiple Platforms

Many users maintain application codes in their home directories. This usually consists of a set of source files, configuration files and makefiles or scripts, object files and libraries, and executable files. In addition, there might be sample input and output files for testing.

Global homes provide a mechanism where users can maintain a single copy of files that are machine independent (source files, input files, etc.). Users should arrange that machine-dependent files (object files, executables, etc.) are placed in separate directories. One obvious way is to create subdirectories in the global home directory, each named after a particular NERSC system:

> ls $HOME
davinci/
franklin/
hopper/

Another way is to create a directory per application, with a subdirectory for each system:

> ls $HOME/astro_application/
franklin_build/
hopper_build/
davinci_build/
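
The second layout can be selected automatically from the host name. The following is a minimal sketch; the function name build_dir_for is hypothetical, and the "_build" suffix simply follows the listing above.

```shell
# Print the per-system build directory for an application.
# $1 = application directory name, $2 = system name (normally $NERSC_HOST).
build_dir_for() {
    echo "$HOME/$1/${2}_build"
}
```

A makefile or build script could then run mkdir -p "$(build_dir_for astro_application "$NERSC_HOST")" and place object files and executables there, keeping the shared source tree free of machine-dependent files.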

3. Project Directories

Access control for project directories is based on Unix file groups. In most situations, the name of the project directory is the same as the associated file group.

3.1 Quotas and Performance

Default project directory quotas are 1 TB and 500,000 inodes. If your directory needs more than that, fill out the Disk Quota Change Request Form.

To check your current usage and quota in a project directory, use the prjquota command, specifying the name of the project directory. For example, if you have access to a project directory named "bigsci":

% prjquota bigsci
           ------ Space (GB) -------     ----------- Inode -----------
 Project    Usage    Quota   InDoubt      Usage      Quota     InDoubt  
--------   -------  -------  -------     -------    -------    -------  
  bigsci      1455     3072        0      307423     500000         20

In the above example, the project directory "bigsci" has used about 1.5 TB of its 3 TB block quota, and about 307000 inodes out of its 500000 inode quota.
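
The same figures can be turned into percentages of quota with a short filter. This sketch assumes that the output keeps the column layout shown above; the function name quota_percent is hypothetical.

```shell
# Read prjquota-style output on stdin and print the percent of quota used.
# Data rows are recognized by a numeric second column, which skips the headers.
quota_percent() {
    awk '$2 ~ /^[0-9]+$/ && $3 > 0 && $6 > 0 {
        printf "%s space=%.1f%% inodes=%.1f%%\n", $1, 100 * $2 / $3, 100 * $5 / $6
    }'
}
```

Running prjquota bigsci | quota_percent on the sample output above would print "bigsci space=47.4% inodes=61.5%".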

The system has a sustainable bandwidth of 1 GB/sec for streaming I/O, although actual performance for user applications will depend on a variety of factors. Because NGF is a distributed network filesystem, performance will be slightly less than that of filesystems that are local to NERSC compute platforms. This should only be an issue for applications whose performance is I/O bound.

3.2 Policies

There must be a project directory administrator associated with each project directory. This user must have the NIM role of PI, PI Proxy, or Project Manager.

Unlike the $SCRATCH filesystems on the compute platforms, files in project directories are not subject to purging. However, similar to the $HOME filesystems, they are also not backed up (except for disaster recovery of the entire filesystem).

It is the responsibility of the users to back up their files to HPSS or some other archival resource.

3.3 Requesting Space

NERSC allocates project directories to groups of users that need to share files among themselves and/or between machines. These users can all be members of a single repository, or collaborating members of different repositories. To request a project directory, please use the Project Directory Request Form.

3.4 Usage

Project directories are created in /project/projectdirs. The name of the project directory will usually also be the name of the associated Unix file group. This name will sometimes be the same as a NERSC repository, and all active users of that repository (which is already a Unix file group) will thereby have access to the project directory.

However, there are cases where a repository name is not suitable for a project directory. For example, some large projects might want a project directory to be accessible by members of multiple repositories. Also, some long-term projects outlive the specific repositories that constitute them. In these cases, a project directory administrator may request the creation of a new project name. This will result in the creation of a new Unix file group consisting solely of the project directory administrator, followed by the creation of the project directory itself. The project directory administrator must then use NIM to add users to the newly created file group (this is a very simple operation). Only these users will be able to access the project directory.
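
Because access depends on Unix group membership, a user can check whether a given project directory should be accessible before trying to use it. The following is a minimal sketch; the function names are hypothetical, and it assumes the usual case in which the file group has the same name as the project directory.

```shell
# Print the path of a project directory, per the layout described above.
project_dir() {
    echo "/project/projectdirs/$1"
}

# Return success if the current user belongs to the named Unix group.
in_group() (
    for g in $(id -Gn); do
        [ "$g" = "$1" ] && exit 0
    done
    exit 1
)
```

For the "bigsci" example, in_group bigsci && ls "$(project_dir bigsci)" lists the directory only if you are a member of the file group.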

3.5 Historical Usage

[Figures not reproduced: "Project: Space Usage" and "Project: Percent Space Usage"]

4. Other Global File Systems

Some NERSC systems use portions of NGF to provide space for /usr/common, where NERSC staff install software for users. NGF also provides /scratch to some systems. In both of these cases, the mounted file systems are unique to the computational system. That is, even though they are provided by a global file system, they are essentially private to a particular computational platform.


Page last modified: Tue, 20 Apr 2010 17:13:37 GMT
Page URL: http://www.nersc.gov/nusers/resources/NGF/
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov