
NERSC Data Storage Resources

Overview

NERSC file systems can be divided into two categories: local and global. Local file systems are only accessible on a single platform and provide the best performance; global file systems are accessible on multiple platforms, simplifying data sharing between platforms. 

File systems are configured for different purposes. On each machine you have access to several different file systems:

  • Home: Permanent, relatively small storage for data like source code, shell scripts, etc. that you want to keep. This file system is not tuned for high performance for parallel jobs. Referenced by the environment variable $HOME.
  • Scratch: Large, purgeable, high-performance file system. Place your large data files in this file system for capacity and capability computing. Data is purged as described below, so you must save important files elsewhere (like HPSS). Referenced by the environment variable $SCRATCH (see the example after this list).
  • Project: Large, permanent, medium-performance file system. Project directories are intended for sharing data within a group of researchers. 
  • Burst Buffer: Temporary, flexible, high-performance SSD file system that sits within the High Speed Network (HSN) on Cori. Accessible only from compute nodes, the Burst Buffer provides per-job (or short-term) storage for I/O intensive codes. 
  • Archive: A high-capacity tape archive intended for long-term storage of important data that is no longer in active use. Accessible from all NERSC systems; the space available to a group is controlled by its SRU allocation.
  • Global Common: A performant file system for installing software stacks and compiling code.
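For example, a batch job can reference these file systems through their environment variables rather than hard-coded paths. The sketch below assumes a hypothetical executable (my_app.x) and input file; only $HOME and $SCRATCH come from the descriptions above.

#!/bin/bash
# Stage input from permanent home storage into the high-performance scratch file system
cp $HOME/inputs/config.in $SCRATCH/
cd $SCRATCH

# Run from scratch so that large output lands on the purgeable, high-performance file system
srun ./my_app.x config.in > run.out

# Copy results you want to keep back to a permanent location (home, project, or HPSS)
cp run.out $HOME/results/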

Summary of File System Policies

The following summarizes the default storage capacity available automatically to every NERSC user. If you have data needs beyond these defaults, you can request a quota increase. For extreme data needs, you may want to consider purchasing storage through our sponsored storage program. If you have any questions, please contact the NERSC help desk.

  • Global homes: path $HOME; type GPFS; peak performance: not intended for I/O from jobs; default quota: 40 GB and 1,000,000 inodes; backups: yes; purge policy: not purged.
  • Global project: path /project/projectdirs/projectname; type GPFS; peak performance: 130 GB/s; default quota: 1 TB and 1,000,000 inodes; backups: yes if quota ≤ 5 TB, no if quota > 5 TB; purge policy: not purged.
  • Global common: path /global/common/software/projectname; type GPFS; peak performance: not specified; default quota: 10 GB and 1,000,000 inodes; backups: no; purge policy: not purged.
  • Edison local scratch: path $SCRATCH; type Lustre; peak performance: 168 GB/s (across 3 file systems); default quota: 10 TB and 5,000,000 inodes; backups: no; purge policy: files not accessed for 8 weeks are deleted.
  • Cori local scratch: path $SCRATCH ($CSCRATCH from other systems); type Lustre; peak performance: 700 GB/s; default quota: 20 TB and 10,000,000 inodes; backups: no; purge policy: files not accessed for 12 weeks are deleted.
  • Cori Burst Buffer: paths $DW_JOB_STRIPED and $DW_PERSISTENT_STRIPED_XXX; type DataWarp; peak performance: 1.7 TB/s and 28M IOPS; default quota: none; backups: no; purge policy: data is deleted at the end of every job, or at the end of the lifetime of the persistent reservation.
  • Archive (HPSS): typically accessed via hsi or htar from within NERSC (see the example after this list); type HPSS; peak performance: 1 GB/s to the disk cache; default quota: allocation dependent; backups: no; purge policy: not purged.
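As a sketch of typical Archive access (the file and directory names here are hypothetical), hsi moves individual files to and from HPSS, while htar bundles a directory of many small files into a single large archive file, which is the access pattern HPSS handles best:

# Store a single file in HPSS, then retrieve it later
hsi put results.tar.gz
hsi get results.tar.gz

# Bundle a directory of many small files into one archive file in HPSS
htar -cvf run42_output.tar run42_output/

# List the bundle's contents and extract it later
htar -tvf run42_output.tar
htar -xvf run42_output.tar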

File Systems' Intended Use

  • Global homes: Holds static executables, configuration files, etc. NOT meant to hold the output from your application runs; the scratch or project file systems should be used for computational output. Optimized for small to medium sized files.
  • Global project: Sharing data within a team or across computational platforms, and storing application output files. Intended for actively used data. Optimized for high-bandwidth, large-block-size access to large files.
  • Global common: Sharing a software stack across a team or across computational platforms. Executables with shared libraries (e.g. Python) will perform best in these directories. Mounted read-only on the compute nodes for optimum performance. Optimized for high-bandwidth, small-block-size access to small files.
  • Scratch file systems: Edison and Cori each have a large, local, parallel scratch file system dedicated to that system. The scratch file systems are intended for temporary uses such as storage of checkpoints or application input and output. If you need to retain files longer than the purge period, copy them to global project or to HPSS. Optimized for high-bandwidth, large-block-size access to large files.
  • Burst Buffer: Cori's Burst Buffer provides very high performance I/O on a per-job or short-term basis. It is particularly useful for codes that are I/O-bound, for example codes that produce large checkpoint files or that have small or random I/O reads and writes (see the sketch after this list). Optimized for high-bandwidth access for all file sizes and access patterns.
  • Archive: Long-term archival of important and unique data, such as data from published results. Optimized for files, or bundles of files, in the 100s of GB range.
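For example, a per-job Burst Buffer allocation on Cori is requested with a DataWarp directive in the batch script, and the job then reads and writes under $DW_JOB_STRIPED. This is a minimal sketch; the capacity, executable, and file names are illustrative:

#!/bin/bash
#SBATCH -N 1
#SBATCH -t 00:30:00
#DW jobdw capacity=200GB access_mode=striped type=scratch

# Stage input into the per-job Burst Buffer allocation
cp $SCRATCH/input.dat $DW_JOB_STRIPED/

# Run with output and checkpoints written to the Burst Buffer
srun ./my_app.x $DW_JOB_STRIPED/input.dat

# Copy anything worth keeping back out before the job ends;
# the per-job allocation is deleted when the job completes
cp $DW_JOB_STRIPED/checkpoint_final.dat $SCRATCH/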

The following table shows the availability of the various file systems on each of the primary NERSC platforms.

File System       Edison  Cori  Genepool  Data Transfer Nodes  PDSF
Global homes      Y       Y     Y         Y                    Y
Global project    Y       Y     Y         Y                    Y
Global common     Y       Y     Y         Y                    Y
Global projectb   Y       Y     Y         Y
Local scratch     Y       Y     Y
Burst Buffer              Y
Archive           Y       Y     Y         Y                    Y

Finding your disk usage

NERSC provides the myquota command, which shows your current disk space and inode usage for all file systems.

user@edison02: myquota 

FILESYSTEM   SPACE_USED   SPACE_QUOTA   SPACE_PCT   INODE_USED   INODE_QUOTA   INODE_PCT
escratch     3.15GiB      0.00          N/A         42.51K       20.00M        0.2%
escratch3    4.00KiB      100.00TiB     0.0%        0.00         10.00M        0.0%
cscratch1    29.61TiB     20.00TiB      148.1%      416.28K      10.00M        4.2%
home         25.25GiB     40.00GiB      63.1%       66.93K       1.00M         6.7%
bscratch     0.00         20.00TiB      0.0%        1.00         4.00M         0.0

 

The prjquota command gives you the usage for a project directory.

user@edison02: prjquota mpccc
                  ---------- Space (GB) ---------     ------------- Inode --------------
Project             Usage      Quota    Percent          Usage       Quota     Percent
---------------   ---------  ---------  ---------     ----------  ----------  ----------
mpccc                 21490      30720         70       14000370    20000000          70

The cmnquota command gives you the usage for a global common directory.

user@edison02: cmnquota mpccc
                  ---------- Space (GB) ---------     ------------- Inode --------------
Project             Usage      Quota    Percent          Usage       Quota     Percent
---------------   ---------  ---------  ---------     ----------  ----------  ----------
mpccc                     0         10          0            182     1000000           0

Your Archive usage can be viewed by going to your project's page in NIM (nim.nersc.gov).

Purge Policy

In order to provide equitable access to scarce file system resources, NERSC "purges" certain scratch file systems on a regular basis, as indicated in the above table. Scratch file systems are intended to provide temporary storage on high performance hardware. Data that is not in active use should be placed in HPSS.

Users are not allowed to run commands (e.g., "touch" scripts) whose sole purpose is to update the time-of-last-access on purgeable files.
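Rather than artificially refreshing access times, you can check which of your scratch files are approaching the purge window and archive them. A minimal sketch, using the 8-week (56-day) window from the policy summary above:

# List files under $SCRATCH that have not been accessed in the last 56 days
find $SCRATCH -type f -atime +56

Files that turn up here should be copied to global project or archived to HPSS before the purge removes them.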

NERSC monitors I/O rates and I/O performance on the local Lustre file systems. Users can view per-job Lustre I/O statistics from the Completed Jobs page (click any job ID, then click Lustre I/O).

 

Global Home Filesystem

Global home directories (or "global homes") provide a convenient means for a user to have access to source files, input files, configuration files, etc., regardless of the platform the user is logged in to. Wherever possible, you should refer to your home directory using the environment variable $HOME. Read More »

Global Common File System

The global common file system is a global file system available to all NERSC computational systems. It provides a performant place for groups of NERSC users to install and share software stacks; executables with shared libraries perform best here. Read More »
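For example, a group might install a shared tool into its global common directory so that compute nodes read it from the software-optimized, read-only mount. The repository name (myrepo) and build steps below are purely illustrative:

# Build and install into the group's global common software directory
./configure --prefix=/global/common/software/myrepo/mytool
make && make install

# Group members then pick it up from their environment
export PATH=/global/common/software/myrepo/mytool/bin:$PATH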

Project File System

The project file system is a global file system available to all NERSC computational systems. It allows groups of NERSC users to store and share data. A directory in the project file system is available by default for each repository. Read More »
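For example, to share a set of result files with everyone in your project's Unix group, you might place them in a subdirectory of the project directory and open group read access. The repository name (myrepo) below is hypothetical:

# Create a shared area inside the project directory
mkdir /project/projectdirs/myrepo/shared_data
cp results_*.dat /project/projectdirs/myrepo/shared_data/

# Allow the project's Unix group to read (but not modify) the shared files
chgrp -R myrepo /project/projectdirs/myrepo/shared_data
chmod -R g+rX /project/projectdirs/myrepo/shared_data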

Disk Quota Increase Request

Use this form to request quota increases. You need to log in using your NIM password. Read More »

Project Directory Request Form

PIs and PI proxies, please submit your project directory request in NIM. Read More »

Sponsored Storage

To accommodate projects that need storage in the NERSC Global Filesystem (NGF) beyond what NERSC can provide in its base allocations, Principal Investigators (PIs) can request to purchase a sponsored storage allocation.  A sponsored storage allocation provides 50 TB or more in 10 TB increments in the /project or /projecta file systems.  The file systems are capable of a bandwidth of 10 GB/sec per 1 PB of storage, and support high bandwidth parallel I/O workloads from all the computational… Read More »

Frontiers in Advanced Storage Technologies (FAST) project

The FAST project works with vendors to develop new functionality in storage technologies that is not yet generally available to industry. The NERSC project involves selecting particular technologies of interest, partnering with the vendor, assessing their hardware, and providing feedback or co-development to improve the product for use in HPC environments. It also involves establishing long-term development collaboration agreements to develop the following opportunities: file system acceleration. A… Read More »