NERSC: Powering Scientific Discovery Since 1974

Storage Resource Unit (SRU) Formula Coefficients

Storage Resource Units (SRUs) are the unit by which NERSC allocates and tracks usage in HPSS. The formula for calculating SRUs takes into account (1) the number of files stored, (2) the size of the files, and (3) the network bandwidth needed to transfer files in and out of HPSS. SRUs are calculated on a daily basis, and the total usage per user is the sum of the daily values. To estimate your SRU usage for a year, please use the following formula:

SRUs per year = 0.01436*(average number of files stored) + 4.787*(average space used in GB) + 4*(amount_of_data_transferred to/from HPSS in a year in GB)
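As an illustration, the yearly estimate can be sketched in a few lines of Python. The function name, argument names and sample numbers below are hypothetical and chosen only for this example; the coefficients are the ones in the formula above.

```python
# Coefficients taken from the yearly formula above.
FILE_COEFF = 0.01436   # SRUs per file (average number of files stored)
SPACE_COEFF = 4.787    # SRUs per GB (average space used)
IO_COEFF = 4.0         # SRUs per GB transferred to/from HPSS in the year


def estimate_yearly_srus(avg_files: float, avg_space_gb: float, io_gb: float) -> float:
    """Estimate the SRU charge for one year (hypothetical helper)."""
    return FILE_COEFF * avg_files + SPACE_COEFF * avg_space_gb + IO_COEFF * io_gb


# Made-up usage: 10,000 files, 500 GB stored on average, 200 GB moved in the year.
estimate = estimate_yearly_srus(10_000, 500, 200)
print(f"{estimate:.1f} SRUs")
```

With these made-up inputs the space term dominates the estimate, followed by the I/O term and then the file-count term.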

History and Motivation

The coefficients in the Storage Resource Unit (SRU) formula were arrived at from the following considerations:

- The formula should help influence user behavior towards efficient use of the storage resource.
- The formula should reflect the relative costs of "doing business".

From these considerations we adopted file counts, bytes stored and I/O transfers as the 3 minimum factors that needed to be included in the formula.  Hardware costs are related to these three areas in the following ways:

1.  Costs driven by number of files:
  - Metadata CPUs, disks and backup systems
  - Additional tape drives required to overcome the latency of small file sizes
2.  Costs driven by the amount of space used:
  - Library
  - Media
  - Repack CPUs and drives
3.  Costs driven by bandwidth requirements:
  - Multiple tape drives
  - Large capacity high speed disks
  - High speed networking and network switches
  - Data transfer CPUs ("movers")

We considered NERSC's costs of storage operation and roughly assigned them to these three areas.  Ignoring file counts we found that storage and I/O accounted for roughly 40% and 60% of our costs respectively.  We decided that storage and I/O would have these rough proportions and that we would introduce file counts to be 10-20% of the overall formula.

At the time the cost of monitoring space usage was very high and required a complete directory listing.  Therefore we limited these operations to one or two per month.  This led to an accounting granularity of one month.  Based on this granularity the coefficients in the formula were set to generate 1 SRU per month for the average user with 1GB stored in the system.

The Formula

At the time the formula was set (1999) there were about 20 to 50 TB in NERSC storage.  The amount of I/O, on a monthly basis, was observed to be about 10% of the amount stored (as of 2003 it is 13% to 14%) so the formula was

SRUs = (GB stored) + 10 x (GB of I/O)

The average file size was about 10 MB, so there were about 100 files per GB stored.  For the file charge to amount to 1/4 of the space charge, the file factor times 100 must equal about 0.25, which led to a file factor of 0.0025.  This was subsequently raised slightly to 0.003 to have more influence on users.  A hypothetical user who stores 1 GB in 100 files and does 0.12 GB of I/O per month therefore sees the following charges:

Number of files times 0.003 = 0.3
Space stored                = 1.0
I/O times 10                = 1.2
Total                       = 2.5

It was also decided that it would be convenient if the "average" user with 1 GB stored would accrue 1 SRU/month, so the above rates were scaled by 0.4, leading to

monthly user SRUs = 0.0012*files + 0.4*GB_stored + 4*GB_I/O
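The arithmetic for the hypothetical user can be checked with a short Python sketch. The variable names are illustrative only; the rates (0.003 per file, 1 per GB stored, 10 per GB of I/O) and the 0.4 scale factor come from the text.

```python
# Hypothetical "average" user from the text: 100 files, 1 GB stored,
# 0.12 GB of I/O per month.
files, gb_stored, gb_io = 100, 1.0, 0.12

# Unscaled rates: 0.003 per file, 1 per GB stored, 10 per GB of I/O.
raw_charges = {
    "files": 0.003 * files,
    "space": 1.0 * gb_stored,
    "io": 10 * gb_io,
}
raw_total = sum(raw_charges.values())  # about 2.5

SCALE = 0.4  # chosen so that this user accrues about 1 SRU per month
monthly_srus = SCALE * raw_total

# The same user evaluated with the scaled monthly formula.
monthly_formula = 0.0012 * files + 0.4 * gb_stored + 4 * gb_io

print(round(raw_total, 3), round(monthly_srus, 3), round(monthly_formula, 3))
```

Both routes give roughly 1 SRU per month, which is the normalization the scale factor was chosen to achieve.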

Later, when the accounting granularity became daily, the file and space rates were divided by 30.5 days per month (the I/O rate was left unchanged) to get

daily user SRUs = 0.0000393*files + 0.0131147*space(GB) + 4.0*I/O(GB)

Multiplying the daily file and space rates by a typical 365-day year results in:

yearly user SRUs = 0.01436*files + 4.787*space(GB) + 4*I/O(GB)
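The conversion from monthly to daily to yearly coefficients can be reproduced with the sketch below. The dictionary layout and names are illustrative. The explanation of why the I/O rate is not rescaled is inferred from the published coefficients rather than stated in the text: the I/O term applies per GB transferred, while files and space are charged for each day they are held.

```python
DAYS_PER_MONTH = 30.5
DAYS_PER_YEAR = 365

# Monthly coefficients from the scaled formula.
monthly = {"files": 0.0012, "space": 0.4, "io": 4.0}

# File and space rates are spread over the days of a month.
# The I/O rate is kept as-is (assumption: it is charged per GB transferred).
daily = {
    "files": monthly["files"] / DAYS_PER_MONTH,
    "space": monthly["space"] / DAYS_PER_MONTH,
    "io": monthly["io"],
}

# A year of daily file and space charges; the I/O rate is again unchanged.
yearly = {
    "files": daily["files"] * DAYS_PER_YEAR,
    "space": daily["space"] * DAYS_PER_YEAR,
    "io": daily["io"],
}

for name, coeffs in (("daily", daily), ("yearly", yearly)):
    print(name, {key: round(value, 7) for key, value in coeffs.items()})
```

The computed values agree with the published daily and yearly coefficients to within rounding.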