Tertiary Storage

Keith Fitzgerald - Harvard Holmes

NERSC File Storage Group

Storage Charging: Questions and Answers

Why charge for storage?

For the same reason we charge for CPU time on our compute servers. Our resources are limited, and we have a responsibility to provide a deterministic, automated mechanism that ensures the resources are used as DOE desires. Our experience has led us to believe that the most critical resources in the storage environment are:

  1. Bandwidth. This limitation can show up in many areas (network, disk cache, archive, etc.), but the bottom line is that when you approach a limit, users suffer.
  2. The name server. When this is overloaded, service degrades.
  3. The archive. We've already experienced and fixed this problem.

    A good management scheme should provide a deterministic mechanism that encourages users to optimize their use of the storage environment and rewards those who invest the thought and time required to do so. The only management mechanism that provides these capabilities is charging for storage, with the ability to trade storage for computation. The other alternatives considered were quotas and separate storage allocations.

    The quota mechanism is deficient in almost every way. It protects only the archive. There is no way to influence user behavior in terms of bandwidth or name server usage, and there is no incentive to use less than the entire storage quota.

    Charging a separate storage allocation provides a better management mechanism, because there would be a limit on the total resource consumed and a tradeoff between bandwidth, archival storage and number of files. However, there would be no incentive to use less than the entire allocation, and therefore no incentive to clean up.

    Costs for resources and their management vary, especially in how they scale. Charging can help ensure that resources are used and managed in a cost-effective way. Charging and the corresponding statistics can also help justify future expansion and changes in configuration.
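
    As a rough illustration of the tradeoff discussed here, the sketch below charges a single CRU allocation for bandwidth, archival storage and number of files, so that whatever is not spent on storage remains available for computation. The class names, function names and all rates are hypothetical placeholders chosen for the example; they are not the actual NERSC charging rates or formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRates:
    """Hypothetical CRU rates; placeholders only, not real NERSC values."""
    cru_per_gb_transferred: float = 0.5   # bandwidth
    cru_per_gb_stored: float = 1.0        # archive space
    cru_per_1000_files: float = 0.25      # name server load


@dataclass(frozen=True)
class StorageUsage:
    """Storage activity for one repository over one accounting period."""
    gb_transferred: float
    gb_stored: float
    files: int


def storage_charge(usage: StorageUsage, rates: StorageRates = StorageRates()) -> float:
    """Return the CRUs charged for storage as a weighted sum of the three resources."""
    return (
        usage.gb_transferred * rates.cru_per_gb_transferred
        + usage.gb_stored * rates.cru_per_gb_stored
        + usage.files / 1000 * rates.cru_per_1000_files
    )


def remaining_for_compute(allocation_crus: float, usage: StorageUsage,
                          rates: StorageRates = StorageRates()) -> float:
    """Return the CRUs left for computation after storage is charged."""
    return allocation_crus - storage_charge(usage, rates)


if __name__ == "__main__":
    # Same data volume, stored as many small files versus a few large ones.
    many_small = StorageUsage(gb_transferred=100, gb_stored=200, files=50_000)
    few_large = StorageUsage(gb_transferred=100, gb_stored=200, files=500)
    for label, usage in (("many small files", many_small), ("few large files", few_large)):
        print(label, storage_charge(usage), remaining_for_compute(1000, usage))
```

    Under these assumed rates, a user who consolidates many small files into a few large ones pays less for storage and keeps more of the allocation for computation, which is the kind of incentive neither a quota nor a separate storage allocation provides.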

    Assuming we adopt a storage charging scheme based upon CRUs, what stops users from increasing their computational allocation by requesting storage which they don't plan to use?

    The best answer I can give is that nothing stops someone from doing this once. However, the ERCAP process may not be very understanding when that user submits the next allocation request; it would be pretty obvious at the end of the year what had happened.

    How would we convert to a charging scheme?

    If a charging scheme is approved, we would like to put the mechanism into place as soon as possible, but NO CRUs would be deducted from any repository until the beginning of the next fiscal year. Hopefully, you would have several months of experience with the charging scheme which could be used as the basis for your next year's ERCAP allocation request. At the start of the next fiscal year ('98) we would begin actually charging the repositories.

    Where does the superhome fit in?

    Current thinking is that a reasonable amount (more than 500 MB) of global storage will be provided for each user. This storage will be free and will be managed under a quota mechanism. A user with small to moderate storage needs may never have to access the storage system directly. Users with large storage needs may want to use the superhome to satisfy their small-file requirements. At some time in the future, the superhome may evolve into a DFS-based capability that will also be accessible directly from your workstation.

    Review of MSS systems for NERSC

    1. Requirements
    2. Vendor Survey
    3. User Survey
    4. Preliminary Conclusions
    5. More Information
    Server Graphics:
    1. "Classical" Server
    2. "Third Party" Server
    3. "Third Party" Server with Client Movers
    4. "Third Party" Server with Pooled Movers

    Requirements

    User Level Requirements