The answer is the same reason we charge for CPU time on our compute servers. Our resources are limited, and we are responsible for providing a deterministic, automated mechanism that ensures the resources are used as DOE desires. Our experience has led us to believe that the most critical resources in the storage environment are:
The quota mechanism is deficient in almost every way. It protects only the archive. It gives us no way to influence user behavior in terms of bandwidth or name server usage, and it gives users no incentive to use less than their entire storage quota. Charging against a separate storage allocation would be a better management mechanism, because it would limit the total resource consumed and allow a tradeoff between bandwidth, archival storage, and number of files. However, users would still have no incentive to use less than the entire allocation, and therefore no incentive to clean up.
Costs for resources and their management vary, especially in how they scale. Charging can help ensure that resources are used and managed in a cost-effective way. Charging and the corresponding statistics can also help justify future expansion and changes in configuration.
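The tradeoff between bandwidth, archival storage, and number of files can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the linear form of the charge, the names StorageRates and storage_charge, and the rate values are hypothetical and are not taken from any approved charging scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRates:
    """Hypothetical CRU rates. The fields and any values are placeholders."""
    per_gb_stored: float       # CRUs per GB held in the archive
    per_gb_transferred: float  # CRUs per GB moved to or from storage (bandwidth)
    per_file: float            # CRUs per file tracked by the name server


def storage_charge(rates, gb_stored, gb_transferred, n_files):
    """Return an illustrative charge in CRUs as a weighted sum of the three resources."""
    if gb_stored < 0 or gb_transferred < 0 or n_files < 0:
        raise ValueError("usage figures must be non-negative")
    return (rates.per_gb_stored * gb_stored
            + rates.per_gb_transferred * gb_transferred
            + rates.per_file * n_files)


if __name__ == "__main__":
    # Assumed rates, chosen only to make the arithmetic easy to follow.
    rates = StorageRates(per_gb_stored=2.0, per_gb_transferred=0.5, per_file=0.25)
    # Same data volume and traffic, stored as many small files or as one bundle.
    print(storage_charge(rates, gb_stored=10, gb_transferred=4, n_files=100))
    print(storage_charge(rates, gb_stored=10, gb_transferred=4, n_files=1))
```

Under these assumed rates, a user who bundles many small files into one archive pays less for the same data, which is the kind of behavior a per-file charge is meant to encourage and a plain quota cannot.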
Assuming we adopt a storage charging scheme based upon CRUs, what stops users from increasing their computational allocation by requesting storage which they don't plan to use?
The best answer I can give is that nothing stops someone from doing this once.
However, the ERCAP process may not be very understanding when they submit
their next allocation request - by the end of the year it would be pretty
obvious what had happened.
How would we convert to a charging scheme?
If a charging scheme is approved, we would like to put the mechanism into
place as soon as possible, but NO CRUs would be deducted from any repository
until the beginning of the next fiscal year. Hopefully, you would have
several months of experience with the charging scheme which could be used
as the basis for your next year's ERCAP allocation request. At the start of
the next fiscal year ('98) we would begin actually charging the repositories.
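The conversion plan amounts to a trial period in which charges are computed and reported but not deducted. A minimal sketch follows, assuming fiscal year 1998 begins on 1 October 1997 (the usual US federal fiscal year start). The Repository class, its fields, and the CHARGING_STARTS constant are hypothetical names introduced only for this illustration.

```python
from datetime import date

# Assumption: FY98 starts on 1 October 1997, as the US federal fiscal year does.
CHARGING_STARTS = date(1997, 10, 1)


class Repository:
    """Hypothetical repository record holding a CRU balance."""

    def __init__(self, name, cru_balance):
        self.name = name
        self.cru_balance = cru_balance
        # Charges computed during the trial period, reported but never deducted.
        self.trial_charges = 0.0

    def record_storage_charge(self, crus, on):
        """Deduct the charge only on or after CHARGING_STARTS; otherwise just log it."""
        if crus < 0:
            raise ValueError("charge must be non-negative")
        if on < CHARGING_STARTS:
            self.trial_charges += crus
        else:
            self.cru_balance -= crus


if __name__ == "__main__":
    repo = Repository("demo", cru_balance=1000.0)
    repo.record_storage_charge(30.0, on=date(1997, 6, 1))   # trial period
    repo.record_storage_charge(20.0, on=date(1997, 10, 1))  # real charging
    print(repo.cru_balance, repo.trial_charges)
```

The accumulated trial charges are the "several months of experience" a repository could use to size its next ERCAP request.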
Where does the superhome fit in?
Current thinking is that a reasonable amount (more than 500 MB) of global storage will be provided for each user. This storage will be free and will be managed under a quota mechanism. A user with small to moderate storage needs may never have to access the storage system directly. Users with large storage needs may want to use the superhome to satisfy their small-file requirements. At some time in the future, the superhome may evolve into a DFS-based capability which will also be accessible directly from your workstation.
Requirements
User Level Requirements
Technical Requirements
Performance:
User Survey: Summary of Operations at Other Centers
Preliminary Conclusions
Leading contenders are HPSS, DMF, and FileServ
Time Table:
System          High Performance  Many Files  Large Files  Existing Hardware    Pass/Fail
HPSS            Yes               Yes         Yes          Yes                  Pass
Convex UniTree  Yes               Yes         Yes          No                   Maybe-hw
DMF             Yes               Yes         Yes          Yes                  Pass
FileServ        Yes               Yes??       Yes          No                   Maybe-hw
AMASS           No                No          ??           No                   Fail
Epoch           No                No          No           No                   Fail
ADSM            ??                Yes         ??           Yes                  ????
SAM-FS          Yes               Yes         Yes          No (Sun/SGI only)    Maybe-hw
CA-Unicenter    Yes??             Yes??       ??           No (no Cray client)  Fail
Metior          Yes               Yes         Yes          No                   Maybe-hw
Data Migrator   --                --          --           No                   Fail
OSM             No                No          No           Yes                  Fail
MastarMind      No                No??        Yes          No                   Fail
Site                      Supercomputer      Mass Storage                Comments
Argonne/ECT               --                 none                        plans based on DFS
Brookhaven/RHIC Planning  UNIX farm          none                        Metior, HPSS planned
Cornell Theory Center     IBM SP2(512)       HPSS                        --
ECMWF                     Fujitsu VPP        --                          --
Fermilab                  --                 HPSS                        --
Jefferson Lab (CEBAF)     --                 OSM                         --
LLNL                      T3D, SP2, DEC      UniTree, NSL UniTree, HPSS  --
LANL, ACL                 many               CFS, HPSS                   --
Maui HPCC                 IBM SP(563)        HPSS                        --
NASA Goddard              --                 Convex UniTree              --
NCSA                      Convex, SGI, CM-5  Convex UniTree              --
Oak Ridge NL (CCS)        --                 HPSS, UniTree               HPSS testbed
PNNL (EMSL)               --                 FileServ HSM                --
Pittsburgh SC             Cray               DMF                         recently upgraded
SDSC                      Cray, Intel        HPSS, UniTree               HPSS on IBM SP2
Criterion          HPSS                      DMF                           FileServ
$$$                $2.5M (recent upgrade)    ??                            $2.4M (@PNNL)
What You Get       2 libraries               --                            1 library
                   25 tape drives                                          8 tape drives
                   750GB disk                                              400GB disk
                   control CPU                                             2 big SGI systems
Performance        Outstanding               Excellent                     Excellent
Advantages         scalability               stable                        turnkey system
                   single system image       well supported                supports a variety of
                   existing user base        single-system image           computing environments
                   desktop support           existing user base
                   supports a variety of
                   computing environments
Staffing           some                      some                          little
Conversion Effort  easy - UniTree            hard                          unknown
                   moderate - CFS
Futures            bright                    limited to Cray environments  unknown at best