
NERSC 3 Greenbook


Encourage Expansion of Data Intensive Computing Capabilities

For some classes of scientific work in Energy Research that are very data intensive, the ability to select both large and small samples from very large data sets is as critical to the science as applying compute cycles to those samples once they are selected. Making this capability effective will require developing both software and the necessary networking hardware. Selecting and staging files from shell scripts via ftp is simply inadequate for large, complex data sets. A true object database capability is required so that the scientist can select the appropriate data sets, or "data objects", that have meaning for the scientific study. Two issues are involved: optimizing data layout so that selecting small samples is efficient, and providing adequate network bandwidth so that large samples are retrieved in a timely way.



Rick A Kendall
7/13/1998