IBM and Department of Energy Supercomputing Center to Make DOE Grid Computing a Reality
DOE Science Grid To Transform Far-Flung Supercomputers into a Utility-like Service
March 22, 2002
ARMONK, NY and BERKELEY, CA, March 22, 2002 -- IBM and the U.S. Department of Energy's (DOE) National Energy Research Scientific Computing Center (NERSC) today announced a collaboration to begin deploying the first systems on a nationwide computing Grid, which will empower researchers to tackle scientific challenges beyond the capability of existing computers.
Beginning with two IBM supercomputers and a massive IBM storage repository, the DOE Science Grid will ultimately grow into a system capable of processing more than five trillion calculations per second and storing information equivalent to 200 times the number of books in the Library of Congress. The collaboration will make the largest unclassified supercomputer and largest data storage system within DOE available via the Science Grid by December 2002 -- two years sooner than expected.
The Grid will also give scientists around the country access to far-flung supercomputers and data storage in the same way that an electrical grid provides consumers with access to widely dispersed power-generating resources.
"Computing and data Grids will establish a uniform computing and data handling environment -- independent of location -- that can be integrated with scientists' work environment in much the same way that the Web provided a way to integrate on-line documents into the scientific work environment," said Horst Simon, director of the NERSC Division at Lawrence Berkeley National Laboratory. "Undertaking such a large and long-term project, we are especially pleased to be working with IBM, which has made Grid Computing central to its e-business strategy."
Simon added, "Connecting supercomputer centers to Grids will provide the scientific community with a much more capable set of computing and data management tools than those available today, and tools that can be used more easily and effectively than today's tools. This should have a substantial productivity benefit for scientific R&D, and will open up entirely new avenues of exploration."
"The DOE Science Grid is a template for the kind of system that can enable partnerships between public institutions and private companies aimed at creating new products and technologies for business," said Peter Ungaro, vice president, high-performance computing, IBM Servers Group. "This collaboration between IBM and NERSC is a big step forward in realizing Grid's promise of delivering computing resources as a utility-like service."
The Emerging Grid
Grids allow geographically distributed organizations to share applications, data and computing resources. An emerging model of computing, Grids are built with clusters of servers joined together over the Internet, using protocols provided by the Globus open source community (Globus.org) and other open technologies, including Linux.
The DOE Science Grid's goal is to enhance the ability of DOE scientists to explore the physical world through computational simulation and scientific experiments and analysis of the resulting data. The Science Grid will enable scientists at national laboratories and universities around the country to perform ever-larger calculations, manage and analyze ever-larger datasets, and perform the ever more complex computer modeling necessary for DOE to accomplish its scientific missions. In the future, supercomputers, data storage and experimental facilities at Lawrence Berkeley, Argonne, Oak Ridge and Pacific Northwest national laboratories are expected to be connected to the DOE Science Grid.
The Grid will give scientists real-time access to the trillions of bytes of data that are stored at national labs around the country. This kind of seamless access to information is required for large-scale projects such as genomic and astrophysics research, which generate much more data than can be stored in a single location.
As it evolves into a reliable infrastructure supporting scientific R&D, the DOE Science Grid will also facilitate development and use of collaboration tools that speed up research and allow scientists to tackle more complex problems. NERSC is located at DOE's Lawrence Berkeley National Laboratory, which has been developing distributed collaboration and distributed data handling technology for the past 10 years. This decade-long effort provided some of the precursor Grid tools and technologies.
"The combination of NERSC and the DOE Science Grid should provide an unprecedented capability for incorporating high-end simulation and data handling into the scientists' working environment where it can be combined with local compute and data systems, and eventually with the experiments themselves," said Bill Johnston, head of Berkeley Lab's Distributed Systems Department and one of the architects of the DOE Science Grid. "NERSC provides DOE's Office of Science with its major tools for computational simulation and data analysis and storage, so this integration of the most capable computing facilities directly with the scientists' working environment is what will create new levels of scientific capability and productivity."
NERSC, which operates a 3,328-processor IBM supercomputer (currently the third most powerful computer on earth, according to the TOP500 List of Supercomputers), had originally planned to make its high-performance computing systems accessible via the DOE Science Grid by 2004. The collaboration announced today will allow a core group of NERSC's 2,100 users to begin accessing resources via the DOE Science Grid two years earlier than originally planned.
"We have been working closely with IBM since the installation of our IBM supercomputer in 2000. Because we have a common interest in advancing Grid technology, it made sense to work together," said Bill Kramer, who is in charge of NERSC's computer operations. "As DOE's flagship center for unclassified computing, making our resources more easily and more widely accessible via the Grid will enhance research across a broad spectrum of scientific disciplines."
In addition to the large IBM supercomputer system, Grid software will be integrated into NERSC's HPSS (High Performance Storage System) archival data storage system, which has a capacity of 1.3 petabytes and is managed using IBM servers. NERSC and IBM have a strong history of working together to bring new technology to bear on the most challenging scientific problems. For example, NERSC and IBM are two of the six development partners that created and improved the HPSS. NERSC also operates a 160-processor IBM Netfinity cluster computer system.
By the end of the year, all three of NERSC's IBM systems are expected to be on the Grid. To do this, IBM will develop its software to be compatible with Globus and other Grid software, and NERSC will then move the software into service. NERSC and IBM will also use the collaboration to identify areas where the Grid software can be improved.
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, the NERSC Center serves more than 7,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.