BERKELEY LAB RESEARCHERS EXPLORE ENERGY EFFICIENT COMPUTING, FROM SYSTEM DESIGNS TO APPLICATIONS
March 1, 2008
NERSC and other Berkeley Lab researchers are taking on energy efficiency research that aims to influence how the computing industry designs and builds computer and storage technologies, to the benefit of scientists, consumers and the environment.
Thanks to the Laboratory Directed Research and Development Program (LDRD) at Berkeley Lab, researchers are exploring subjects in computer architectures, algorithms and mass storage system designs to improve the energy efficiency of scientific computations. The LDRD program provides special funding for promising research projects, and this set of related LDRD projects includes researchers from NERSC and the Computational Research Division (CRD) at Berkeley Lab.
Pursuing energy efficient computing research at NERSC makes sense, both to save energy and to shift NERSC resources away from energy costs and towards systems and services that directly benefit the scientific community. With over 3,000 users and an insatiable demand for NERSC computing and storage facilities, results from the LDRD projects could indirectly benefit scientists across the disciplines that rely on computing. These include cosmology, climate, life sciences, accelerator physics, fusion, computer science and materials science.
Kathy Yelick, the NERSC Director, explains the importance of this work. “Power is the most important problem in computing today, not just at the high end, but from hand-held devices and laptops to data centers and computing centers like NERSC. Power density within chips has forced the entire processor industry to put multiple cores on a chip, and within centers the total system power is a major component of cost and availability.”
She describes this set of LDRDs as a “multi-faceted attack” on the problem, starting with a blank slate on the architecture end, and rethinking algorithms, applications and software to make use of energy-efficient hardware. The first goal is to do more science with less energy, and the second is to enable the next generation of exascale computing systems, which require technological breakthroughs to address the power issues at such extreme scales.
One of the projects takes a vertical slice through the problem space, looking at a single application domain and considering alternative algorithms as well as architectures for solving the problem. Climate modeling is the target application, selected because of its significance to science and the general public, and because it requires millions of CPU computing hours to explore various climate scenarios and the possible impacts of changes in policy or alternate fuel sources. The other LDRD projects take a broader look at specific aspects of the problem, including energy-efficient computing components based on multicore technology, energy efficient storage systems, and application characterization that explores the ability of various key algorithms to adapt to energy-efficient hardware.
Here are the four LDRD projects:
Climate Modeling System
The development of multicore chips has been the computer industry’s solution for keeping power consumption in check. But some of the current approaches, which add more complex cores per chip, would eventually hit a performance plateau. John Shalf, Lenny Oliker and Michael Wehner are investigating an alternative to conventional microprocessors for designing energy-efficient supercomputers: making more aggressive use of parallelism and borrowing design techniques from the consumer electronics industry to tailor the chip design more closely to the needs of scientific applications. They anticipate this approach could achieve an improvement of 100 times or more in power efficiency and effective performance over business as usual.
Their Climate Simulator Project will build a prototype system using embedded processors (low-power chips commonly found in consumer electronics devices such as cell phones and portable music players) and tailor its performance to provide optimal power efficiency for climate modeling problems. The group has adopted design approaches from battery-powered consumer electronics, a market that is very sensitive to power consumption and cost. These embedded chips are less powerful and less power hungry than the conventional microprocessors used in today’s supercomputers, and can be more easily customized to run specific applications. A computer built with thousands of these embedded chips could deliver highly energy-efficient performance while remaining capable of tackling complex scientific problems. The approach promises to be not only more power-efficient than the conventional path forward in HPC, but also more cost-effective.
The Climate Simulator team is working with Tensilica, an embedded processor design firm, as well as David Randall, a professor in the Atmospheric Sciences Department at Colorado State University and a NERSC user. Randall’s climate modeling code, developed under DOE’s SciDAC program, is a new breed of climate modeling code that is capable of expressing enough parallelism to run kilometer-scale simulations 1000 times faster than real time on machines envisioned by the Climate Simulator research team. In order to enable dramatic changes in power efficiency, codes such as Randall’s must expose orders of magnitude more parallelism than current climate models. Employing simpler processors that are designed for parallel throughput rather than serial performance will enable substantial power efficiency gains.
“We want to find compelling solutions to scientific problems that need petascale machines,” Shalf said. “The use of these power-efficient cores will help us achieve those goals.”
The development of multicore chips represents the most significant shift in microprocessor engineering in several decades, and it opens up opportunities for exploring innovative designs for high-performance computers.
Jonathan Carter, head of the User Services Group at NERSC, is leading the project to explore a wide range of multicore computer architectures and how efficiently those systems can perform on challenging scientific codes. The project, “Enhancing the Effectiveness of Manycore Chip Technologies for High-End Computing,” also includes collaborators Lenny Oliker and John Shalf.
Future supercomputers will likely be built with chips containing an increasing number of cores. Multicore chip designs vary, however. They include heterogeneous designs, such as the Cell processor, developed by IBM, Sony and Toshiba; graphics processing units (GPUs); and processors for the embedded market. There are also homogeneous designs, such as microprocessors by Intel and Advanced Micro Devices, the world’s two largest chip makers. In many cases, multicore technologies offer higher absolute performance and more energy-efficient computation.
“This LDRD project provides a breadth of architecture coverage to our whole ultra-efficient research thrust. We want to identify candidate algorithms that map well to multicore technologies, and document the steps needed to re-engineer programs to take advantage of these architectures,” Carter said. “In addition, perhaps there are design elements in multicore chips that we can influence to help design a better high-performance system.”
Led by CRD scientists Ekow Otoo and Doron Rotem, the “Energy Smart Disk-Based Mass Storage System” project sets out to investigate energy-efficient disk storage configurations that also provide quick access to massive amounts of data.
Today’s storage systems in data centers use thousands of continuously spinning disk drives. These disk drives and the necessary cooling components use a substantial fraction of the total energy consumed by the data center. As the need for reliable long-term storage of data grows, so will the associated energy costs.
Otoo and Rotem have set out to explore new configurations that divide the disks into active and passive groups. The active group contains continuously spinning disks and acts as a cache for the most frequently accessed data. The disks in the passive group would power down after a period of inactivity. Besides looking at optimal disk configurations and file placement algorithms, the researchers will also develop simulation models for analyzing energy use.
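To make the passive-group idea concrete, here is a minimal sketch of the kind of energy simulation such a study might start from. It is not the researchers’ model: the `DiskPowerModel` class, the function names, and every wattage, spin-up cost and timeout value are hypothetical numbers chosen only for illustration. It treats each access as instantaneous and assumes a passive disk starts in standby.

```python
from dataclasses import dataclass


@dataclass
class DiskPowerModel:
    # All values are illustrative assumptions, not measured figures.
    active_watts: float = 10.0     # power while the platters are spinning
    standby_watts: float = 1.0     # power while spun down
    spinup_joules: float = 100.0   # one-time energy cost of each spin-up
    idle_timeout_s: float = 60.0   # spin down after this long without access


def passive_disk_energy(access_times, horizon_s, model):
    """Joules used over [0, horizon_s] by a disk that spins down when idle.

    access_times are seconds within the horizon; the disk starts in standby.
    """
    energy = 0.0
    clock = 0.0          # time up to which energy has been accounted for
    spin_down_at = None  # None means the disk is currently in standby
    for t in sorted(access_times):
        if spin_down_at is not None and t <= spin_down_at:
            # Still spinning when the access arrives.
            energy += model.active_watts * (t - clock)
        else:
            if spin_down_at is not None:
                # Spun until the timeout expired, then dropped to standby.
                energy += model.active_watts * (spin_down_at - clock)
                clock = spin_down_at
            energy += model.standby_watts * (t - clock)
            energy += model.spinup_joules
        clock = t
        spin_down_at = t + model.idle_timeout_s
    if spin_down_at is not None:
        # After the last access the disk keeps spinning until the timeout.
        end_spin = min(spin_down_at, horizon_s)
        energy += model.active_watts * (end_spin - clock)
        clock = end_spin
    energy += model.standby_watts * (horizon_s - clock)
    return energy


def always_on_energy(horizon_s, model):
    """Joules used by a continuously spinning (active-group) disk."""
    return model.active_watts * horizon_s
```

Comparing `passive_disk_energy(...)` with `always_on_energy(...)` for a given access trace shows the trade-off the project describes: rarely accessed disks save energy by spinning down, while frequently accessed data is better kept on the always-spinning active group, where no spin-up cost or delay is paid.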
Benchmarking for Dwarfs
A project led by Erich Strohmaier proposes to develop a test bed for benchmarking of key algorithms that will be crucial for designing software and computers that use processors with many cores on each chip. This project is conducted with domain experts from CRD, NERSC, and UC Berkeley.
From desktop PCs to supercomputers, systems built with hundreds of cores or more are likely to hit the market as early as the beginning of the next decade. The scientific community and the computer industry need to figure out how to make efficient use of these more powerful machines. This will be especially important for high end users, as they will face systems with large differences in interconnect properties at different levels of the system architecture hierarchy.
The project, “Reference Benchmarks for the Dwarfs,” will devise ways to use a set of algorithms to gauge the performance of systems from personal computers to high-performance systems. The algorithms are known as dwarfs; each dwarf represents a class of algorithms with similar properties and behavior. The 13 dwarfs chosen for the research include algorithms important for the scientific community.
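As a rough illustration of what benchmarking by algorithm class can look like, the sketch below times one small kernel for each of two classes. It is not the project’s reference benchmark suite: the choice of a dense matrix multiply and a one-dimensional stencil as representative kernels, along with all function names and problem sizes, is an assumption made here for illustration.

```python
import time


def dense_matmul(n):
    """Dense linear algebra stand-in: multiply two n-by-n matrices."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def stencil_1d(n, steps=10):
    """Structured-grid stand-in: repeated 3-point averaging on n >= 3 cells."""
    grid = [0.0] * n
    grid[n // 2] = 1.0
    for _ in range(steps):
        interior = [(grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
                    for i in range(1, n - 1)]
        grid = [grid[0]] + interior + [grid[-1]]  # boundary cells held fixed
    return grid


# Hypothetical mapping from algorithm class to a representative kernel.
KERNELS = {
    "dense_linear_algebra": dense_matmul,
    "structured_grid": stencil_1d,
}


def run_benchmarks(size, repeats=3):
    """Return the best wall-clock time in seconds for each kernel."""
    results = {}
    for name, kernel in KERNELS.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(size)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results
```

Calling `run_benchmarks(64)` on different machines gives one timing per algorithm class, which hints at how a system can be characterized by its behavior on each class rather than by a single aggregate score.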
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, the NERSC Center serves more than 7,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.