Deep Learning at 15 PFlops Enables Training for Extreme Weather Identification at Scale
March 29, 2018
Contact: Rob Farber, email@example.com
Reprinted with permission from HPCWire.
Deep learning training at petaflop-per-second rates on the Cori supercomputer at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC) has given climate scientists the ability to use machine learning to identify extreme weather events in huge climate simulation datasets. Predictive accuracies ranging from 89.4% to as high as 99.1% show that trained deep learning neural networks (DNNs) can identify weather fronts, tropical cyclones, and atmospheric rivers, the long, narrow air flows that transport water vapor from the tropics (Figure 1).
The strong relationship between ground truth and the neural network prediction can be seen in the classification plus regression results reported by Berkeley Lab climate scientist Michael Wehner at the Intel Developer Conference held during SC17 in November 2017 in Denver, Colorado.
Supercomputers like NERSC’s Cori system provide scientists with an extraordinary tool to model climate change significantly faster and far more accurately than was possible on previous-generation supercomputers. For example, simulated storms can run 300x to 10,000x faster than real time, according to Wehner in his presentation at the Intel conference. This meets the needs of climate scientists, who must run simulations spanning many centuries to evaluate the impact of climate change far into the future.
While humans can (and do) perform well in identifying and tracking extreme weather events in real time, they simply cannot keep up when climate models run two to five orders of magnitude faster, he added. Thus machine learning has to be used to identify and track extreme weather events. Further, these machine-learning-based results can be used to validate the climate models, so that we can have confidence in the models’ future predictions.
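To put the speedups Wehner quoted in perspective, here is a rough back-of-the-envelope sketch. The function name and the 100-year span are illustrative assumptions, not figures from the presentation, and the calculation assumes the quoted speedup holds uniformly over a whole run.

```python
def wall_clock_days(simulated_years: float, speedup: float) -> float:
    """Wall-clock days needed to cover `simulated_years` of model time
    when the model runs `speedup` times faster than real time."""
    days_per_year = 365.25  # average year length, including leap years
    return simulated_years * days_per_year / speedup


# The two ends of the 300x to 10,000x range quoted in the article.
for speedup in (300, 10_000):
    days = wall_clock_days(100, speedup)  # 100 years is a hypothetical span
    print(f"{speedup:>6}x real time: 100 simulated years in about {days:.1f} days")
```

At either end of the range, a century of simulated weather passes in days to months of wall-clock time, far more than a human analyst could review as it is produced.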
Figure 2 represents an assessment of what will happen to the number and intensity of hurricanes as the climate warms.
According to Prabhat, who leads NERSC’s Data and Analytics Services group and is director of the Big Data Center at NERSC, identifying phenomena in climate data is analogous to commercial vision applications (Figure 3). In his Intel Developer Conference keynote presentation, he noted that initial supervised training results show that this analogy is correct because machine learning was able to train and recognize each of three desired atmospheric phenomena with high accuracy.
Researchers from MILA, NERSC and Microsoft teamed up to create a novel semi-supervised convolutional DNN architecture that was able to do the work of all three individual supervised DNNs at the same time. Essentially, this novel neural network finds the bounding box size and location when it classifies the atmospheric phenomena. Further, the neural network also associates a probability with the classification.
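The kind of output described above (a class, a probability, and a bounding box) can be sketched as a small data structure. This is a minimal illustration, not the researchers' architecture or code: the `Box` and `Detection` types, the field names, and the sample coordinates are all hypothetical. Intersection-over-union (IoU) is shown only as a common way to compare a predicted box with ground truth; the article does not say which metric the team used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in grid coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp at zero so an inverted (empty) box has no area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


@dataclass(frozen=True)
class Detection:
    """One detected phenomenon: what it is, how confident, and where."""
    label: str
    probability: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, in the range [0, 1]."""
    overlap = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    intersection = overlap.area()
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


# Made-up values for illustration only.
predicted = Detection("tropical_cyclone", 0.97, Box(10, 20, 50, 60))
ground_truth = Box(12, 22, 52, 62)
print(f"{predicted.label} (p={predicted.probability:.2f}), "
      f"IoU with ground truth: {iou(predicted.box, ground_truth):.2f}")
```

Bundling the label, probability, and box in one record mirrors the article's point that a single network produces all three at once instead of relying on separate per-phenomenon classifiers.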
Powerful leadership-class supercomputers like the Cori supercomputer have made fast, accurate global climate simulations possible. Innovations such as the petascale-capable hybrid machine learning technique pioneered by Intel, NERSC and Stanford mean those same machines can also train DNNs to evaluate the tens to hundreds of terabytes of data created by these faster and more accurate climate simulations.
The full version of this article originally appeared on HPCWire on March 19, 2018.
About the Author: Rob Farber is a global technology consultant and author with an extensive background in HPC and in developing machine learning technology that he applies at national labs and commercial organizations.
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, the NERSC Center serves more than 6,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. DOE Office of Science.