Deep Learning at 15 PFlops Enables Training for Extreme Weather Identification at Scale

March 29, 2018

By Rob Farber

Petaflop-per-second deep learning training performance on the Cori supercomputer at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC) has given climate scientists the ability to use machine learning to identify extreme weather events in huge climate simulation datasets. Predictive accuracies ranging from 89.4 percent to 99.1 percent show that trained deep learning neural networks (DNNs) can identify weather fronts, tropical cyclones, and atmospheric rivers, the long, narrow air flows that transport water vapor from the tropics (Figure 1).

Figure 1: Relation between ground truth (green boxes) and classification plus regression results (red boxes) of the DNN trained to recognize atmospheric phenomena.

The strong relationship between ground truth and the neural network prediction can be seen in the classification plus regression results reported by Berkeley Lab climate scientist Michael Wehner at the Intel Developer Conference held during SC17 last November in Denver, Colorado.

Supercomputers like NERSC’s Cori system provide scientists with an extraordinary tool to model climate change significantly faster and far more accurately than was possible on previous-generation supercomputers. For example, simulated storms can run 300x to 10,000x faster than real time, according to Wehner in his presentation at the Intel conference. This speed serves climate scientists who must run centuries-long simulations to evaluate the impact of climate change far into the future.
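To put those speedups in perspective, the short Python sketch below converts a speedup factor into wall-clock time. It is an illustration only: the function name `wall_clock_days`, the 365.25-day year, and the choice of a 100-year run are assumptions made here, and it assumes the quoted speedup applies uniformly to the whole simulation, which the article does not state.

```python
# Rough arithmetic: wall-clock time needed to simulate a span of years
# when the model runs a given factor faster than real time.
DAYS_PER_YEAR = 365.25  # assumption: average year length in days


def wall_clock_days(simulated_years: float, speedup: float) -> float:
    """Return the wall-clock days needed to simulate `simulated_years`."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return simulated_years * DAYS_PER_YEAR / speedup


if __name__ == "__main__":
    # The two ends of the range quoted in the article.
    for speedup in (300, 10_000):
        days = wall_clock_days(100, speedup)
        print(f"{speedup:>6}x real time: one simulated century takes about {days:.1f} days")
```

Under these assumptions, a simulated century takes roughly four months of computing at 300x and under four days at 10,000x, which is why manual inspection of the output cannot keep pace.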

Figure 2: Comparative results showing the additional detail that is modeled by a 25 km spatial resolution model as opposed to a 200 km model.

While humans can (and do) perform well in identifying and tracking extreme weather events in real time, they simply cannot keep up when climate models run two to five orders of magnitude faster, he added. Machine learning is therefore needed to identify and track extreme weather events. Further, these machine-learning-based results can be used to validate the climate models, giving confidence in their future predictions.

Figure 2 represents an assessment of what will happen to the number and intensity of hurricanes as the climate warms.

According to Prabhat, who leads NERSC’s Data and Analytics Services group and is director of the Big Data Center at NERSC, identifying phenomena in climate data is analogous to commercial vision applications (Figure 3). In his Intel Developer Conference keynote presentation, he noted that initial supervised training results support this analogy: DNNs could be trained to recognize each of the three target atmospheric phenomena with high accuracy.

Researchers from MILA, NERSC and Microsoft teamed up to create a novel semi-supervised convolutional DNN architecture that was able to do the work of all three individual supervised DNNs at the same time. Essentially, this novel neural network finds the bounding box size and location when it classifies the atmospheric phenomena. Further, the neural network also associates a probability with the classification.
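The network's output, as described above, pairs a class label and a probability with a bounding box. One common way to score how closely a predicted box matches a ground-truth box is intersection over union (IoU). The article does not say which metric the researchers used, so the Python sketch below is only an illustration under that assumption; the `Box` and `Detection` types, the coordinate convention, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in grid coordinates (hypothetical convention)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp at zero so an inverted (empty) box has no area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


@dataclass(frozen=True)
class Detection:
    """One prediction: class label, probability, and bounding box."""
    label: str
    probability: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    overlap = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    intersection = overlap.area()
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    # Made-up values for illustration, not results from the study.
    ground_truth = Box(0, 0, 2, 2)
    predicted = Detection("tropical cyclone", 0.97, Box(1, 1, 3, 3))
    print(f"{predicted.label}: p={predicted.probability:.2f}, "
          f"IoU={iou(ground_truth, predicted.box):.3f}")
```

An IoU of 1 means the predicted and ground-truth boxes coincide exactly, while 0 means they do not overlap at all.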

Figure 3: Intuition showing that conventional DNNs could potentially recognize atmospheric phenomena.

Powerful leadership-class supercomputers like the Cori supercomputer have made fast, accurate global climate simulations possible. Innovations such as the petascale-capable hybrid machine learning technique pioneered by Intel, NERSC and Stanford mean those same machines can also train DNNs to evaluate the tens to hundreds of terabytes of data created by these faster and more accurate climate simulations.

This article originally appeared on HPCWire on March 19, 2018. Reprinted with permission.

About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.