Berkeley Lab Physicists Apply Machine Learning to the Universe’s Mysteries

January 30, 2018

By Glenn Roberts Jr.
Contact: cscomms@lbl.gov

Computers can beat chess champions, simulate star explosions, and forecast global climate. We are even teaching them to be infallible problem-solvers and fast learners.

And now, physicists at the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab) and their collaborators have demonstrated that computers are ready to tackle the universe’s greatest mysteries. The team fed thousands of images from simulated high-energy particle collisions to train computer networks to identify important features.

The researchers programmed powerful arrays known as neural networks to serve as a sort of hivelike digital brain in analyzing and interpreting the images of the simulated particle debris left over from the collisions. During this test run the researchers found that the neural networks had up to a 95 percent success rate in recognizing important features in a sampling of about 18,000 images.

The study was published Jan. 15 in the journal Nature Communications.

The next step will be to apply the same machine learning process to actual experimental data.

Powerful machine learning algorithms allow these networks to improve in their analysis as they process more images. The underlying technology is used in facial recognition and other types of image-based object recognition applications.

Recreating Subatomic Particle 'Hot Soup'

The images used in this study – relevant to particle-collider nuclear physics experiments at Brookhaven National Laboratory’s Relativistic Heavy Ion Collider and CERN’s Large Hadron Collider – recreate the conditions of a subatomic particle “soup,” which is a superhot fluid state known as the quark-gluon plasma believed to exist just millionths of a second after the birth of the universe. Berkeley Lab physicists participate in experiments at both of these sites.

“We are trying to learn about the most important properties of the quark-gluon plasma,” said Xin-Nian Wang, a nuclear physicist in the Nuclear Science Division at Berkeley Lab who is a member of the team. Some of these properties are so short-lived and occur at such tiny scales that they remain shrouded in mystery.

In experiments, nuclear physicists use particle colliders to smash together heavy nuclei, like gold or lead atoms that are stripped of electrons. These collisions are believed to liberate particles inside the atoms’ nuclei, forming a fleeting, subatomic-scale fireball that breaks down even protons and neutrons into a free-floating form of their typically bound-up building blocks: quarks and gluons.

Researchers hope that by learning the precise conditions under which this quark-gluon plasma forms, such as how much energy is packed in, and its temperature and pressure as it transitions into a fluid state, they will gain new insights about its component particles of matter and their properties, and about the universe’s formative stages.

But exacting measurements of these properties – the so-called “equation of state” involved as matter changes from one phase to another in these collisions – have proven challenging. The initial conditions in the experiments can influence the outcome, so it is difficult to extract equation-of-state measurements that are independent of these conditions.

“In the nuclear physics community, the holy grail is to see phase transitions in these high-energy interactions, and then determine the equation of state from the experimental data,” Wang said. “This is the most important property of the quark-gluon plasma we have yet to learn from experiments.”

Researchers also seek insight about the fundamental forces that govern the interactions between quarks and gluons, what physicists refer to as quantum chromodynamics.

Deep Convolutional Neural Networks

Long-Gang Pang, the lead author of the latest study and a Berkeley Lab-affiliated postdoctoral researcher at UC Berkeley, said that in 2016, while he was a postdoctoral fellow at the Frankfurt Institute for Advanced Studies, he became interested in the potential for artificial intelligence (AI) to help solve challenging science problems.

He saw that one form of AI, known as a deep convolutional neural network – with architecture inspired by the image-handling processes in animal brains – appeared to be a good fit for analyzing science-related images.

“These networks can recognize patterns and evaluate board positions and selected movements in the game of Go,” Pang said. “We thought, ‘If we have some visual scientific data, maybe we can get an abstract concept or valuable physical information from this.’”

Wang added, “With this type of machine learning, we are trying to identify a certain pattern or correlation of patterns that is a unique signature of the equation of state.” So after training, the network can pinpoint on its own the portions of and correlations in an image, if any exist, that are most relevant to the problem scientists are trying to solve.
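As a rough illustration of how a convolutional layer picks out a local pattern in an image, the sketch below slides a small filter over a toy image and keeps only the positive responses. It is not the study's network: the 6x6 image, the hand-picked edge filter, and the helper names `conv2d_valid` and `relu` are all assumptions made for this example. In a trained network the filter values are learned from data rather than chosen by hand.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Like most deep learning frameworks, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Keep positive responses and zero out the rest."""
    return np.maximum(x, 0.0)


# Toy 6x6 "image" (hypothetical data): dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked 3x3 filter that responds to dark-to-bright vertical edges.
edge_kernel = np.array([[-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0]])

feature_map = relu(conv2d_valid(image, edge_kernel))
print(feature_map)
```

The feature map is nonzero only where the filter overlaps the edge, which is the sense in which a convolutional layer localizes the part of an image that matches a pattern. Deep networks stack many such layers, so later layers respond to combinations of the simpler patterns found by earlier ones.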

Accumulation of data needed for the analysis can be very computationally intensive, Pang said, and in some cases it took about a full day of computing time to create just one image. When researchers employed an array of GPUs that work in parallel – GPUs are graphics processing units that were first created to enhance video game effects and have since exploded into a variety of uses – they cut that time down to about 20 minutes per image.

They used computing resources at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC) in their study, with most of the computing work carried out on GPU clusters at GSI in Germany and Central China Normal University in China. The team used three kinds of calculations, according to Pang. In the first, they used a GPU cluster in Germany to run relativistic hydrodynamic simulations of heavy-ion collisions. In the second, they used TensorFlow/Keras on two Nvidia GPUs at Central China Normal University to train the convolutional neural network and fully connected neural networks.

“We did the third part calculations at NERSC, using Jupyter Notebook and scikit-learn on Cori to do classifications with many traditional machine learning tools, such as SVM/Bayes/DecisionTrees/RandomForests/GradientBoostingTrees,” Pang said. “The calculations at NERSC were meant to be compared with the deep learning method.”
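A baseline comparison of the kind Pang describes might look roughly like the following sketch, which fits the five families of scikit-learn classifiers he names and reports each one's accuracy on held-out data. The synthetic two-class dataset from `make_classification`, the sample and feature counts, and the mostly default hyperparameters are assumptions made for illustration. The study's actual inputs and settings are not given here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption): 600 samples, 20 features, 2 classes.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# One representative of each classifier family named in the quote.
classifiers = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
accuracy = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    accuracy[name] = clf.score(X_test, y_test)

# Print the models from most to least accurate.
for name, acc in sorted(accuracy.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:18s} {acc:.3f}")
```

Scoring every model on the same held-out split puts them on equal footing, and a deep learning model evaluated on that split could be compared against them in the same way.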

A benefit of using sophisticated neural networks, the researchers noted, is that they can identify features that weren’t even sought in the initial experiment, like finding a needle in a haystack when you weren’t even looking for it. And they can extract useful details even from fuzzy images.

“Even if you have low resolution, you can still get some important information,” Pang said.

Discussions are already underway to apply the machine learning tools to data from actual heavy-ion collision experiments, and the simulated results should be helpful in training neural networks to interpret the real data.

“There will be many applications for this in high-energy particle physics,” Wang said, beyond particle-collider experiments.

Also participating in the study were Kai Zhou, Nan Su, Hannah Petersen, and Horst Stocker from the following institutions: Frankfurt Institute for Advanced Studies, Goethe University, GSI Helmholtzzentrum für Schwerionenforschung (GSI), and Central China Normal University. The work was supported by the U.S. Department of Energy’s Office of Science, the National Science Foundation, the Helmholtz Association, GSI, SAMSON AG, Goethe University, the National Natural Science Foundation of China, the Major State Basic Research Development Program in China, and the Helmholtz International Center for the Facility for Antiproton and Ion Research.

NERSC is a DOE Office of Science user facility.

About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.