NERSC: Powering Scientific Discovery Since 1974

Are Earths Rare? Perhaps Not

Developed at NERSC, a Pipeline for Finding Earth-like Planets in the Milky Way

January 13, 2014

Contact: Linda Vu, +1 510 495 2402


Artist’s representation of the “habitable zone,” the range of orbits where liquid water can exist on a planet’s surface. The authors find that 22% of Sun-like stars harbor a planet between one and two times the size of Earth in the habitable zone.

One out of every five sun-like stars in our Milky Way galaxy has an Earth-sized planet orbiting it in the Goldilocks zone—not too hot, not too cold—where surface temperatures should be compatible with liquid water, according to a statistical analysis of data from NASA’s Kepler spacecraft by Erik Petigura, a graduate student at the University of California, Berkeley (UC Berkeley).

Petigura and his colleague Andrew Howard, now at the University of Hawaii, Manoa, spent three years developing a transit search pipeline called TERRA that is optimized for finding small planets. When they used this tool on supercomputers at the Department of Energy’s (DOE’s) National Energy Research Scientific Computing Center (NERSC) to analyze nearly four years of Kepler observations, the scientists determined that our galaxy could contain as many as 40 billion habitable Earth-sized planets.

“NERSC resources were crucial to getting this result. It just would not have been possible to do this work without supercomputers,” says Petigura. “Because our software removes systematic noise in an efficient and robust fashion, it also has the potential to perform very high quality analysis of future wide-field digital data missions like The Dark Energy Survey and Large Synoptic Survey Telescope, which DOE is heavily invested in.”

“Erik’s pipeline will be very useful in helping us determine our detection efficiency—or how well our supernovae hunting pipelines work—especially because our Palomar Transient Factory survey observed the same Kepler fields,” says Peter Nugent, who co-leads the Computational Cosmology Center at the Lawrence Berkeley National Laboratory.

Petigura, Howard and Geoffrey Marcy, UC Berkeley professor of astronomy, published their findings in the Proceedings of the National Academy of Sciences.

A Census of Extrasolar Planets with Supercomputers

In 2009, NASA launched the Kepler Space Telescope to search for “transit planets”—planets that cross in front of their host stars—outside of our solar system. To track this movement, the telescope took brightness measurements or “photometry” of the same 150,000 stars, every 30 minutes, for nearly four years. The systems with planets show periodic dimming as the planets orbit their host star. 

According to Petigura, researchers can learn a lot about a planet from these observations. The interval between dimming events reveals the planet’s orbital period. Armed with this information, researchers use Kepler’s Third Law of planetary motion, which relates orbital period to orbital distance, to determine how far the planet sits from its host star and, from that distance, how warm it is likely to be. Astronomers can also infer planet size from the depth of the dimming, because larger planets block more of the star’s light.
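The relationships described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the TERRA pipeline: the function names, the solar-unit form of Kepler’s Third Law, and the radius constant are assumptions introduced here, and the flux estimate ignores atmospheres entirely.

```python
import math

# Approximate ratio of the Sun's radius to Earth's radius (assumed constant).
R_SUN_IN_EARTH_RADII = 109.2


def semi_major_axis_au(period_days, stellar_mass_msun=1.0):
    """Kepler's Third Law in solar units: a^3 = M * P^2 (a in AU, P in years)."""
    period_years = period_days / 365.25
    return (stellar_mass_msun * period_years ** 2) ** (1.0 / 3.0)


def relative_stellar_flux(a_au, stellar_luminosity_lsun=1.0):
    """Starlight received by the planet relative to Earth: L / a^2."""
    return stellar_luminosity_lsun / a_au ** 2


def planet_radius_rearth(transit_depth, stellar_radius_rsun=1.0):
    """Fractional dimming is roughly (R_planet / R_star)^2, so invert it."""
    return stellar_radius_rsun * R_SUN_IN_EARTH_RADII * math.sqrt(transit_depth)


# An Earth twin around a Sun twin: one-year period, dimming of about 84 parts per million.
a = semi_major_axis_au(365.25)
print(a, relative_stellar_flux(a), planet_radius_rearth(8.4e-5))
```

The tiny depth in the last line, less than one hundredth of one percent, is why Earth-sized planets are so much harder to find than giant planets.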

To make the search for Earth-like planets a little easier, Petigura and his colleagues spent three years optimizing computer algorithms and developing the TERRA pipeline to remove systematic errors from Kepler data, search for small planets and verify that the findings are in fact planets and not false positives.

“In a string of data that is four years long, there are many ways to hide a small planet,” says Petigura. “Because we look at brightness as a function of time, we need to wade through four years of photometry to get information about a planet’s orbital period, time of transit and transit duration. These events materialize in our data a number of ways, and to do a thorough search we need to evaluate about 10 billion different combinations for each individual star.”
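To give a feel for what evaluating combinations of period, transit time and duration means, here is a toy brute-force search over a synthetic light curve. It is a simplified sketch under stated assumptions and is not TERRA’s algorithm: the box-shaped transit model, the scoring rule, the coarse grids and every function name are hypothetical choices made for illustration.

```python
import math
import random


def in_transit(t, period, epoch, duration):
    """True if time t falls within half a duration of a transit midpoint."""
    phase = (t - epoch + 0.5 * period) % period - 0.5 * period
    return abs(phase) < 0.5 * duration


def make_light_curve(period, epoch, duration, depth,
                     n_days=20, points_per_day=48, noise=2e-4, seed=0):
    """Synthetic 30-minute-cadence photometry with box-shaped transits."""
    rng = random.Random(seed)
    times = [i / points_per_day for i in range(n_days * points_per_day)]
    fluxes = [1.0 - (depth if in_transit(t, period, epoch, duration) else 0.0)
              + rng.gauss(0.0, noise) for t in times]
    return times, fluxes


def box_score(times, fluxes, period, epoch, duration):
    """Mean in-transit dimming scaled by sqrt(number of in-transit points)."""
    inside = [f for t, f in zip(times, fluxes)
              if in_transit(t, period, epoch, duration)]
    if not inside or len(inside) == len(fluxes):
        return 0.0
    in_mean = sum(inside) / len(inside)
    out_mean = (sum(fluxes) - sum(inside)) / (len(fluxes) - len(inside))
    return (out_mean - in_mean) * math.sqrt(len(inside))


def grid_search(times, fluxes, periods, durations, epoch_step=0.1):
    """Try every (period, epoch, duration) combination and keep the best score."""
    best_score, best_params = 0.0, None
    for period in periods:
        for k in range(int(period / epoch_step)):
            epoch = k * epoch_step
            for duration in durations:
                score = box_score(times, fluxes, period, epoch, duration)
                if score > best_score:
                    best_score, best_params = score, (period, epoch, duration)
    return best_params


# Hypothetical planet: 4-day period, first transit at day 1.3, 0.1% dimming.
times, fluxes = make_light_curve(period=4.0, epoch=1.3, duration=0.2, depth=1e-3)
periods = [2.0 + 0.5 * k for k in range(9)]  # 2.0 to 6.0 days
best_period, best_epoch, best_duration = grid_search(
    times, fluxes, periods, durations=[0.1, 0.2, 0.3])
print(best_period, best_epoch, best_duration)
```

Even this coarse grid tests about a thousand combinations on 20 days of data. Finer grids over four years of photometry are what push the count toward the billions Petigura describes.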

He notes that to run a complete analysis of a single star on a single processor would take about 30 minutes. But with access to NERSC’s Carver system, the researchers were able to significantly speed up this process. They parallelized their code—simultaneously running stellar analyses on hundreds of processors—and managed to analyze thousands of stars within an hour or two. By the end of their survey, the team analyzed about 100,000 stars with the pipeline.
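A back-of-the-envelope calculation shows why parallel processing mattered. The sketch below assumes each star is analyzed independently on one core, that every star takes the quoted 30 minutes, and that the job used the roughly 300 Carver cores Petigura mentions later in this article; the function name is hypothetical, and scheduling overhead is ignored.

```python
import math


def wall_clock_hours(n_stars, minutes_per_star, n_cores):
    """Elapsed time when stars are processed in rounds of n_cores at a time."""
    rounds = math.ceil(n_stars / n_cores)
    return rounds * minutes_per_star / 60.0


print(wall_clock_hours(1_000, 30, 300))    # a batch of 1,000 stars
print(wall_clock_hours(100_000, 30, 300))  # the full survey of about 100,000 stars
print(wall_clock_hours(100_000, 30, 1))    # the same survey on a single processor
```

Under these assumptions a batch of 1,000 stars finishes in about 2 hours and the full survey in about 167 hours, or roughly a week. A single processor would need 50,000 hours, which is more than five years.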

“This speed up was huge for us,” says Petigura. “It meant that we could make changes to our code, optimize the pipeline, and get real-time feedback to see if those changes were hurting or helping us.”

The team also subjected Petigura’s planet-finding algorithms to a battery of tests to see how many Earth-sized planets in the habitable zone the pipeline missed. This step distinguishes their analysis from previous analyses of Kepler data.

“We are essentially taking a census of extrasolar planets. If you are counting up everyone in the USA, you need to correct for the people who don’t answer the door, who are on vacation, or the ones you may have counted twice,” says Petigura. “In our work, correcting or debiasing the survey was critical for understanding the underlying distribution of planets.”

There are two main reasons why the pipeline may overlook a planet, he notes. If the orbital plane is tilted, the transit may not be detectable from Earth. Or, the planet’s transit may not be detectable above stellar noise in the data. To correct for these omissions, the team ran simulations to determine the dimming profiles of planets in these scenarios and then injected this data into the same analysis procedure to see how many planets the pipeline recovered.
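The injection-and-recovery idea can be illustrated with a toy experiment: add a synthetic transit of known depth to simulated noise, run a detector, and count how often the signal is recovered. The sketch below is an assumption-laden simplification, not the team’s procedure. It uses Gaussian noise, a box-shaped transit and a hypothetical signal-to-noise threshold, and it measures the signal at the injected transit times rather than rerunning a blind search.

```python
import math
import random


def completeness(depth, n_trials=200, n_points=1000, period_pts=100,
                 duration_pts=10, noise=2e-4, snr_threshold=8.0, seed=1):
    """Fraction of injected transits recovered above the detection threshold."""
    rng = random.Random(seed)
    # Mark which samples fall inside the injected, box-shaped transits.
    inside = [i % period_pts < duration_pts for i in range(n_points)]
    n_in = sum(inside)
    recovered = 0
    for _ in range(n_trials):
        in_sum = out_sum = 0.0
        for is_in in inside:
            flux = 1.0 + rng.gauss(0.0, noise)  # noise-only light curve
            if is_in:
                in_sum += flux - depth          # inject the synthetic transit
            else:
                out_sum += flux
        measured_depth = out_sum / (n_points - n_in) - in_sum / n_in
        snr = measured_depth * math.sqrt(n_in) / noise
        if snr > snr_threshold:
            recovered += 1
    return recovered / n_trials


for depth in (2e-5, 1.6e-4, 1e-3):
    print(depth, completeness(depth))
```

Deep transits are recovered essentially every time, very shallow ones almost never, and those near the threshold only some of the time. That recovered fraction is the correction factor a census needs.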

“It’s kind of like when you are studying for a math test, you do a bunch of sample problems and then compare your answers to the ones in the back of the book to see how well you understand the subject,” says Petigura. 

Accounting for missed planets, as well as the fact that only a small fraction of planets are oriented so that they cross in front of their host star as seen from Earth, allowed the team to estimate that 22 percent of all sun-like stars in the galaxy have Earth-size planets in their habitable zones.
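One common way to turn a list of detections into an occurrence rate is to weight each detected planet by the number of similar planets it stands in for. The sketch below assumes circular orbits, a geometric transit probability of roughly the stellar radius divided by the orbital distance, and a per-planet completeness taken from injection tests. The function names and the input numbers are hypothetical and are not values from the paper.

```python
# Approximate solar radius in astronomical units (assumed constant).
R_SUN_AU = 0.00465


def transit_probability(a_au, stellar_radius_rsun=1.0):
    """Chance that a randomly oriented circular orbit transits: about R_star / a."""
    return min(1.0, stellar_radius_rsun * R_SUN_AU / a_au)


def occurrence_rate(detections, n_stars):
    """Planets per star, weighting each detection by 1 / (p_transit * completeness).

    detections: list of (orbital distance in AU, pipeline completeness) pairs.
    """
    total = sum(1.0 / (transit_probability(a) * c) for a, c in detections)
    return total / n_stars


# Hypothetical survey: 2 detections at 1 AU, each with 50% completeness,
# among 10,000 searched stars.
print(transit_probability(1.0))
print(occurrence_rate([(1.0, 0.5), (1.0, 0.5)], n_stars=10_000))
```

Because fewer than one in 200 Earth-like orbits around a Sun-like star is aligned to transit, each detection at that distance represents hundreds of planets that the telescope could never have seen.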

“If the stars in the Kepler field are representative of stars in the solar neighborhood, … then the nearest (Earth-size) planet is expected to orbit a star that is less than 12 light-years from Earth and can be seen by the unaided eye,” the researchers wrote in their paper. “Future instrumentation to image and take spectra of these Earths need only observe a few dozen nearby stars to detect a sample of Earth-size planets residing in the habitable zones of their host stars.”

“I’m really grateful for the time NERSC gave me to do this work. We used about 300 cores on Carver, which wasn’t a very large run in terms of CPU time and resources, but it allowed us to do a lot more than we could achieve on our machine,” says Petigura.

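In August 2013, NASA ended attempts to restore the Kepler Space Telescope to full operation after two of its reaction wheels failed, bringing its original planet-hunting survey to a close.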

More Information:
Astronomers answer key question: How common are habitable planets? (UC Berkeley Press Release)

About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, the NERSC Center serves more than 7,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. DOE Office of Science.