Two Peas in a Pod: How NERSC & ESnet Grew Up Together

Two of the Department of Energy’s most scientifically productive user facilities share a 'super' history

May 15, 2014

By Jon Bashor


Al Trivelpiece

Over the last four decades, two of the Department of Energy’s (DOE) most scientifically productive user facilities -- NERSC and ESnet -- have matured from siblings who spent their early years together into self-standing entities and internationally recognized leaders in their respective fields. And in early 2015, they’ll move back in together and pool their talents to further scientific discovery.

The 40-year story of the two organizations resembles that of siblings growing up together. Often, when one made an improvement, the other followed suit. And in a way, both were sired by the same forward-looking person, Alvin Trivelpiece.

In 1973, Trivelpiece, then head of the Controlled Thermonuclear Research program at the Atomic Energy Commission, solicited proposals for a computing center to advance fusion energy research. Twelve years later, as head of DOE’s Office of Energy Research, Trivelpiece recommended that DOE’s two networks supporting specific research programs be combined into a larger network to support a wider range of scientific programs.

Here’s a look at some of the complementary milestones NERSC and ESnet have achieved over the years.

The Early Years

The older of the two facilities, the National Energy Research Scientific Computing Center, better known as NERSC, first went into service in July 1974. Born as the Controlled Thermonuclear Research Computing Center (CTRCC), the facility began computing with a hand-me-down: a Control Data Corp. (CDC) 6600 computer (serial number 1), the first computer designed by Seymour Cray. The machine, no longer needed by the classified computing program at Lawrence Livermore National Laboratory (LLNL), was accessed via dial-up phone connections. The CTRCC was established to provide fusion energy researchers with computing resources similar to those used by the nation’s nuclear weapons complex.

The CDC 7600, featured on the cover of Energy and Technology Review in 1975.

Within a year, the new center had installed a brand-new supercomputer, a CDC 7600. Fusion scientists at Oak Ridge National Laboratory, Los Alamos Scientific Laboratory, Princeton Plasma Physics Laboratory and General Atomics all had PDP-10 computers for local computing and for communicating with the Livermore machine. Access was via 50-kilobit phone lines controlled by a PDP-11 but routed through a PDP-8A, as no CDC 7600-to-PDP-10 interface was available and one would have taken up to seven months to develop.

In January 1976, CTRCC users got their first online access via four operator terminals at 10 characters per second, 20 dial-in ports at 30 characters per second and eight ports at 120 characters per second. This setup remained in place until September 1976, when a 50 kilobits-per-second network was developed to handle the traffic.

In late 1976, the CTRCC was renamed the National Magnetic Fusion Energy Computer Center (NMFECC) and the network became known as MFEnet, shorthand for the Magnetic Fusion Energy Network. (Read an early report on the origins of MFEnet.)

Over the next few years, the number of computing systems increased, with a Cray-1 supercomputer installed in 1978 and a new mass storage system for data implemented. In 1980, the staff moved into a new facility on the Lawrence Livermore campus. The center was now serving more than 1,300 users, with about one-fourth of them at universities and the remainder at national laboratories and other research institutions.

The 1980s

But the increased number of users took a toll on the landlines used to access the NMFECC, and reliability suffered. In 1980, the decision was made to solicit bids for satellite links between LLNL and 23 remote user sites. The contract with American Satellite Corp. promised the elusive trifecta of service: faster, better and cheaper. When the link with Princeton went live in April 1981, data transfer speed increased from 50 Kbps to 112 Kbps, and the monthly cost dropped from $27,000 to $14,000. At the same time, reliability went up. (Read details in this article about the Princeton link.)

In 1983, the DOE’s Office of Energy Research asked the NMFECC to expand its services to include researchers in other fields. This resulted in 13 additional sites joining the network. In June of that year, 5 percent of the computing resources were immediately allocated, with requests exceeding available time by an order of magnitude. By the end of 1983, the center was serving 2,400 users.

A year later, everything was bigger. A new Cray X-MP computer was installed to accommodate the 3,500 users at 28 sites. And looking even farther afield, MFEnet staff met with their counterparts at computing facilities in Japan, England and West Germany, with an eye toward establishing international connections. In 1985, MFEnet established direct data transfer links with the Japanese Plasma Physics Institute in Nagoya.

Networking’s increasing role in supporting research also led DOE’s Office of Energy Research to determine in 1984 that enhanced networking facilities were needed to improve access to supercomputing facilities and laboratories. It was recommended that MFEnet be combined with HEPnet, a similar network supporting High Energy Physics research.

In 1986, a formal proposal for creating the Energy Sciences Network was approved, and ESnet was born. Responsibility for operating the new network was assigned to NMFECC. A five-ring logo was developed to reflect the five research offices supported by the new network.

And just as the network organization was broadening its mission, NMFECC was doing the same. By 1987, nearly a third of the computing resources were going to projects other than fusion research: Basic Energy Sciences accounted for nearly 15 percent of the total allocations, health and environment 14 percent, high energy and nuclear physics 12 percent and applied math 2.5 percent.

In January 1988, ESnet began providing networking services. And in August, NMFECC added yet another supercomputer, a Cray-2. A snapshot of the user community that year shows that 65 percent of the center’s users were from government laboratories, 30 percent from universities and 5 percent from other laboratories.

The Cray-2 was nicknamed "Bubbles" for its unique cooling mechanism.

A year later, ESnet deployed commercially supplied multiprotocol routers connected via T1 lines, which provided speeds of 1.5 Mbps -- just as Tim Berners-Lee invented the World Wide Web and networking was about to hit the big time. When the WWW debuted in 1990, there were 300,000 users, and the term “Internet” began to enter the popular lexicon. ARPANET, which helped start it all, shut down. Networking began to really take off in 1991, when the commercial restriction on Internet use was lifted.

The 1990s

Meanwhile, in 1990, an eight-processor Cray-2 -- the only one of its kind -- was delivered to NMFECC. Even with the additional hardware, which had a peak speed of 4 gigaflops, demand for cycles was four times what the center could provide. For a short time, NMFECC was home to the first and last Cray-2 machines ever built.

The Cray C90.

One day after the eight-processor Cray-2 arrived, the center was renamed the National Energy Research Supercomputer Center to reflect its broadened mission. Speakers at the April 3 event commemorating the center’s new identity included James Decker, acting head of DOE’s Office of Energy Research (since renamed the Office of Science).

By 1990, MFEnet was winding down, though it still linked NERSC computers with smaller DEC VAX and DEC PDP computers at 30 sites. That year, MFEnet was phased out and replaced completely by ESnet, whose new T1 links were 25 times faster than MFEnet’s 56 Kbps lines.

Reflecting the increasingly collaborative nature of research, ESnet went international in the early 1990s: satellite links connected Japan and Germany to ESnet, fiber links to CERN and the U.K. went into service, and video collaborations were introduced.

By the end of 1992, NERSC was home to a Cray X-MP, a Cray Y-MP, two Cray-2 machines and a Cray C90, as well as six data storage silos. In 1994, a 128-processor Cray T3D arrived at NERSC as part of the High Performance Parallel Processing project. The machine was used in a lab-industry partnership to advance the development of parallel codes. The T3D was upgraded to 256 processors in December.

Map of ESnet in 1994.

Building on early work by ESnet staff, NERSC made its debut on the World Wide Web in 1994 with the promise “we have made and will continue to make tremendous use of this exciting medium.” That same year, ESnet put T3 lines into production on its backbone, bringing peak data speed up to 45 Mbps. ESnet also became the first major network to convert to the new Asynchronous Transfer Mode technology. At the same time, the National Science Foundation shut down the network it had created to serve its HPC centers.

In April 1995, DOE asked LLNL and Berkeley Lab to submit proposals on the future operation of NERSC, of which ESnet was a part. In November, DOE announced its decision to relocate NERSC to Berkeley Lab. Joining in the migration to Berkeley were ESnet and the Center for Computational Sciences and Engineering. Once the transplanted organizations arrived at Berkeley Lab in 1996, ESnet became a standalone department in the newly created Computing Sciences directorate, while NERSC became a Berkeley Lab scientific division.

Also in 1996, in keeping with the nature of the research it supports, ESnet joined the Internet Protocol Version 6 (IPv6) backbone research effort.

Through a series of upgrades, ESnet began providing OC12 service (622 Mbps) in 1998, moved to OC48 service at 2.5 gigabits per second (billions of bits per second, or Gbps) in 2001, and offered 10 Gbps in the highest-speed portion of the network by 2003.

In a 1998 demonstration of the feasibility of “Differentiated Services” between Berkeley Lab and Argonne National Laboratory, ESnet sent two video streams over the Internet. The priority-marked stream moved at eight frames per second, while the standard version was transmitted at just one frame per second.
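Differentiated Services works by marking each packet’s IP header with a DSCP code point that routers along the path can use to prioritize traffic. As a rough modern-day sketch (not how the 1998 demo was implemented), an application can request such a marking through the standard IP_TOS socket option; DSCP 46, “Expedited Forwarding,” is a common priority class:

```python
import socket

# DSCP 46 ("Expedited Forwarding") is a standard priority marking.
# The DSCP value occupies the upper six bits of the IP TOS byte,
# so it is shifted left by two before being set.
EF_DSCP = 46

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, EF_DSCP << 2)

# Read the marking back to confirm the kernel recorded it.
tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
print(hex(tos))  # 0xb8
sock.close()
```

Setting the mark only expresses a request; whether routers actually honor it depends on network policy, which is what the ESnet demo was testing.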

"Seaborg," an IBM SP-3 installed at NERSC in 1999.

Speaking of high performance, NERSC merged its two Cray T3E machines into a single 696-processor supercomputer in 1998. Using the system, named MCurie, and then a larger T3E still on the factory floor in Wisconsin, researchers at NERSC, Oak Ridge National Laboratory, Ames Laboratory, Pittsburgh Supercomputing Center and University of Bristol shared the 1998 Gordon Bell Prize for the fastest application – the first sustained teraflop/s performance of a scientific application.

Also in 1998, NERSC provided data analysis and simulations for the Supernova Cosmology Project, which concluded that the universe is continuing to expand at an accelerating rate; this project was named the top scientific breakthrough of the year by Science magazine. The melding of computational science and cosmology sowed the seeds for more projects, establishing Berkeley Lab and NERSC as centers for this new field.

In 1999, NERSC accepted its new IBM system SP-3, dubbed "Seaborg," in honor of Berkeley Lab Nobel Laureate Glenn Seaborg. The first phase was installed at the main Berkeley Lab site, where it remained in service while other systems were moved to the new facility in Oakland.

The New Millennium

In 2001, NERSC opened its new facility, the Oakland Scientific Facility, in downtown Oakland. It featured a larger machine room that was needed to accommodate the growing footprint of massively parallel systems. An expanded version of Seaborg, with 3,328 processors, went online with a peak performance of 5 teraflop/s, making it the world’s most powerful supercomputer for unclassified research. In 2003, Seaborg was doubled in size to 6,656 processors, giving it a peak performance of 10 teraflop/s.

ESnet map, 2005.

In 2005, ESnet rolled out its first MANs (metropolitan area networks), designed and built in the San Francisco Bay Area, the Chicago area (in partnership with Argonne National Lab and Fermilab), and on Long Island. All of the MANs were designed to provide dual connectivity to the sites at 20 to 30 Gbps -- 10 to 50 times the existing site bandwidths, depending on the site. The dual links provided backup connectivity should one fail.

ESnet and Internet2 -- two of the nation’s leading networking organizations dedicated to research -- announced a partnership in 2006 to deploy ESnet4, initially operating on two dedicated 10 gigabit-per-second (Gbps) wavelengths on the new Internet2 nationwide infrastructure.

In 2007, NERSC’s newest machine, a 20,000-core Cray XT4 system, entered production. The supercomputer was named “Franklin” in honor of the first internationally recognized American scientist, Benjamin Franklin.

During this same time period, ESnet completed hardware installations for the nation’s first dynamic circuit network dedicated solely to scientific research, called the Science Data Network (SDN). This new network, deployed in 2008, comprised multiple 10-gigabit optical circuits. That same year, monthly Internet users topped 1 billion (or 1.5 billion, depending on which web site you read).

NERSC's High Performance Storage System can hold 59.9 petabytes of scientific data.

In 2009, ESnet rolled out its On-Demand Secure Circuit and Advance Reservation System (OSCARS) enabling network engineers and users to provision point-to-point dynamic circuits when and where they need them. Such reservable, dedicated bandwidth is useful for moving data in and out of NERSC’s High Performance Storage System (HPSS), which can now hold 59.9 petabytes of scientific data — equivalent to all the music, videos or photos that could be stored on approximately 523,414 iPod classics filled to capacity.

That same year, Berkeley Lab received $62 million to build the Advanced Networking Initiative (ANI), a 100 Gbps prototype network linking DOE supercomputing centers at Lawrence Berkeley, Argonne and Oak Ridge national labs.

2010 and Beyond

With the acceptance of the Cray XE6 “Hopper” system in 2011, NERSC put its first petaflop/s machine into production. Named in honor of American computer scientist Grace Murray Hopper, Hopper was the world’s fifth most powerful computer at its debut.

The Cray XE6 Hopper system.

Energy Secretary Steven Chu, along with Berkeley Lab and UC leaders, broke ground on Berkeley Lab’s Computational Research and Theory (CRT) facility in 2012. When it opens in 2015, the CRT will be at the forefront of high-performance supercomputing research and will be DOE’s most efficient facility of its kind.

In November 2012, ESnet launched the world's fastest science network, serving the entire national laboratory system, its supercomputing centers and its major scientific instruments at 100 gigabits per second -- 10 times faster than its previous-generation network. That same month, ESnet, Infinera and Brocade conducted a successful demonstration of Software-Defined Networking.

In 2013, all network traffic flowing into and out of NERSC -- everything from email to massive scientific datasets -- began moving at 100 Gbps over a connection to ESnet’s 100 Gbps backbone.
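To put that bandwidth in perspective, a back-of-envelope calculation (illustrative figures, not from the article) shows what the tenfold jump means for moving a large scientific dataset, assuming the full line rate is achieved end to end:

```python
def transfer_hours(dataset_bytes: float, rate_bps: float) -> float:
    """Idealized transfer time in hours: bytes -> bits, divided by line rate."""
    return dataset_bytes * 8 / rate_bps / 3600

DATASET = 100e12  # a hypothetical 100-terabyte dataset

for gbps in (10, 100):
    hours = transfer_hours(DATASET, gbps * 1e9)
    print(f"{gbps:>3} Gbps: {hours:5.1f} hours")
# ->  10 Gbps:  22.2 hours
# -> 100 Gbps:   2.2 hours
```

Real transfers rarely sustain the full line rate, so these figures are a best-case bound rather than a prediction.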

At a conference in Amsterdam in June 2013, ESnet joined in the first intercontinental demonstration of moving data across a 100 Gbps trans-Atlantic link to Manhattan.

In December 2013, NERSC accepted its newest flagship system, “Edison,” a Cray XC30 supercomputer with a peak performance of 2.5 petaflop/s.

In early 2014, ESnet partnered with four other networks in a sustained test of a 100 Gbps trans-Atlantic link, using up to 99.9 percent of the leased bandwidth. The test was seen as a step toward replacing the 15 separate 10 Gbps links ESnet currently uses to move data back and forth between Europe and North America.

About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy. Learn more about computing sciences at Berkeley Lab.