NERSC Now Connecting All Science at 100Gbps

October 17, 2013

All network traffic flowing in and out of the Department of Energy’s National Energy Research Scientific Computing Center (NERSC) is now moving at 100 Gigabits per second (Gbps), and this includes everything from email to massive scientific datasets. At this speed, 1.8 million people could simultaneously download an eBook in about two minutes.
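The eBook comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the link is fully used for the whole two minutes, ignores protocol overhead, and uses decimal units (1 MB = 1,000,000 bytes). The function name is hypothetical and not from NERSC.

```python
# Back-of-the-envelope check of the eBook comparison.
# Assumptions (not from the article): ideal link utilization,
# no protocol overhead, decimal megabytes.

LINK_RATE_BPS = 100e9       # 100 Gbps, as stated in the article
READERS = 1_800_000         # simultaneous downloads, as stated
DOWNLOAD_SECONDS = 120      # "about two minutes"


def per_user_megabytes(link_bps: float, users: int, seconds: float) -> float:
    """Return how many megabytes each user receives if the link is shared evenly."""
    total_bytes = link_bps * seconds / 8   # bits to bytes
    return total_bytes / users / 1e6       # bytes per user, in MB


if __name__ == "__main__":
    ebook_mb = per_user_megabytes(LINK_RATE_BPS, READERS, DOWNLOAD_SECONDS)
    print(f"Implied eBook size: {ebook_mb:.2f} MB")
```

Under these assumptions each reader receives a little under 1 MB, which is a plausible size for a text-only eBook.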

The impetus for this move was an increasing demand for bandwidth from scientific users of the facility. It was made possible by an innovative redesign of NERSC’s border monitoring system (Bro) by engineers in the center’s Network and Security Team. Bro can now monitor paths and load-balance the 100Gbps connection across an array of intrusion detection systems, then automatically terminate and block bad actors. NERSC is the first computing facility to implement a 100Gbps security system, and this may serve as a model for other DOE facilities to do the same.
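One common way to spread a fast link across several monitoring hosts is to hash each connection's addresses and ports so that every packet of a flow lands on the same detector. The sketch below illustrates that general idea only; it is not NERSC's or Bro's actual implementation, and the function names and worker count are hypothetical.

```python
# Illustrative flow-hash load balancing (an assumption, not the NERSC design).
import hashlib


def flow_key(src_ip: str, src_port: int, dst_ip: str, dst_port: int, proto: str) -> bytes:
    """Build a direction-independent key so both halves of a connection match."""
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{low[0]}:{low[1]}|{high[0]}:{high[1]}|{proto}".encode()


def pick_worker(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                proto: str, n_workers: int) -> int:
    """Map a flow to one of n_workers detectors using a stable hash."""
    digest = hashlib.sha256(flow_key(src_ip, src_port, dst_ip, dst_port, proto)).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


if __name__ == "__main__":
    # Both directions of the same TCP connection go to the same detector.
    outbound = pick_worker("10.0.0.1", 40000, "192.0.2.7", 443, "tcp", 8)
    inbound = pick_worker("192.0.2.7", 443, "10.0.0.1", 40000, "tcp", 8)
    print(outbound == inbound)
```

Sorting the two endpoints before hashing keeps a detector's view of a connection complete, since request and reply traffic share one key.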

“Before we could move all NERSC traffic to 100Gbps, we had to make sure that we could securely monitor all the data streaming in at this speed,” says Jason Lee, who leads NERSC’s Network and Security Team. “Security was our number one requirement, and until recently nobody knew how to do this.”

So Lee and Scott Campbell, also a member of the Network and Security Team, set out to solve this problem. Their results were published in two papers, “Prototyping a 100G monitoring system” and “Intrusion Detection at 100G.” The updated Bro system went into production at NERSC in July 2013.

Science DMZ Increases Availability of NERSC Systems

Once NERSC decided to move all of its traffic to 100Gbps, Lee worked with engineers at the DOE’s Energy Sciences Network (ESnet) to set up a 100Gbps Science DMZ, which gives NERSC network engineers the ability to set up multiple private circuits using software-defined networking (SDN). With these tools, NERSC staff can help remote scientists whose data transfers are slowed by firewalls at their local campuses achieve a true 100Gbps end-to-end connection. ESnet engineers also helped NERSC set up a system to announce its own address space. This allows the center to route traffic to any research and education (R&E) site separately, or to separate R&E traffic from the commodity Internet.

“As scientific instruments improve with time, the computational needs of scientists also change and become more complex. This Science DMZ infrastructure allows us to explore new ways of accommodating the needs of science, without impacting NERSC’s existing network traffic,” says Lee. “This in turn allows us to plan for the future, while ensuring availability of resources for current users.”

ESnet is DOE’s high-performance national scientific research network. It connects more than 40 sites including the National Laboratories, their supercomputing facilities (including NERSC) and major scientific instruments. The network also connects to 140 research and commercial networks around the world. ESnet officially launched its nationwide 100Gbps network in 2012.

In November 2011, NERSC became one of the first sites to get a 100Gbps connection, when the facility partnered with ESnet to show the capability of a 100Gbps network. During the SC11 conference, scientists ran a simulation depicting how the Universe has changed over 13.7 billion years. Researchers ran the Nyx code on 4,096 cores of NERSC’s Cray XE6 “Hopper” system, which produced more than 5 terabytes of data. This information was transferred in real time to a booth on the SC11 exhibit floor, at the Washington State Convention Center in Seattle. Over the next two years, NERSC maintained 100Gbps connections with data transfer nodes at DOE Leadership Computing Facilities in Oak Ridge, Tenn., and Argonne, Ill.

“The recent Bro redesign allows us to increase network bandwidth by a factor of 10 for all NERSC users, not just special dedicated services,” says Brent Draney, who leads NERSC’s Networking, Servers and Security Group. “This capability not only allows us to better serve our current users, but lays the foundation for us to scale up as growing scientific datasets lead to an increasing demand for bandwidth.”

