NERSC: Powering Scientific Discovery Since 1974

Superfacility

September 25, 2019

Mission Statement

The Superfacility concept is a framework for integrating experimental and observational instruments with computational and data facilities. Data produced by light sources, microscopes, telescopes and other devices can stream in real time to large computing facilities, where it can be analyzed, archived, curated, combined with simulation data and served to the science user community via powerful computing, storage and networking systems. Connected with high-speed programmable networking, this superfacility model is more than the sum of its parts, allowing for discoveries across data sets, institutions and domains and making data from one-of-a-kind facilities and experiments broadly accessible.

The NERSC Superfacility project is designed to identify the technical and policy challenges in this concept for an HPC center. It coordinates and manages the work to address these challenges, in partnership with target science teams. It is designed to ensure that the solutions developed are widely useful (rather than one-off engagements), scale to multiple user groups, and remain practical for NERSC staff to support.

Services in Development

Data Management and Sharing

We are working to develop and deploy tools that can be used to handle the large volumes of data generated by superfacility partners.

Data Transfer

  • Globus is our tool of choice for large data transfers. We have several optimized data transfer nodes that can access every file system at NERSC.
  • We are working to offer a new interface into HPSS that eliminates much of the difficulty of bundling and uploading files.
  • A command line tool to do parallel transfers between file systems at NERSC (including HPSS) has been deployed on NERSC systems.
  • Batch system integration of data movement is being explored.
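
To make the idea of parallel transfers between file systems concrete, here is a minimal sketch using only the Python standard library. It is not the NERSC command line tool, and it does not talk to HPSS or Globus; the function name `parallel_copy` and its parameters are hypothetical, chosen for illustration. It copies files between two directory trees with a pool of worker threads.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parallel_copy(src_dir, dst_dir, max_workers=4):
    """Copy every regular file under src_dir into dst_dir, keeping relative paths.

    Hypothetical helper for illustration; not the NERSC transfer tool.
    """
    src_root, dst_root = Path(src_dir), Path(dst_dir)
    files = [p for p in src_root.rglob("*") if p.is_file()]

    def copy_one(src):
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        return dst

    # Each worker thread copies one file at a time; results come back in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sorted(pool.map(copy_one, files))
```

Threads suit this sketch because file copies are I/O-bound; a production tool would also need retries, checksums, and awareness of the target file system.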

Data Discovery

  • The NERSC Data Dashboard lets you see where your data is on the Project file system.
  • A PI Dashboard is under development to allow PIs to address common issues (like permission drift) for the data they control.
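
As an illustration of what detecting permission drift can involve, the sketch below walks a directory tree and reports entries that lack a required permission bit (group-read by default). It is an assumption about the kind of check such a dashboard might run, not a description of the PI Dashboard itself; `find_permission_drift` is a hypothetical name.

```python
import os
import stat


def find_permission_drift(root, required=stat.S_IRGRP):
    """Return paths under root whose mode lacks the required permission bits.

    Hypothetical helper for illustration; the default checks group-read access.
    """
    drifted = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # skip symlinks; their own mode bits are not meaningful on Linux
            if mode & required != required:
                drifted.append(path)
    return sorted(drifted)
```

Passing `required=stat.S_IRGRP | stat.S_IWGRP` would instead flag anything the group cannot both read and write.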

Data Sharing

  • Spin, a service platform for deploying science gateways, has been successfully deployed.
  • Globus Sharing has been enabled for data on the Project file system.

The Superfacility Demo Series (May 2020)

In May 2020 the Superfacility Project is holding a series of virtual demos of tools and utilities that have been developed to support the needs of experimental scientists at ESnet and NERSC. 

Date/Time Topic/Speaker Abstract Recording
May 6th 2020, noon PT

SENSE: Intelligent Network Services for Science Workflows

Xi Yang and the SENSE team

The Software-defined network for End-to-end Networked Science at Exascale (SENSE) is a model-based orchestration system that operates between the SDN layer controlling the individual networks and end-sites, and science workflow agents and middleware. The SENSE system includes Network Resource Manager and End-Site Resource Manager components, which enable advanced features in the areas of multi-resource integration, real-time responsiveness, and workflow middleware interactions.

The demonstration will show the status of ongoing work to integrate SENSE services with domain science workflows, such as those envisioned for DOE Superfacility operations. A common vision for these integrations is the provisioning of SENSE Layer 2 and Layer 3 services based on knowledge of current and planned data transfers. SENSE allows workflow middleware to redirect traffic onto the desired SENSE-provisioned services at granularities ranging from a single flow, to a specific end-system, to an entire end-site. The SENSE Layer 2 services provide deterministic end-to-end resource guarantees, including the network and Data Transfer Node (DTN) elements. The SENSE Layer 3 service provides the mechanisms for directing desired traffic onto a specific Layer 3 VPN (L3VPN) for policy and/or quality-of-service reasons.

Link to video

Link to slides

May 13th 2020, noon PT

Data Management Tools and Capabilities

Lisa Gerhardt and Annette Grenier 

The PI Dashboard is a web portal that will allow PIs to address many of the common permission issues that come up when dealing with shared files on the Community File System.

GHI is a new GPFS / HPSS interface that offers the benefits of a more familiar file system interface for HPSS. Users often want to store complex directory structures or large bundles in HPSS, which can be difficult to do with the traditional HPSS access tool. GHI can be used to easily move data between HPSS and the GPFS file system with a few simple commands.

NERSC has written several command line data transfer scripts to help users integrate data transfers into their workflows. We'll do a brief demo of these scripts.

Link to video

Link to slides

May 20th 2020, noon PT

Superfacility API: Automation for Complex Workflows at Scale

Gabor Torok, Cory Snavely, Bjoern Enders

 
The Superfacility API aims to enable the use of all NERSC resources through purely automated means using popular development tools and techniques. An evolution of its predecessor, NEWT, the newly designed API adds features that support complex, distributed workflows, such as placing future job reservations and registering API callbacks for asynchronous processes. It will also allow users to offload tedious tasks such as large data movement via simple REST calls.

While the Superfacility API is designed for non-interactive use, this demonstration will use a Jupyter notebook to step through a working example that calls the API to conduct a simple workflow process. Discussion will include additional information on planned API endpoints and authentication methods.
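
As a rough illustration of the kind of REST call described above, the following sketch builds (without sending) a job-submission request that also registers a callback URL. The base URL, endpoint path, payload field names, and token are all hypothetical placeholders; they are not the actual Superfacility API schema, which should be taken from the official API documentation.

```python
import json
from urllib.request import Request

# Hypothetical placeholder; not the real API's base URL.
BASE_URL = "https://api.example.org/v1"


def build_job_request(token, script_path, callback_url):
    """Build, but do not send, a POST request that submits a job and registers a callback.

    The endpoint path and payload fields are assumptions made for illustration.
    """
    payload = {"script": script_path, "callback_url": callback_url}
    return Request(
        url=f"{BASE_URL}/compute/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the request easy to inspect and test in a non-interactive workflow.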

 Link to video

Link to slides

May 27th 2020, noon PT

Docker Containers and Dark Matter: An Overview of the Spin Container Platform with Highlights from the LZ Experiment

Cory Snavely, Quentin Riffard, Tyler Anderson

Spin is a container-based platform at NERSC designed for deploying science gateways, workflow managers, databases, API endpoints, and other network services to support scientific projects. Spin leverages the portability, modularity, and speed of Docker containers to allow NERSC users to quickly deploy pre-built software images or design their own. The underlying Rancher orchestration system provides a secure, managed infrastructure with access to NERSC systems, storage, and networks.

One project making use of Spin as part of its engagement with the Superfacility project is the LZ Dark Matter Experiment, which is preparing to operate a 10-ton, liquid-xenon-based detector a mile underground at the Sanford Underground Research Facility (SURF) in South Dakota. The collaboration of some 250 scientists and 37 research institutions is busily readying the detector and associated software and data systems.

Services that will run in Spin to support the LZ Experiment range from databases to data transfer monitoring and have been exercised during mock data challenges. In this demonstration, NERSC staff will give an overview of the Spin platform and show how a simple service is created in a few seconds. LZ staff will then describe the science of dark matter detection and give an overview of their work in Spin so far, focusing on the Event Viewer, a science gateway that allows researchers to examine significant detector events.

 Link to video 

Link to slides 

June 3rd 2020, noon PT

Jupyter

Matthew Henderson (with Shreyas Cholia and Rollin Thomas)

Large-scale "Superfacility"-type experimental science workflows require support for a unified, interactive, real-time platform that can manage a distributed set of resources connected to High Performance Computing (HPC) systems. Here we demonstrate how the Jupyter platform plays a key role in this space: it provides the ease of use and interactivity of a web science gateway while giving scientists the ability to build custom, ad-hoc workflows in a composable way. Using real-world use cases from the National Center for Electron Microscopy (NCEM), we show how Jupyter facilitates interactive analysis of data at scale on NERSC HPC resources.

Jupyter Notebooks combine live executable code cells with inline documentation and embedded interactive visualizations. This allows us to capture an experiment in a fully contained executable Notebook that is self-documenting and incorporates live rendering of outputs and results as they are generated. The Notebook format lends itself to a highly modular and composable workflow, where individual steps and parameters can be adjusted on the fly. Additionally, the Jupyter platform can support custom applications and extensions that live alongside the core Notebook interface.

We will use real-world science examples to show how we create an improved interactive HPC experience in Jupyter, including:
- Improvements to the NERSC JupyterHub deployment
- Scaling up code in a Jupyter notebook to run on HPC resources through the use of parallel task execution frameworks
- Demonstrating the use of the Dask task framework as a backend to manage workers from Jupyter
- Enabling project-wide workflows and collaboration through sharing and cloning Notebooks and their associated software environments
We will also discuss related projects and potential future directions.
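
The scale-up pattern mentioned above, fanning independent tasks out to a pool of workers and gathering the results, can be sketched in a notebook cell as follows. This uses Python's standard-library `concurrent.futures` as a stand-in rather than Dask, so that it runs without extra dependencies; Dask's distributed client offers a comparable map-and-gather pattern across many nodes. The function names and the toy "frame" data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def frame_mean(frame):
    """Toy per-frame analysis: the mean pixel value of one detector frame."""
    return sum(frame) / len(frame)


def analyze_frames(frames, max_workers=4):
    """Fan the per-frame analysis out to a worker pool and gather results in input order.

    Stand-in for a parallel task framework; not the workflow shown in the demo.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(frame_mean, frames))
```

Because each frame is analyzed independently, the same function could be handed to a distributed scheduler without changing the analysis code itself.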
 

Link to video

Link to slides

Recent papers related to the Superfacility Model

Recent articles about the Superfacility Model

Recent talks about the Superfacility model