NERSC: Powering Scientific Discovery Since 1974

2023 NERSC Summer Internship Projects

Data and Analytics


Accelerating real-time data processing for the DIII-D fusion experiment

Science/CS Domain(s)

magnetic confinement fusion, data analysis, code optimization

Project Description

The goal of this project is to speed up the charge-exchange data processing for the DIII-D tokamak at NERSC. The charge-exchange data is required to run a between-shot workflow that reconstructs the DIII-D plasma profiles every ~10 minutes; this workflow cannot start until the charge-exchange data analysis is complete. We will profile the current charge-exchange analysis code and identify opportunities for speedup, likely through MPI parallelization and/or code optimization. We will implement these improvements and verify that overall application correctness is preserved. Once we have achieved some speedup, we will examine the full real-time equilibrium reconstruction workflow, including the cost of transferring the raw input data from DIII-D to NERSC, and evaluate whether the full workflow is faster with NERSC-based or DIII-D-based data processing. In either case, speedup will be critical to enabling between-shot equilibrium reconstruction.
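
The analysis code itself is written in C++/Fortran with MPI, but the parallelization pattern can be sketched in a language-neutral way. Since the charge-exchange channels can be fit independently, the natural first step is the standard MPI block decomposition of the channel list across ranks (the function name and channel counts below are illustrative, not from the actual DIII-D code):

```python
def block_range(n_items, n_ranks, rank):
    """Return the [start, stop) slice of n_items owned by `rank`.

    This is the standard block decomposition used when parallelizing
    independent work items (e.g., charge-exchange spectral channels)
    across MPI ranks: the first `n_items % n_ranks` ranks each take
    one extra item so the load stays balanced.
    """
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

# Example: 10 channels over 4 ranks -> chunks of sizes 3, 3, 2, 2.
chunks = [block_range(10, 4, r) for r in range(4)]
```

In the MPI version, each rank would call `block_range` with its own rank number and fit only its slice of the channels, with a final gather of the fitted profiles.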

Desired Skills/Background

  • Experience with C++ and/or Fortran, and with MPI
  • Nice to have: code profiling, code optimization, writing and running unit tests for correctness

NERSC/DAS Mentor(s)

Laurie Stephey (lastephey@lbl.gov)


Scalable Deployment of Data Services with Helm

Science/CS Domain(s)

DevOps, Backend Development, Data Management

Project Description

Linux containers have become an immensely popular software development paradigm due to their ability to provide lightweight and reproducible software runtime encapsulation as well as extremely portable and scalable application deployment. At NERSC, the Data & Analytics Services Team uses containers to deploy a variety of data services to over 8,000 active users. The goal of this project is to migrate NERSC science database and data portal services from monolithic containers to Helm-templated microservices. This will enable several improvements, including easier and more frequent version upgrades and the possibility for users to self-administer these services on NERSC’s Kubernetes-as-a-Service platform, Spin.
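
To give a flavor of what "Helm-templated" means here: per-service settings move out of a monolithic container definition and into a chart's `values.yaml`, which the chart's templates reference. The snippet below is a purely illustrative sketch, not NERSC's actual charts:

```yaml
# values.yaml (illustrative): deployment-specific knobs live here,
# so a database version upgrade becomes a one-line change.
image:
  repository: postgres
  tag: "15"
replicaCount: 1
persistence:
  size: 10Gi
# A template in the chart would then reference these values, e.g.
#   image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

The same chart can then be installed many times with different values files, which is what makes user self-administration on Spin practical.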

Desired Skills/Background

  • Some scripting or programming experience and experience with or interest in using container technology (e.g., Docker, Podman).
  • Nice to have: Any experience with Kubernetes, Helm, CI/CD in Gitlab/Github, web server configuration (NGINX/Apache), databases (Mongo/Postgres/MySQL), web app or microservice development.

NERSC/DAS Mentor(s)

Dan Fulton


Machine Learning


Deep learning for climate simulations

Science/CS Domain(s)

scientific machine learning, HPC, weather/climate

Project Description

Simulating the Earth’s climate with high fidelity at high resolution requires significant computational resources. Today, with the advent of deep learning models and the availability of large volumes of simulation and observational data, data-driven models have enormous potential to augment traditional numerical models: by providing orders-of-magnitude speedups in compute, they enable massive ensembles for predicting low-likelihood, high-impact extreme events under different climate warming scenarios. In this project, we aim to use state-of-the-art Fourier forecasting networks (based on Transformers) and HPC software to understand and characterize the performance of deep learning models in simulating the physical behavior of Earth’s atmospheric processes and associated extreme events. This will involve exploring climate simulation data, developing model architectures and the underlying foundational aspects of such models, and using NERSC HPC resources on the Perlmutter supercomputer to train and analyze these large models.
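
Forecasting networks of this kind are typically applied autoregressively: one network call advances the atmospheric state by a fixed time step, and a long forecast chains many such calls. A minimal sketch of that rollout loop, with a trivial stand-in for the trained network (the `toy_model` is hypothetical, purely for illustration):

```python
def rollout(model, state, n_steps):
    """Autoregressive forecast: feed each prediction back in as the next input.

    In a real deployment, `model` would be a trained Fourier/Transformer
    network advancing the state by, say, several hours per call; the
    orders-of-magnitude speedup over numerical models comes from each
    call being a single cheap forward pass.
    """
    trajectory = [state]
    for _ in range(n_steps):
        state = model(state)
        trajectory.append(state)
    return trajectory

# Toy stand-in for a trained network: exponential damping of each variable.
toy_model = lambda s: [0.5 * x for x in s]
traj = rollout(toy_model, [1.0, 2.0], 3)
# traj[-1] == [0.125, 0.25]
```

Ensembles for extreme-event statistics are then just many such rollouts from perturbed initial states, which is cheap enough to run at scale once the model is trained.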

Desired Skills/Background

  • Python, deep learning, PyTorch/TensorFlow, interest in climate science
  • Nice to have: Distributed training of ML models

NERSC/DAS Mentor(s)

Shashank Subramanian, Peter Harrington


Accelerating Diffusion Models in High Energy Physics

Science/CS Domain(s)

Scientific Machine learning, HPC, HEP/Collider Physics

Project Description

Generative models are widely used in High Energy Physics as surrogate models that aim to replace computationally expensive simulation routines. Recently, diffusion models have been proposed as powerful generative models, achieving state-of-the-art performance in computer vision and encouraging results in collider physics applications. One of the main challenges for these models is the time it takes to generate new observations. In this project, you will explore new strategies to accelerate the sampling of diffusion models, including distillation and fast ODE solvers, comparing the speedups and the physics performance achieved by each strategy.
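
The cost structure behind this is easy to see: diffusion sampling amounts to numerically integrating an ODE (or SDE) whose drift is a full network forward pass, so sampling time scales directly with the number of solver steps. The toy sketch below uses explicit Euler on dx/dt = -x as a stand-in for the probability-flow ODE, showing the steps-versus-accuracy tradeoff that fast solvers and distillation aim to beat:

```python
import math

def euler(f, x0, t0, t1, n_steps):
    """Fixed-step explicit Euler integration of dx/dt = f(t, x).

    Each step costs one evaluation of f; for a diffusion model, f would
    be a network forward pass, so cutting n_steps directly cuts sampling
    time, at some cost in sample quality.
    """
    x, t = x0, t0
    h = (t1 - t0) / n_steps
    for _ in range(n_steps):
        x += h * f(t, x)
        t += h
    return x

f = lambda t, x: -x          # toy stand-in for the learned drift
exact = math.exp(-1.0)       # true solution of dx/dt = -x at t = 1
coarse_err = abs(euler(f, 1.0, 0.0, 1.0, 10) - exact)
fine_err = abs(euler(f, 1.0, 0.0, 1.0, 1000) - exact)
# fine_err < coarse_err, but the fine run costs 100x more "model calls"
```

Higher-order ODE solvers and distilled models both try to collapse this curve, reaching low error in a handful of steps instead of hundreds.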

Desired Skills/Background

  • Experience with Python, Tensorflow/PyTorch, interest in collider physics and diffusion models
  • Nice to have: Distributed training, physics background

NERSC/DAS Mentor(s)

Vinicius Mikuni (vmikuni@lbl.gov)


Build a platform for supercomputer-powered scientific AI workflows

Science/CS Domain(s)

AI+HPC, software engineering, MLOps, distributed ML

Project Description

As applications of AI in science grow and mature, researchers increasingly need advanced tools to help leverage the power of massive supercomputers. Their research often requires sophisticated parallel workflows including distributed model training, hyperparameter optimization (HPO), and/or inference on massive datasets. Such workflows can be cumbersome to implement and manage by hand, slowing research and scientific progress. As an intern on this project, you will develop the tools and services to enable reproducible, scalable, automated AI workflows on NERSC supercomputers with intuitive, interactive interfaces. This platform for scientific AI research will fuel the next generation of AI-driven scientific discoveries.
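
One of the workflows such a platform would automate is hyperparameter optimization. At its core, even the simplest HPO strategy is a loop over candidate configurations; a minimal random-search sketch is below (the toy objective is hypothetical; in a real workflow each objective evaluation would be a training job launched on Perlmutter):

```python
import random

def random_search(objective, space, n_trials, seed=0):
    """Minimal random-search HPO: sample configurations, keep the best.

    `space` maps each hyperparameter name to a (low, high) range. In a
    real platform each `objective` call would submit and monitor a
    training job; here it is a cheap stand-in function.
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy objective with a known minimum at lr = 0.1, wd = 0.01.
obj = lambda c: (c["lr"] - 0.1) ** 2 + (c["wd"] - 0.01) ** 2
space = {"lr": (0.0, 1.0), "wd": (0.0, 0.1)}
cfg, score = random_search(obj, space, n_trials=200)
```

The platform's job is everything around this loop: scheduling the trials in parallel across nodes, tracking results, and making the whole thing reproducible and interactive for researchers.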

Desired Skills/Background

  • Required: Experience with python, either ML software or HPC workflows, interest in AI+HPC
  • Nice to have: experience with distributed ML, libraries like PyTorch, TensorFlow, Jupyter, W&B, MLflow, ClearML

NERSC/DAS Mentor(s)

Steven Farrell (sfarrell@lbl.gov), Peter Harrington (pharrington@lbl.gov)


Scientific AI benchmarking and performance analysis

Science/CS Domain(s)

Deep learning, benchmarking, performance optimization

Project Description

Scientific AI/ML/DL applications are a transformative emerging workload for supercomputers, and it is critical for HPC centers to have robust methodologies and benchmarks for characterizing these new workloads, evaluating system performance, and driving innovation in hardware and system design. MLPerf benchmarks and related efforts are pushing on this front with state-of-the-art applications and performance measurements for HPC science. We are looking for an enthusiastic intern to optimize and analyze the performance of scientific AI benchmarks at scale on the Perlmutter supercomputer, a powerful system featuring over 6,000 NVIDIA A100 GPUs which debuted as the #5 system on the Top500 in 2021 and had leading results on MLPerf HPC v1.0. The intern will have the opportunity to work on a variety of potential tasks, including training models at scale, code profiling and optimization, power measurement and analysis, hyperparameter optimization, and researching new ideas to accelerate and scale AI model training.
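
A recurring task in this kind of benchmarking is measuring training throughput in samples per second. A minimal sketch of the measurement pattern, with a cheap stand-in for the training step (names and the fake workload are illustrative, not MLPerf code):

```python
import time

def measure_throughput(step_fn, batch_size, n_iters, n_warmup=2):
    """Measure samples/second of a repeated step, skipping warmup iterations.

    Warmup matters in real benchmarks: the first iterations pay one-time
    costs (kernel compilation, allocator warmup, data-pipeline fill) that
    would otherwise skew the average.
    """
    for _ in range(n_warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_iters * batch_size / elapsed

# Stand-in for one optimizer step over a batch of 32 samples.
fake_step = lambda: sum(i * i for i in range(10000))
rate = measure_throughput(fake_step, batch_size=32, n_iters=5)
```

Scaling studies then repeat this measurement across GPU counts, and efficiency is the measured rate divided by the ideal linear-scaling rate.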

Desired Skills/Background

  • Required: Python, machine learning, experience with PyTorch or TensorFlow
  • Nice to have: distributed deep learning, GPU profiling, hyperparameter optimization, model parallelism

NERSC/DAS Mentor(s)

Steven Farrell (sfarrell@lbl.gov), Neil Mehta (neilmehta@lbl.gov)


Jupyter


Enhancing Jupyter Capabilities and Infrastructure at NERSC

CS Domain(s)

Software Engineering, Data Engineering

Project Description

Scientists love Jupyter because it combines text, visualization, data analytics, and code into a document they can share, modify, and even publish. What about using Jupyter to control experiments in real time, steer complex simulations on a supercomputer, or even combine aspects of both workflows? How do users make use of Jupyter on a supercomputer? We are looking for Python and Jupyter enthusiasts to help us find ways to expose NERSC’s high-performance computing and storage systems through Jupyter, making supercomputing more approachable and user-friendly. This project will involve analyzing Jupyter activity at NERSC and developing software tools that automate the analysis of Jupyter usage, to gain insight into how users interact with our systems through Jupyter and how we can improve the performance and stability of Jupyter on the Perlmutter supercomputer.
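
The usage-analysis side of the project boils down to turning service logs into metrics. As a sketch of the pattern, the function below counts distinct users who started a notebook server each day; the log format here is invented for illustration and is not NERSC's actual JupyterHub log format:

```python
from collections import defaultdict

def active_users_by_day(log_lines):
    """Count distinct users starting a notebook server per day.

    Assumes a simple illustrative log format:
        "<date> <time> server-start user=<name>"
    Real JupyterHub logs differ; this only sketches the analysis pattern
    of parsing events and aggregating per time bucket.
    """
    users = defaultdict(set)
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 4 and parts[2] == "server-start":
            day = parts[0]
            users[day].add(parts[3].removeprefix("user="))
    return {day: len(names) for day, names in users.items()}

logs = [
    "2023-06-01 09:00:01 server-start user=alice",
    "2023-06-01 09:05:12 server-start user=bob",
    "2023-06-01 17:40:44 server-start user=alice",  # same user, same day
    "2023-06-02 08:12:30 server-start user=alice",
]
counts = active_users_by_day(logs)
# counts == {"2023-06-01": 2, "2023-06-02": 1}
```

From aggregates like these, one can start asking the questions the project cares about: when load peaks, which features users rely on, and where performance or stability problems cluster.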

Desired Skills/Background

Python, Jupyter

NERSC Mentor(s)

Rollin Thomas, Kelly Rowland, Shreyas Cholia


Programming Environment and Models


Porting the LULESH Benchmark to an Asynchronous Many-Task Runtime

CS Domain(s)

Software engineering, high performance computing, performance optimization and analysis, programming models

Project Description

The LULESH benchmark (Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics) is a widely used tool for evaluating the performance of parallel computing systems and has been ported to various programming models, including OpenMP, Kokkos, and CUDA. However, its current implementation is based on a data-parallel programming model, which potentially limits its ability to fully leverage the capabilities of modern parallel systems. The goal of this project is to explore porting LULESH to an asynchronous many-task (AMT) runtime plus GPU offload to unlock the full performance potential of parallel systems. Examples of AMT runtimes and related techniques include HPX, Taskflow, Legion, Charm++, and modern C++ features (e.g., senders/receivers, coroutines, SIMD).

This project involves several key tasks: understanding the existing LULESH benchmark and its data-parallel implementation, identifying AMT optimization opportunities in LULESH, implementing the AMT version of LULESH, and evaluating the performance of the AMT implementation on a NERSC supercomputer. By completing this project, the student will gain hands-on experience in high-performance computing, parallel programming, and AMT programming models. Additionally, the results of this project will provide valuable insights into the performance of AMT for scientific simulation and into the NERSC programming environment.
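
The core idea of the AMT port is to replace bulk-synchronous phases with a dependency graph, so that each kernel fires as soon as its inputs are ready and independent work overlaps. The project targets C++ runtimes (HPX, Taskflow, etc.), but the graph structure can be sketched with Python's standard-library futures; the "kernels" and field names below are invented stand-ins, not actual LULESH code:

```python
from concurrent.futures import ThreadPoolExecutor

def run_task_graph(region):
    """Sketch of the AMT pattern: phases become tasks linked by data
    dependencies instead of global barriers.

    The two stand-in kernels are independent and may overlap; the final
    task depends on both, and calling .result() expresses those edges.
    """
    with ThreadPoolExecutor(max_workers=3) as pool:
        stress = pool.submit(lambda: region["pressure"] * 2)   # stand-in kernel
        hourglass = pool.submit(lambda: region["volume"] + 1)  # independent kernel
        force = pool.submit(lambda: stress.result() + hourglass.result())
        return force.result()

total = run_task_graph({"pressure": 3.0, "volume": 1.0})
# total == 8.0  (3*2 + (1+1))
```

In a real AMT runtime the scheduler builds and executes this graph over thousands of mesh regions per time step, which is where the overlap pays off.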

Desired Skills/Background

Experience with task-based programming models, modern C++, CUDA, and MPI


Leveraging C++ Standard Parallelism for LU Decomposition (Fall '23 Internship)

CS Domain(s)

software engineering, high performance computing, performance optimization and analysis, programming models

Project Description

The primary aim of this project is to explore the power of standard C++ parallelism, utilizing the Parallel Standard Template Library (Parallel STL) and asynchronous execution. We will exercise various features of the C++ language while developing an LU (lower-upper) decomposition library. LU decomposition is a key step in many numerical algorithms used in engineering and scientific computing, including the solution of linear systems, the computation of determinants and inverses, and the solution of linear least-squares problems.
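
For reference, the algorithm itself is compact. The sketch below is a minimal sequential Doolittle factorization in Python, without the partial pivoting a production library needs for numerical stability; the project's C++ parallel versions must respect the dependency structure visible here (each row of U and column of L depends on all previously computed ones):

```python
def lu_decompose(A):
    """Doolittle LU decomposition without pivoting (sketch only).

    Returns (L, U) with A = L*U, where L is unit lower triangular and
    U is upper triangular. Production code adds partial pivoting; the
    point here is the loop-carried dependency pattern that any parallel
    implementation has to schedule around.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for j in range(k, n):      # row k of U
            U[k][j] = A[k][j] - sum(L[k][p] * U[p][j] for p in range(k))
        for i in range(k + 1, n):  # column k of L
            L[i][k] = (A[i][k] - sum(L[i][p] * U[p][k] for p in range(k))) / U[k][k]
    return L, U

A = [[4.0, 3.0], [6.0, 3.0]]
L, U = lu_decompose(A)
# L == [[1.0, 0.0], [1.5, 1.0]], U == [[4.0, 3.0], [0.0, -1.5]]
```

The inner updates at each step k are independent across rows and columns, which is exactly the parallelism that execution policies and sender/receiver pipelines can expose.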

Our work will involve investigating a range of C++ parallel and asynchronous techniques applied to the application, such as parallel execution policies, the sender/receiver model, std::ranges, and std::mdspan. We plan to cover a broad spectrum of parallel technologies, including, but not limited to, NVIDIA's nvc++ stdpar/stdexec, the HPX runtime system, Kokkos, Legion, and Taskflow. This process will enhance the code’s performance, portability, and productivity across CPUs and GPUs. Ultimately, we aim to deploy the software on Perlmutter, one of the world’s fastest supercomputers, located here at the National Energy Research Scientific Computing Center (NERSC). This project thus presents an exciting opportunity to explore and leverage advanced C++ parallel programming techniques in the HPC field.

The project will involve several key tasks: understanding the HPC application, developing it using the C++ Parallel STL and task-based programming, and finally benchmarking performance on NERSC's supercomputer. Successful completion of this project will give the intern significant exposure to high-performance computing and software development, with the potential for scientific publications.

Desired Skills/Background

Experience with the C++ parallel STL, asynchronous programming, C++ sender/receiver, and HPX


Quantum Computing


Randomized Quantum Linear Algebra in the NISQ Era

Science/CS Domain(s)

Quantum algorithms, linear algebra

Project Description

Quantum Linear Algebra (QLA) problems, such as the quantum linear systems problem and eigenvalue estimation, have attracted considerable attention over the past decade. Quantum Signal Processing and Qubitization are the main algorithms that have been proposed to solve QLA problems in the fault-tolerant era. They rely on encoding the matrix problem in a subspace of a higher-dimensional Hilbert space, a procedure also known as block encoding. For many matrix problems of interest, preparing this encoding requires deep circuits that are out of reach for present-day noisy quantum hardware. Randomized QLA is a promising alternative where the data access model is classical instead of quantum. This brings a significant reduction in circuit complexity at the cost of additional circuit measurements and classical data processing. In this project, we aim to quantitatively compare the performance of QLA algorithms with randomized QLA algorithms for a few problems of interest.
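
The cost tradeoff at the heart of randomized methods has a purely classical analogue worth keeping in mind: replace structured, deterministic access to a matrix with random probes, and pay for the cheaper access model with sampling error averaged over many repetitions, much as randomized QLA trades circuit depth for extra measurements. Hutchinson's stochastic trace estimator is a textbook example (this is an illustration of the flavor of randomized linear algebra, not one of the project's specific algorithms):

```python
import random

def hutchinson_trace(matvec, n, n_samples, seed=0):
    """Hutchinson's stochastic trace estimator.

    Estimates tr(A) as the average of z^T A z over random Rademacher
    vectors z, using only matrix-vector products rather than direct
    access to A's entries. Accuracy improves as ~1/sqrt(n_samples).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Az = matvec(z)
        total += sum(zi * yi for zi, yi in zip(z, Az))
    return total / n_samples

# Toy symmetric matrix A = [[2, 1], [1, 3]], so tr(A) = 5 exactly.
matvec = lambda z: [2.0 * z[0] + 1.0 * z[1], 1.0 * z[0] + 3.0 * z[1]]
est = hutchinson_trace(matvec, n=2, n_samples=2000)
# est is close to 5.0, with statistical error from the sampling
```

The project's quantitative comparison asks the analogous question on the quantum side: for problems of interest, when does the measurement-and-postprocessing overhead of randomized QLA beat the circuit depth of block-encoding-based algorithms?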

Desired Skills/Background

  • Required: Experience in linear algebra, Python
  • Nice to have: Computational math/physics background, experience with quantum algorithms and a quantum programming toolkit

NERSC Mentor(s)

Daan Camps (dcamps@lbl.gov)


Thermodynamics on Analog Quantum Hardware

Science/CS Domain

Quantum algorithms, statistical mechanics, linear algebra

Project Description

The aim of this project is to develop algorithms and tools that will enable the study of quantum phase transitions on analog quantum hardware. Analog quantum simulation platforms have become a promising means of investigating interacting quantum many-body systems due to the ability to exert greater control over previously inaccessible Hilbert spaces. Our focus will be on constructing phase diagrams and investigating dynamics for specific physical lattice models of interest. To study these models numerically, classical approximation methods, such as tensor networks, will be used and developed. Furthermore, the developed tools may be implemented on analog quantum hardware if available.
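
For the smallest nontrivial instance of such a lattice model, the classical baseline is simply building the Hamiltonian matrix and diagonalizing it exactly; tensor-network methods take over where that matrix becomes exponentially large. As a sketch, the code below constructs the two-site transverse-field Ising Hamiltonian H = -J Z⊗Z - h (X⊗I + I⊗X) from Kronecker products (the specific model and couplings are illustrative, not necessarily the project's target model):

```python
def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    p = len(B)
    return [
        [A[i // p][j // p] * B[i % p][j % p] for j in range(len(A) * p)]
        for i in range(len(A) * p)
    ]

# Single-spin Pauli matrices (real representation suffices here).
I2 = [[1.0, 0.0], [0.0, 1.0]]
X = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]

def tfim_2site(J, h):
    """Two-site transverse-field Ising Hamiltonian as a 4x4 matrix.

    The -J Z.Z term favors aligned spins; the -h X terms flip spins.
    Competition between them is what drives the quantum phase
    transition in the large-lattice limit.
    """
    zz, x1, x2 = kron(Z, Z), kron(X, I2), kron(I2, X)
    return [
        [-J * zz[i][j] - h * (x1[i][j] + x2[i][j]) for j in range(4)]
        for i in range(4)
    ]

H = tfim_2site(J=1.0, h=0.5)
# H is real symmetric (Hermitian), as any Hamiltonian must be.
```

Sweeping h/J and tracking ground-state properties of (much larger versions of) such matrices is how phase diagrams get mapped out, whether by exact diagonalization, tensor networks, or analog hardware.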

Desired Skills/Background

  • Required: Experience in linear algebra, Python
  • Nice to have: computational math/physics/chemistry background, experience with quantum algorithms, and a quantum programming toolkit

NERSC Mentor(s)

Katie Klymko (kklymko@lbl.gov)