NERSC: Powering Scientific Discovery Since 1974

NERSC Summer Internships

NERSC hosts a number of internships every summer. Applicants must be US-based students actively enrolled in undergraduate or graduate programs. These are paid internships, but we are unable to provide additional support for housing. Desired technical qualifications are specified with each project description. This page will be updated with more projects, so check back for further additions. To see a list of the previous year's internship projects, click here.

In addition to the projects below, NERSC hosts other projects via the Lab's CS Summer Student Program. For all summer positions, including those at NERSC, the hourly wages are up to $21.16 for undergraduate students and up to $38.40 for graduate students, depending upon years of education completed. Applicants are responsible for housing and travel expenses.

To apply for one of the internships below, please reach out to the listed NERSC mentors directly and send your CV/resume.

Summer 2023 Internship Projects


Data and Analytics

Accelerating realtime data processing for the DIII-D fusion experiment

This project is filled and no longer accepting applications.

Science/CS domain: magnetic confinement fusion, data analysis, code optimization

Project description: The goal of this project is to speed up the charge-exchange data processing for the DIII-D tokamak at NERSC. The charge-exchange data is required to run a between-shot workflow that can reconstruct the DIII-D plasma profiles every ~10 minutes; this workflow cannot start until the charge-exchange data analysis is complete. We will profile the current charge-exchange data analysis code and identify opportunities for speedup, likely through MPI parallelization and/or code optimization. We will implement these improvements and verify that the correctness of the overall application is preserved. Once we have achieved some speedup, we will examine the full realtime equilibrium reconstruction workflow, including the cost of transferring the raw input data from DIII-D to NERSC. We will evaluate whether the full workflow is faster with NERSC-based data processing or DIII-D-based data processing. In either case, speedup will be critical to achieving between-shot equilibrium reconstruction.

Desired Skills/Background: experience with C++ and/or FORTRAN and MPI

Nice: code profiling, code optimization, writing and running unit tests for correctness

NERSC/DAS mentor: Laurie Stephey (lastephey@lbl.gov) 


Scalable Deployment of Data Services with Helm

This position has been filled for summer 2023 and is no longer accepting applicants.

Science/CS domain: DevOps, Backend Development, Data Management

Project Description: Linux containers have become an immensely popular software development paradigm due to their ability to provide lightweight and reproducible software runtime encapsulation as well as extremely portable and scalable application deployment. At NERSC, the Data & Analytics services team uses containers to deploy a variety of data services to over 8000 active users. The goal of this project is to migrate NERSC science database and data portal services from monolithic containers to Helm-templated microservices. This will enable several improvements, including easier and more frequent version upgrades and the possibility for users to self-administer these services on NERSC's Kubernetes-as-a-Service platform, Spin.

Desired Skills/Background: Some scripting or programming experience and experience with or interest in using container technology (e.g. Docker, Podman).

Nice: Any experience with Kubernetes, Helm, CI/CD in GitLab/GitHub, web server configuration (NGINX/Apache), databases (Mongo/Postgres/MySQL), web app or microservice development.

NERSC/DAS mentor: Dan Fulton


Machine Learning

Deep learning for climate simulations

This position has been filled for Summer 2023.

Science/CS domain: Scientific Machine learning, HPC, Weather/Climate

Project Description: Simulating the Earth’s climate with high fidelity at high resolution requires significant computational resources. Today, with the advent of deep learning models and the availability of large volumes of simulation and observational data, data-driven models have enormous potential to augment traditional numerical models by providing orders-of-magnitude speedup in compute, thereby enabling the use of massive ensembles to predict low-likelihood, high-impact extreme events under different climate warming scenarios. In this project, we aim to use state-of-the-art Fourier forecasting networks (based on Transformers) and HPC software to understand and characterize the performance of deep learning models in simulating the physical behavior of Earth’s atmospheric processes and associated extreme events. This will involve exploring climate simulation data, developing model architectures and the underlying foundational aspects of such models, and using NERSC HPC compute resources on the Perlmutter supercomputer to train and analyze these large models.

Desired Skills/Background: Python, deep learning, PyTorch/TensorFlow, interest in climate science

Nice: Distributed training of ML models

NERSC/DAS mentors: Shashank Subramanian, Peter Harrington


Accelerating Diffusion Models in High Energy Physics

This position has been filled for Summer 2023.

Science/CS domain: Scientific Machine learning, HPC, HEP/Collider Physics

Project Description: Generative models are widely used in High Energy Physics as surrogate models that aim to replace computationally expensive simulation routines. Recently, diffusion models have been proposed as powerful generative models, achieving state-of-the-art performance in computer vision and encouraging performance for collider physics applications. One of the main challenges for these models is the time it takes to generate new observations. In this project, you will explore new strategies to reduce the sampling time of diffusion models. These include the investigation of distillation models and fast ODE solvers, comparing the speedups and the physics performance achieved by these strategies.

Desired Skills/Background: experience with Python, TensorFlow/PyTorch, interest in collider physics and diffusion models

Nice: distributed training, physics background

NERSC/DAS mentor: Vinicius Mikuni (vmikuni@lbl.gov) 


Build a platform for supercomputer-powered scientific AI workflows

Science/CS domain: AI+HPC, software engineering, MLOps, distributed ML

Project description: As applications of AI in science grow and mature, researchers increasingly need advanced tools to help leverage the power of massive supercomputers. Their research often requires sophisticated parallel workflows including distributed model training, hyperparameter optimization (HPO), and/or inference on massive datasets. Such workflows can be cumbersome to implement and manage by hand, slowing research and scientific progress. As an intern on this project, you will develop the tools and services to enable reproducible, scalable, automated AI workflows on NERSC supercomputers with intuitive, interactive interfaces. This platform for scientific AI research will fuel the next generation of AI-driven scientific discoveries.

Desired Skills/Background:
Required: experience with Python, either ML software or HPC workflows, interest in AI+HPC
Nice to have: experience with distributed ML, libraries like PyTorch, TensorFlow, Jupyter, W&B, MLflow, ClearML

NERSC/DAS mentors: Steven Farrell (sfarrell@lbl.gov), Peter Harrington (pharrington@lbl.gov)


Scientific AI benchmarking and performance analysis

Science/CS domain: deep learning, benchmarking, performance optimization

Project Description: Scientific AI/ML/DL applications are a transformative emerging workload for supercomputers, and it is critical for HPC centers to have robust methodologies and benchmarks for characterizing these new workloads, evaluating system performance, and driving innovation in hardware and system design. MLPerf benchmarks and related efforts are pushing on this front with state-of-the-art applications and performance measurements for HPC science. We are looking for an enthusiastic intern to optimize and analyze the performance of scientific AI benchmarks at scale on the Perlmutter supercomputer, a powerful system featuring over 6,000 NVIDIA A100 GPUs, which debuted as the #5 system on the Top500 in 2021 and had leading results on MLPerf HPC v1.0. The intern will have the opportunity to work on a variety of potential tasks, including training models at scale, code profiling and optimization, power measurement and analysis, hyperparameter optimization, and researching new ideas to accelerate and scale AI model training.

Desired Skills/Background:
Required: Python, machine learning, experience with PyTorch or TensorFlow
Nice to have: distributed deep learning, GPU profiling, hyperparameter optimization, model parallelism

NERSC/DAS mentors: Steven Farrell (sfarrell@lbl.gov), Neil Mehta (neilmehta@lbl.gov)


Jupyter

Enhancing Jupyter Capabilities and Infrastructure at NERSC

THIS POSITION HAS BEEN FILLED

CS domain: Software Engineering, Data Engineering

Project description: Scientists love Jupyter because it combines text, visualization, data analytics, and code into a document they can share, modify, and even publish. What about using Jupyter to control experiments in real time, steer complex simulations on a supercomputer, or even combine aspects of both workflows? How do users make use of Jupyter on a supercomputer? We are looking for Python and Jupyter enthusiasts to help us find ways to expose NERSC's high-performance computing and storage systems through Jupyter, making supercomputing more literate and more user friendly. This project will involve analyzing Jupyter activity at NERSC and developing software tools to automate analysis of Jupyter usage, to gain insight into how users interact with our systems through Jupyter and how we can improve the performance and stability of Jupyter on the Perlmutter supercomputer.

Desired Skills/Background: Python, Jupyter

NERSC mentors: Rollin Thomas, Kelly Rowland, Shreyas Cholia


Programming Environment and Models

Porting the Lulesh Benchmark to an Asynchronous Many-Task Runtime

CS domain: software engineering, high performance computing, performance optimization and analysis, programming models

Project description: The Lulesh benchmark (Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics) is a widely used tool for evaluating the performance of parallel computing systems and has been ported to various programming models, including OpenMP, Kokkos, and CUDA. However, its current implementation is based on a data-parallel programming model, which potentially limits its ability to fully leverage the capabilities of modern parallel systems. The goal of this project is to explore porting Lulesh to an asynchronous many-task (AMT) runtime plus GPU offload to unlock the full performance potential of parallel systems. Examples of AMT runtimes and related optimizations include HPX, TaskFlow, Legion, Charm++, and modern C++ features (e.g., sender/receiver, coroutines, SIMD).

This project would involve several key tasks, including: understanding the existing Lulesh benchmark and its data-parallel implementation, identifying AMT optimization opportunities in Lulesh, implementing the AMT version of Lulesh, and evaluating the performance of the AMT implementation on a NERSC supercomputer. By completing this project, the student would gain hands-on experience in high-performance computing, parallel programming, and AMT programming models. Additionally, the results of this project would provide valuable insights into the performance of AMT for scientific simulation and the NERSC programming environment.

*For full consideration apply by 2/24/2023.

Desired Skills/Background: experience with task-based programming models, modern C++, CUDA, MPI

Quantum Computing

Randomized Quantum Linear Algebra in the NISQ Era

THIS POSITION HAS BEEN FILLED

Science/CS domain: quantum algorithms, linear algebra

Project description: Quantum Linear Algebra (QLA) problems, such as the quantum linear systems problem and eigenvalue estimation, have attracted considerable attention over the past decade. Quantum Signal Processing and Qubitization are the main algorithms that have been proposed to solve QLA problems in the fault-tolerant era. They rely on encoding the matrix problem in a subspace of a larger-dimensional Hilbert space, a procedure also known as block encoding. For many matrix problems of interest, preparing this encoding requires deep circuits that are out of reach for present-day noisy quantum hardware. Randomized QLA is a promising alternative in which the data access model is classical instead of quantum. This brings a significant reduction in circuit complexity at the cost of additional circuit measurements and classical data processing. In this project, we aim to quantitatively compare the performance of QLA algorithms with randomized QLA algorithms for a few problems of interest.

Desired Skills/Background:  

Required: experience with linear algebra, Python

Nice to have: computational math/physics background, experience with quantum algorithms and a quantum programming toolkit

NERSC/ATG mentor: Daan Camps (dcamps@lbl.gov)

Thermodynamics on Analog Quantum Hardware

THIS POSITION HAS BEEN FILLED

Science/CS domain: quantum algorithms, statistical mechanics, linear algebra

Project description: The aim of this project is to develop algorithms and tools that will enable the study of quantum phase transitions on analog quantum hardware. Analog quantum simulation platforms have become a promising means of investigating interacting quantum many-body systems due to the ability to exert greater control over previously inaccessible Hilbert spaces. Our focus will be on constructing phase diagrams and investigating dynamics for specific physical lattice models of interest. To study these models numerically, classical approximation methods, such as tensor networks, will be used and developed. Furthermore, the developed tools may be implemented on analog quantum hardware if available.

Desired Skills/Background:  

Required: experience with linear algebra, Python

Nice to have: computational math/physics/chemistry background, experience with quantum algorithms and a quantum programming toolkit

NERSC/ATG mentor: Katie Klymko (kklymko@lbl.gov)