NERSC: Powering Scientific Discovery Since 1974

NERSC Summer Internships

NERSC hosts a number of internships every summer. Applicants must be students, actively enrolled in undergraduate or graduate programs. These are paid internships, but we are unable to provide additional support for travel or housing. Desired technical qualifications are specified with each project description.

See this year's list of projects below. If you are interested in a project, please apply at the following link and also contact the listed staff member with your CV.

2019 projects

Enhancing Jupyter Capabilities at NERSC

CS domain: Data Analysis / System Software
Project description: Scientists love Jupyter because it combines text, visualization, data analytics, and code into a document they can share, modify, and even publish. What about using Jupyter to control experiments in real-time, or steer complex simulations on a supercomputer, or even combine aspects of both workflows --- what would it take? We are looking for Python, Jupyter, and JavaScript enthusiasts to help us find ways to expose NERSC’s high-performance computing and storage systems through Jupyter, making supercomputing more literate and more user friendly.
Desired Skills/Background: Python, Jupyter, JavaScript
NERSC mentor: Shane Canon (scanon@lbl.gov), Rollin Thomas (rcthomas@lbl.gov)

Jupyter Tools for the Superfacility

Science/CS domain:  Computer science
Project description: Our users love Jupyter because it combines text, visualization, data analytics, and code into a document they can share, modify, and even publish. What about using Jupyter to control experiments in real-time, or steer complex simulations on a supercomputer, or even combine aspects of both workflows --- what would it take? We are looking for Python, Jupyter, and JavaScript enthusiasts to help us find ways to expose NERSC’s high-performance computing and storage systems through Jupyter, making supercomputing more literate and more user friendly. You will work directly with science use cases, including the Advanced Light Source and the National Center for Electron Microscopy, to build tools using Jupyter notebooks and JupyterLab that allow users to interact directly with large distributed workflows for high-performance computing and data-intensive science.
Desired Skills/Background:  CS background, Jupyter, JavaScript, Python, REST and web technologies.
NERSC mentor: Shreyas Cholia <scholia@lbl.gov>, Rollin Thomas <rcthomas@lbl.gov>

Physics-informed GANs for complex systems

Science/CS domain: Physics / Machine Learning / Deep Learning
Project description: Generative Adversarial Networks (GANs) are a powerful machine learning method that has recently had significant success in tackling challenging problems in computer vision and speech recognition. GANs have also been shown to replicate the complex distributions of physical systems, including in cosmology and turbulence. However, systematic ways of incorporating the wealth of knowledge about the physics of such systems remain largely unexplored. In this project we propose to systematically incorporate constraints from the physics (governing equations) and statistics (emergent properties) of complex systems into a GAN framework in order to improve its interpolation and extrapolation properties, accuracy, speed, and stability.
Desired Skills/Background: Python, Machine Learning, Physics, Math
NERSC mentor: Karthik Kashinath (kkashinath@lbl.gov), Adrian Albert (aalbert@lbl.gov)

Spatio-temporal GANs for complex systems, with applications to turbulent flows and hydro/climate modeling

Science/CS domain: Physics / Machine Learning / Deep Learning
Project description: Generative Adversarial Networks (GANs) are a powerful machine learning method that has recently had significant success in tackling challenging problems in computer vision and speech recognition. GANs have also been shown to replicate the complex distributions of physical systems, including in cosmology and turbulence. However, a systematic treatment of the temporal characteristics of such systems remains unexplored. In this project we propose to systematically incorporate the temporal evolution and coherence of complex systems into a GAN framework in order to predict the space-time evolution of turbulent flows. This project will lead the way in synthesizing work in spatio-temporal statistics and ML with state-of-the-science generative models.
Desired Skills/Background: Python, Machine Learning, Physics, Math
NERSC mentor: Karthik Kashinath (kkashinath@lbl.gov), Adrian Albert (aalbert@lbl.gov) 

Deep Learning on graph structured scientific data

Science/CS domain: physics / machine learning
Project description: Many scientific domains have data in irregularly structured forms such as non-uniform grids, point clouds, graphs, and meshes. A growing class of deep learning models known as Geometric Deep Learning combines the power of deep neural networks with the ability to exploit the rich structure and relationships in these datasets to enable scientific discovery. In this project, you will explore new applications of Graph Neural Networks to solve problems in a domain chosen according to the experience and interest of the applicant. Possible problems include pattern recognition, classification, and generative modeling, while possible domains include particle physics, cosmology, molecular science, and environmental science.
Desired Skills/Background: Deep Learning
NERSC mentor: Steve Farrell (sfarrell@lbl.gov) 

Automating Neural Network Search

Science/CS domain: deep learning / software engineering
Project description: Neural networks are powering the majority of the recent achievements claimed by machine learning. Designing and tuning these networks for an ever-increasing number of applications is becoming a major challenge. In this project, we will work on evaluating and productionizing platforms for carrying out hyperparameter and neural architecture search at large scale. This project can be a good entry point into cutting-edge innovations in deep learning for someone with good engineering skills.
Desired Skills/Background: Computer clusters, scripting, interest in deep learning
NERSC mentor: Steve Farrell (sfarrell@lbl.gov), Mustafa Mustafa (mmustafa@lbl.gov)

Deep Learning for Cross-Scale Material Analysis

Science/CS domain: Deep Learning/ Physics/ GIS/ Chemistry
Project description: High-resolution hyperspectral images of materials (e.g. nano-scale images) help scientists identify the chemical composition of substrates. However, such high-resolution images are prohibitively expensive for characterizing the composition of large substrates. At these larger scales, scientists take low-resolution hyperspectral data (e.g. micro-scale). The challenge is to use the limited high-resolution spectral data to “improve” the resolution of the lower-resolution spectra. This project aims to use neural networks to perform this cross-scale analysis on spectroscopic images of shale rocks. A stretch goal is to apply the same approach to cross-scale remote sensing images.
Desired Skills/Background: Deep Learning OR Physics/Chemistry/GIS with strong interest in neural networks
Earth and Environmental Science Mentor: Zhao Hao (zhao@lbl.gov)
NERSC mentor: Mustafa Mustafa (mmustafa@lbl.gov)

Supercomputing API server

CS domain: API server development, API, Databases
Project description: This project develops the foundations of an API server to handle real-time experiment data processing. This would entail running a REST API server on our in-house Docker architecture. The server would keep its data in a relational database. The first API supported is the “Status API”, which reports the status and expected maintenance times of the various compute and storage components at NERSC. The server should be designed so that adding future APIs to support other functionality (e.g., data movement, workflow processing) is straightforward.
Desired Skills/Background: API server development using Node or Python, Relational Databases, Web technologies
NERSC Mentor: Gabor Torok (gtorok@lbl.gov)
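
To make the "Status API" idea above concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the project's design: the table layout, the route table `ROUTES`, the `/status/<name>` path, and the sample maintenance time are all hypothetical. Requests are dispatched through a dictionary keyed by the first path segment, so a future API (e.g., data movement) would be one more handler in that dictionary.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical schema: one row per compute or storage component.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS system_status ("
        "name TEXT PRIMARY KEY, status TEXT NOT NULL, next_maintenance TEXT)"
    )


def set_status(conn, name, status, next_maintenance=None):
    # Insert a component or overwrite its current record.
    conn.execute(
        "INSERT OR REPLACE INTO system_status VALUES (?, ?, ?)",
        (name, status, next_maintenance),
    )


def status_handler(conn, args):
    # GET /status/<name>: report status and next maintenance window.
    if len(args) != 1:
        return 404, {"error": "not found"}
    row = conn.execute(
        "SELECT name, status, next_maintenance FROM system_status "
        "WHERE name = ?",
        (args[0],),
    ).fetchone()
    if row is None:
        return 404, {"error": "unknown system"}
    return 200, dict(zip(("name", "status", "next_maintenance"), row))


# Route table (an assumption of this sketch): new APIs register here.
ROUTES = {"status": status_handler}


def handle_request(conn, method, path):
    """Dispatch a request; returns (http_status_code, json_body_string)."""
    parts = [p for p in path.split("/") if p]
    if method == "GET" and parts and parts[0] in ROUTES:
        code, payload = ROUTES[parts[0]](conn, parts[1:])
    else:
        code, payload = 404, {"error": "not found"}
    return code, json.dumps(payload)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    # Sample data only; the maintenance time is made up for illustration.
    set_status(conn, "cori", "active", "2019-07-01T08:00")
    print(handle_request(conn, "GET", "/status/cori"))
```

A real deployment would sit behind an HTTP framework and a production database; the sketch keeps the routing and storage logic in plain functions so they can be exercised without starting a server.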

OAuth2 and SAML Authentication for API and Web Services

CS domain: Authentication, Web service development
Project description: This project will build core infrastructure for using modern web authentication protocols for API and web services in a supercomputing environment. Today we commonly see the ability to login to web services using a Google or Facebook account, and this project will use the same underlying technologies, but in a scientific context. The work will include helping to design an authentication and authorization model as well as helping to develop and deploy the authentication infrastructure itself. We will be working directly with researchers and colleagues at other facilities to test and prototype the system with a variety of large-scale research projects.
Desired Skills/Background: Software development using Java, Python, or other languages, Authentication technologies, Web technologies
NERSC Mentor: Mark Day (mrday@lbl.gov)

Queue modelling using SLURM simulation

Science/CS domain:  CS/ Math
Project description: The goal of this project is to analyse traces of at least a year's worth of scheduling data using the Slurm simulator to understand whether the current scheduling model imposes an upper limit on resource management. Our current queue design still leaves multiple jobs always waiting to be scheduled (but never actually scheduled). The student will understand and develop algorithmic approaches to tackle this problem and apply them to the current design of the NERSC queues. Can we create better queuing models so that there are never un-schedulable jobs?
Desired Skills/Background: CS and/or math background, with an interest in modelling and queueing theory.
NERSC mentor: Aditi Gaur (agaur@lbl.gov)
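
The kind of starvation described above can be reproduced with a toy model. The sketch below is not the Slurm simulator and does not reflect NERSC's queue configuration; it assumes a deliberately naive greedy first-fit scheduler with no reservations, and the job names, sizes, and arrival pattern are hypothetical. A job needing the whole machine never starts because a steady stream of small jobs always keeps some nodes busy.

```python
from collections import namedtuple

# Hypothetical job record: arrival and runtime are in abstract time steps.
Job = namedtuple("Job", "name arrival nodes runtime")


def simulate_first_fit(jobs, total_nodes, horizon):
    """Greedy first-fit scheduling without reservations.

    At each time step, every waiting job that fits in the currently free
    nodes is started, in queue order. Returns a dict mapping job name to
    start time; jobs that never start within the horizon are absent.
    """
    pending = sorted(jobs, key=lambda j: j.arrival)
    waiting = []
    running = []  # list of (end_time, nodes)
    started = {}
    for t in range(horizon):
        # Release nodes held by jobs that have finished by time t.
        running = [(end, n) for end, n in running if end > t]
        # Move newly arrived jobs into the waiting queue.
        while pending and pending[0].arrival <= t:
            waiting.append(pending.pop(0))
        free = total_nodes - sum(n for _, n in running)
        still_waiting = []
        for job in waiting:
            if job.nodes <= free:
                running.append((t + job.runtime, job.nodes))
                free -= job.nodes
                started[job.name] = t
            else:
                still_waiting.append(job)
        waiting = still_waiting
    return started


def starvation_workload(horizon):
    # One 2-node, 2-step job arrives every step; a 4-node job arrives at t=1.
    small = [Job(f"small{k}", k, 2, 2) for k in range(horizon)]
    wide = Job("wide", 1, 4, 1)
    return small + [wide]


if __name__ == "__main__":
    horizon = 20
    result = simulate_first_fit(starvation_workload(horizon), 4, horizon)
    print("wide job started:", "wide" in result)
    print("small jobs started:", len(result))
```

Schedulers that reserve resources for the highest-priority waiting job avoid this particular failure, at the cost of leaving some nodes idle; quantifying such trade-offs against real traces is the sort of question the project poses.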

Intent-based network resource management for superfacility model

Science/CS domain:  Computer Science, networking
Project description: Intent-based resource management can enhance the performance of scientific workflows based on high-level user requirements. Intent can facilitate scientific workflows by automating the provisioning of networking and other resources such as computing and storage. This project aims to investigate how intent is translated into 'elastic' resource configurations, especially to reserve bandwidth and assign QoS values on a network link for a given duration. We will build on prior work with INDIRA, EVIAN and SENSE to develop an intent-based resource scheduling method with policy-based negotiation, and to study resource sharing policies and algorithms.
Desired Skills/Background:  CS background. Python 3, Flask, RDF, Markup languages, queuing theory
NERSC mentor: Alex Sim <asim@lbl.gov>, Mariam Kiran <mkiran@es.net>

Systems Data Analyst

Project description: NERSC's flagship system, Cori, is presently the twelfth fastest supercomputer in the world and generates tens of terabytes of system monitoring data per day.  Analysis of hardware counters and system logs will increase understanding of the performance of individual applications and of the system as a whole. The candidate will analyze these data to help improve the design and operation of existing and future systems.

Desired Skills: statistical analysis techniques (including machine learning), Python and libraries relevant to data analytics (including scikit-learn, Pandas, PySpark and matplotlib).  HPC systems architecture and applications knowledge.

NERSC mentor:  Taylor Groves, Brian Austin

HPC OpenMP and accelerators

Project description: The DOE uses OpenMP to improve application performance on modern supercomputers. There is particular interest in newer OpenMP standards that support offloading computations to accelerators such as GPUs and FPGAs. In this project, the summer intern will evaluate OpenMP on accelerators to improve our understanding of how to use OpenMP efficiently on Perlmutter and other supercomputers with accelerators.

Desired skills: familiarity with Linux environments and programming in C, C++ or Fortran. OpenMP or GPU programming is strongly preferred.

NERSC mentor: Chris Daley