NERSC: Powering Scientific Discovery Since 1974

NERSC Summer Internships

NERSC hosts a number of internships every summer. Applicants must be students, actively enrolled in undergraduate or graduate programs. These are paid internships, but we are unable to provide additional support for travel or housing. Desired technical qualifications are specified with each project description.

2019 Projects

Enhancing Jupyter Capabilities at NERSC

Applications for this project are closed.

CS domain: Data Analysis / System Software
Project description: Scientists love Jupyter because it combines text, visualization, data analytics, and code into a document they can share, modify, and even publish. What about using Jupyter to control experiments in real-time, or steer complex simulations on a supercomputer, or even combine aspects of both workflows — what would it take? We are looking for Python, Jupyter, and JavaScript enthusiasts to help us find ways to expose NERSC’s high-performance computing and storage systems through Jupyter, making supercomputing more literate and more user friendly.
Desired Skills/Background: Python, Jupyter, JavaScript
NERSC mentors: Shane Canon (scanon@lbl.gov), Rollin Thomas (rcthomas@lbl.gov)

Jupyter Tools for the Superfacility

Applications for this project are closed. 

Science/CS domain: Computer science
Project description: Our users love Jupyter because it combines text, visualization, data analytics, and code into a document they can share, modify, and even publish. What about using Jupyter to control experiments in real time, or steer complex simulations on a supercomputer, or even combine aspects of both workflows? What would it take? We are looking for Python, Jupyter, and JavaScript enthusiasts to help us find ways to expose NERSC’s high-performance computing and storage systems through Jupyter, making supercomputing more literate and more user friendly. You will work directly with science use cases, including the Advanced Light Source and the National Center for Electron Microscopy, to build tools using Jupyter notebooks and JupyterLab that allow users to interact directly with large distributed workflows for high-performance computing and data-intensive science.
Desired Skills/Background: CS background, Jupyter, JavaScript, Python, REST and web technologies.
NERSC mentors: Shreyas Cholia (scholia@lbl.gov), Rollin Thomas (rcthomas@lbl.gov)

Physics-informed GANs for complex systems

Science/CS domain: Physics / Machine Learning / Deep Learning
Project description: Generative Adversarial Networks (GANs) are a powerful machine learning method that has had significant success recently in tackling challenging problems in computer vision and speech recognition. GANs have also been shown to be able to replicate the complex distributions of physical systems, including in cosmology and turbulence. However, systematic ways of incorporating the wealth of knowledge about the physics of such systems remain largely unexplored. In this project we propose to systematically incorporate constraints from the physics (governing equations) and statistics (emergent properties) of complex systems into a GAN framework in order to improve the interpolation and extrapolation properties, accuracy, speed, and stability of GANs.
Desired Skills/Background: Python, Machine Learning, Physics, Math
NERSC mentors: Karthik Kashinath (kkashinath@lbl.gov), Adrian Albert (aalbert@lbl.gov)

Spatio-temporal GANs for complex systems, with applications to turbulent flows and hydro/climate modeling

Science/CS domain: Physics / Machine Learning / Deep Learning
Project description: Generative Adversarial Networks (GANs) are a powerful machine learning method that has had significant success recently in tackling challenging problems in computer vision and speech recognition. GANs have also been shown to be able to replicate the complex distributions of physical systems, including in cosmology and turbulence. However, a systematic treatment of the temporal characteristics of such systems remains unexplored. In this project we propose to systematically incorporate the temporal evolution and coherence of complex systems into a GAN framework in order to predict the space-time evolution of turbulent flows. This project will lead the way in synthesizing work in spatio-temporal statistics and ML with state-of-the-science generative models.
Desired Skills/Background: Python, Machine Learning, Physics, Math
NERSC mentors: Karthik Kashinath (kkashinath@lbl.gov), Adrian Albert (aalbert@lbl.gov)

Deep Learning on graph structured scientific data

Science/CS domain: physics / machine learning
Project description: Many scientific domains have data in irregularly structured forms such as non-uniform grids, point clouds, graphs, and meshes. A growing class of deep learning models known as Geometric Deep Learning combines the power of deep neural networks with the ability to exploit the rich structure and relationships in these datasets to enable scientific discovery. In this project, you will explore new applications of Graph Neural Networks to solve problems in a domain chosen according to your experience and interests. Possible problems include pattern recognition, classification, and generative modeling, while possible domains include particle physics, cosmology, molecular science, and environmental science.
Desired Skills/Background: Deep Learning
NERSC mentor: Steve Farrell (sfarrell@lbl.gov)

Automating Neural Network Search

Science/CS domain: deep learning / software engineering
Project description: Neural networks are powering the majority of the recent achievements claimed by machine learning. Designing and tuning these networks for an ever-increasing number of applications is becoming a major challenge. In this project, we will work on evaluating and productionizing platforms for carrying out hyperparameter and neural architecture search at large scale. This project can be a good entry point into cutting-edge innovations in deep learning for someone with good engineering skills.
Desired Skills/Background: Computer clusters, scripting, interest in deep learning
NERSC mentors: Steve Farrell (sfarrell@lbl.gov), Mustafa Mustafa (mmustafa@lbl.gov)

Deep Learning for Cross-Scale Material Analysis

Science/CS domain: Deep Learning / Physics / GIS / Chemistry
Project description: High-resolution hyperspectral images of materials (e.g., nano-scale images) help scientists identify the chemical composition of substrates. However, such high-resolution imaging is prohibitively expensive for characterizing the composition of large substrates. At these larger scales, scientists take low-resolution hyperspectral data (e.g., micro-scale). The challenge is to use the limited high-resolution spectral data to “improve” the resolution of the lower-resolution spectra. This project aims to use neural networks to perform this cross-scale analysis on spectroscopic images of shale rocks. A stretch goal is to apply the same approach to cross-scale remote sensing images.
Desired Skills/Background: Deep Learning OR Physics/Chemistry/GIS with strong interest in neural networks
Earth and Environmental Science Mentor: Zhao Hao (zhao@lbl.gov)
NERSC mentor: Mustafa Mustafa (mmustafa@lbl.gov)

Supercomputing API server

CS domain: API server development, API, Databases
Project description: This project develops the foundations of an API server to handle real-time experiment data processing. This would entail running a REST API server on our in-house Docker architecture. The server would keep data in a relational database. The first supported API is the “Status API,” which reports the status and expected maintenance times of the various compute and storage components at NERSC. The server should be designed so that adding future APIs to support other functionality (e.g., data movement, workflow processing) is straightforward.
Desired Skills/Background: API server development using Node or Python, Relational Databases, Web technologies
NERSC Mentor: Gabor Torok (gtorok@lbl.gov)
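
To make the Status API idea above concrete, here is a minimal sketch in Python using only the standard library. It is not the project's design: the table layout, the route names, the `handle_get` helper, and the sample maintenance time are all hypothetical, and SQLite stands in for whichever relational database the server would actually use. A small route table shows one way that further APIs could be added later without touching the dispatch logic.

```python
import json
import sqlite3

def init_db(conn):
    """Create the (hypothetical) status table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS system_status ("
        "name TEXT PRIMARY KEY, status TEXT NOT NULL, next_maintenance TEXT)"
    )
    conn.commit()

def set_status(conn, name, status, next_maintenance=None):
    """Insert or update the status row for one compute or storage system."""
    conn.execute(
        "INSERT OR REPLACE INTO system_status (name, status, next_maintenance) "
        "VALUES (?, ?, ?)",
        (name, status, next_maintenance),
    )
    conn.commit()

def status_api(conn, args):
    """Handle /status (all systems) and /status/<name> (one system)."""
    query = "SELECT name, status, next_maintenance FROM system_status"
    if not args:
        rows = conn.execute(query + " ORDER BY name").fetchall()
    elif len(args) == 1:
        rows = conn.execute(query + " WHERE name = ?", (args[0],)).fetchall()
        if not rows:
            return 404, {"error": "unknown system"}
    else:
        return 404, {"error": "not found"}
    keys = ("name", "status", "next_maintenance")
    systems = [dict(zip(keys, row)) for row in rows]
    return 200, systems if not args else systems[0]

# Route table: a future API (e.g. data movement) would add one entry here.
ROUTES = {"status": status_api}

def handle_get(conn, path):
    """Dispatch a GET path and return (http_status_code, json_body)."""
    parts = [p for p in path.split("/") if p]
    handler = ROUTES.get(parts[0]) if parts else None
    if handler is None:
        return 404, json.dumps({"error": "not found"})
    code, payload = handler(conn, parts[1:])
    return code, json.dumps(payload)
```

For example, after `conn = sqlite3.connect(":memory:")`, `init_db(conn)`, and `set_status(conn, "cori", "active", "2019-07-01T08:00")` (an invented maintenance time), `handle_get(conn, "/status/cori")` returns a 200 code with a JSON object describing that system. Wiring `handle_get` into an HTTP framework is left out of the sketch.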

OAuth2 and SAML Authentication for API and Web Services

CS domain: Authentication, Web service development
Project description: This project will build core infrastructure for using modern web authentication protocols for API and web services in a supercomputing environment. Today we commonly see the ability to login to web services using a Google or Facebook account, and this project will use the same underlying technologies, but in a scientific context. The work will include helping to design an authentication and authorization model as well as helping to develop and deploy the authentication infrastructure itself. We will be working directly with researchers and colleagues at other facilities to test and prototype the system with a variety of large-scale research projects.
Desired Skills/Background: Software development using Java, Python, or other languages, Authentication technologies, Web technologies
NERSC Mentor: Mark Day (mrday@lbl.gov)

Queue modelling using SLURM simulation

Science/CS domain: CS / Math
Project description: The goal of this project is to analyse traces of at least a year's worth of scheduling data using the Slurm simulator to understand whether the current scheduling model imposes an upper limit on resource management. Our current queue design still leaves some jobs waiting indefinitely without ever being scheduled. The student will study and develop algorithmic approaches to tackle this problem and apply them to the current design of the NERSC queues. Can we create better queuing models so that no job is ever un-schedulable?
Desired Skills/Background: CS and/or Math background, with an interest in modelling and queueing theory.
NERSC mentor: Aditi Gaur (agaur@lbl.gov)

Intent-based network resource management for the superfacility model

Science/CS domain: Computer Science, networking
Project description: Intent-based resource management can enhance the performance of scientific workflows based on high-level user requirements. Intent can facilitate scientific workflows by automating the provisioning of networking and other resources such as computing and storage. This project aims to investigate how intent is translated to 'elastic' resource configurations, especially to reserve bandwidth and assign QoS values on a network link for a certain time duration. We will build on prior work with INDIRA, EVIAN and SENSE to develop an intent-based resource scheduling method with policy-based negotiation, and to study resource sharing policies and algorithms.
Desired Skills/Background: CS background. Python 3, Flask, RDF, Markup languages, queuing theory
NERSC mentors: Alex Sim (asim@lbl.gov), Mariam Kiran (mkiran@es.net)

Systems Data Analyst

Project description: NERSC's flagship system, Cori, is presently the twelfth fastest supercomputer in the world and generates tens of terabytes of system monitoring data per day. Analysis of hardware counters and system logs will increase understanding of the performance of individual applications and of the system as a whole. The candidate will analyze these data to help improve the design and operation of existing and future systems.
Desired Skills: statistical analysis techniques (including machine learning), Python and libraries relevant to data analytics (including scikit-learn, Pandas, PySpark and matplotlib), knowledge of HPC systems architecture and applications.
NERSC mentors: Taylor Groves (tgroves@lbl.gov), Brian Austin (baustin@lbl.gov)

Mixed precision performance and accuracy

Project description: Many HPC codes rely entirely on 64-bit floating point types. However, significant performance gains can be obtained through reduced precision. 32-bit floating point operations are widely supported, and 16-bit operations are natively supported on some architectures, such as GPUs. This project will involve implementing mixed precision kernels and evaluating their performance in the context of material physics applications.
Desired Skills: computer science, math, or physics; Fortran, GPU, performance analysis
NERSC mentors: Brandon Cook (bgcook@lbl.gov), Thorsten Kurth (tkurth@lbl.gov)
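
The accuracy side of the mixed precision trade-off described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: it emulates 32-bit rounding in pure Python with the standard `struct` module, so it says nothing about speed, and the helper names `to_f32`, `sum_f32`, and `sum_mixed` are invented for this example. It compares a sum carried out entirely in 32-bit precision with a mixed precision variant that keeps 32-bit inputs but accumulates in 64-bit.

```python
import struct

def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_f32(values):
    """Sum with a 32-bit accumulator: each partial sum is rounded to float32."""
    acc = 0.0
    for v in values:
        acc = to_f32(acc + v)
    return acc

def sum_mixed(values):
    """Sum 32-bit inputs with a 64-bit accumulator (mixed precision)."""
    acc = 0.0
    for v in values:
        acc += v
    return acc

# Inputs are stored in 32-bit precision in both cases.
values = [to_f32(0.1)] * 100_000
reference = to_f32(0.1) * 100_000  # one 64-bit multiply, used as the baseline

err_f32 = abs(sum_f32(values) - reference)
err_mixed = abs(sum_mixed(values) - reference)
print(f"32-bit accumulator error: {err_f32:.3e}")
print(f"64-bit accumulator error: {err_mixed:.3e}")
```

The pattern of storing data in a narrow type while accumulating in a wider one is one common form of mixed precision; whether it suits a given material physics kernel is exactly the kind of question the project would evaluate.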

Automating Performance Analysis

Project description: A number of HPC codes that run at NERSC and elsewhere need to quantify their performance regularly. The standard approach is to implement instrumentation manually and to use profilers interactively. NERSC has been developing a framework that simplifies recording these metrics (timing, memory, hardware counters, rooflines) and can upload them to a database that tracks them over time. The framework is currently available for stand-alone or hybrid C/C++/CUDA/Python codes and supports multiple levels of parallelism. Implementing an arbitrary number of metrics in a region of code typically requires only 1 or 2 lines, and output to text and/or JSON is automated at application termination. The JSON output can then be read by the Python interface and plotted. In the context of continuous integration, the outputs and plots are uploaded to a testing dashboard hosted at NERSC. The framework is still in beta testing and the candidate will have a number of options to work on, including but not limited to: streamlining continuous integration (Python, CMake/CTest/CDash), adding tools/capabilities (e.g. CUPTI support), writing Fortran bindings, implementing the time-series tracking (Python, web), and integrating the framework into existing projects. A candidate who chooses to work on integrating the framework into existing projects will gain exposure to the large number of research projects across a variety of domains that work with NERSC.
Desired Skills: Performance analysis, experience in one or more of the following: C, C++, Python, CUDA, Fortran, GPUs, web, CMake/CTest 
NERSC mentors: Jonathan Madsen (jrmadsen@lbl.gov), Brandon Cook (bgcook@lbl.gov)
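
As a rough illustration of the "1 or 2 lines per region" idea in the description above, the sketch below times labeled regions of code and reports the totals as JSON. It is not the NERSC framework or its API: `timed_region`, `report_json`, and the record layout are hypothetical names chosen for this example, and only wall-clock time is recorded.

```python
import json
import time
from contextlib import contextmanager

# Accumulated measurements, keyed by region label (hypothetical layout).
_records = {}

@contextmanager
def timed_region(label):
    """Record wall-clock time spent inside the `with` block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        entry = _records.setdefault(label, {"calls": 0, "wall_seconds": 0.0})
        entry["calls"] += 1
        entry["wall_seconds"] += elapsed

def report_json():
    """Serialize all recorded regions as a JSON string."""
    return json.dumps(_records, indent=2, sort_keys=True)

# Instrumenting a region takes one extra line around the existing code.
for _ in range(3):
    with timed_region("inner_loop"):
        sum(i * i for i in range(10_000))

print(report_json())
```

A tool in this style could register the report function with Python's `atexit` module so that output is written automatically at termination, and the resulting JSON could then be loaded and plotted or uploaded by a separate script.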