Early Career Achievement Award Seminar Series

Overview

NERSC is hosting an online seminar series featuring talks from, and discussions with, the recipients of the NERSC Achievement Awards for early career scientists. The speakers will give a description of their research and significant results, describe their computational methods and/or strategies, and relate notable HPC challenges or successes at NERSC. They will also share their thoughts on what it’s like to be an early career computational scientist in today's environment.

The talks are open to anyone; see "Connection Information" below.

2022 Schedule

Date	Presenter	Title	Time (Pacific / Eastern)
Tuesday, December 13	Giulia Palermo, University of California, Riverside	Dynamics and mechanisms of CRISPR-Cas9 through the lens of computational methods	11:00 PST / 2:00 EST
	Bin Ouyang, Florida State University	High entropy design principles for battery materials with fast diffusion	11:30 PST / 2:30 EST
	Andi Gu, Harvard University	GIGA-Lens: A Fast Differentiable Bayesian Inference Framework for Strong Lensing	12:00 PST / 3:00 EST
Wednesday, December 14	Chirag Jain, Indian Institute of Science	Sketch-based algorithms for large-scale whole-genome comparisons	9:00 PST / 12:00 EST

Connection Information

Berkeley Lab employees and affiliates: the Zoom information is on the "NERSC Public Events" calendar.
NERSC Users: See your NERSC weekly email or this page.

Abstracts

Dynamics and mechanisms of CRISPR-Cas9 through the lens of computational methods

Giulia Palermo, University of California, Riverside

December 13, 2022
11:00-11:30 Pacific Time

The clustered regularly interspaced short palindromic repeat (CRISPR) genome-editing revolution established the beginning of a new era in life sciences. I will report the role of state-of-the-art computations in the CRISPR-Cas9 revolution, from the early refinement of cryo-EM data to enhanced simulations of large-scale conformational transitions. Molecular simulations reported a mechanism for RNA binding and the formation of a catalytically competent Cas9 enzyme, in agreement with subsequent structural studies. Inspired by single-molecule experiments, molecular dynamics offered a rationale for the onset of off-target effects, while graph theory unveiled the allosteric regulation. Finally, the use of a mixed quantum-classical approach established the catalytic mechanism of DNA cleavage. Overall, molecular simulations have been instrumental in understanding the dynamics and mechanism of CRISPR-Cas9, contributing to understanding function, catalysis, allostery, and specificity.

High entropy design principles for battery materials with fast diffusion

Bin Ouyang, Florida State University

December 13, 2022
11:30-12:00 Pacific Time

High entropy materials offer many opportunities across structural and functional applications. However, research into whether they are useful for battery applications is only just emerging. The first question to ask is: what can “high entropy” do for battery materials? In this presentation, we provide a theoretical and computational answer to this question, with confirmation from experiments. With great support from high-throughput first-principles calculations at NERSC, a world-leading supercomputing platform, we have found that high entropy provides a) a more random atomic distribution, which helps avoid the formation of detrimental chemical short-range order that hurts Li percolation; and b) a rich ensemble of local environments that can facilitate new percolation pathways for superionic conduction. Finding a) has led to the discovery of low-cost battery cathodes that can charge within 10 min to reach commercialized Li-ion battery capacity. Finding b) has revealed a general mechanism that can turn poor ionic conductors into superionic conductors, which creates a new design paradigm for battery electrodes and solid-state electrolytes.

GIGA-Lens: A Fast Differentiable Bayesian Inference Framework for Strong Lensing

Andi Gu, Harvard University

December 13, 2022
12:00-12:30 Pacific Time

Strong gravitational lensing systems constitute a powerful tool for cosmology. They are uniquely suited to probe the low end of the dark matter mass function and test the predictions of the cold dark matter model beyond the local universe. Multiply lensed quasars (and supernovae in the near future) are being used to provide independent constraints on the Hubble constant. In this talk, we present GIGA-Lens: a gradient-informed, GPU-accelerated Bayesian framework for modeling strong gravitational lensing systems, implemented in TensorFlow and JAX. Its three components (optimization using multi-start gradient descent, posterior covariance estimation with variational inference, and sampling via Hamiltonian Monte Carlo) all take advantage of gradient information through automatic differentiation and parallelization on GPUs. The average time to model a simulated system on four Nvidia A100 GPUs is 105 seconds. The robustness, speed, and scalability offered by this framework make it possible to model the large number of strong lenses found in current surveys and present a very promising prospect for the modeling of O(10^5) lensing systems expected to be discovered in the era of the Rubin Observatory, Euclid, and Roman Space Telescope.

Sketch-based algorithms for large-scale whole-genome comparisons

Chirag Jain, Indian Institute of Science

December 14, 2022
9:00-9:30 Pacific Time

In the era of exponential data growth, sketching has become a standard algorithmic technique for rapid genome sequence comparison. I will describe our work on a fast, lightweight approximate sequence mapping algorithm that uses minimizer sampling and MinHash techniques. The proposed algorithm computes the positional origin of a query sequence in a given reference and estimates nucleotide-level identity under an assumed probabilistic model of mutations. We show an application of this algorithm in quantifying relatedness between two microbial genomes and its impact in tackling a long-standing biological question. Microbiologists are increasingly turning to approaches driven by whole-genome sequencing to address fundamental questions associated with ecology. This involves quantifying the similarity of two or more genomes, e.g., to check whether a newly sequenced genome is novel, or where else it has been seen before. We developed the FastANI (Average Nucleotide Identity) software using the proposed approximate sequence matching framework to quantify the similarity of two or more genomes. Our algorithmic improvements, coupled with parallelizability, allowed us to index an entire database of 90,000 bacterial genomes and compute pairwise ANI values among all pairs of genomes for the first time. This analysis sheds light on the extent to which microbes form discrete clusters (species).
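
For readers unfamiliar with sketching, the following is a minimal, illustrative Python sketch of the general MinHash idea mentioned in the abstract: estimating the Jaccard similarity of two sequences' k-mer sets from a small fixed-size signature. It is not the FastANI implementation and omits minimizer sampling and the identity model entirely. The function names, the k-mer length, the number of hash functions, and the use of seeded BLAKE2b hashes are all assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """Seeded 64-bit hash; each seed acts as a separate hash function."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(seq, k=5, num_hashes=128):
    """Signature: the minimum hash value over all k-mers, per hash function."""
    ks = kmers(seq, k)
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of signature positions that agree estimates the Jaccard index."""
    matches = sum(x == y for x, y in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


def exact_jaccard(seq_a, seq_b, k=5):
    """Exact Jaccard index of the two k-mer sets, for comparison."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Two toy sequences that differ by a single substitution (hypothetical data).
    s1 = "ACGTACGTTAGCTAGCTAGGATCCATCGATCGTTAGC"
    s2 = "ACGTACGTTAGCTAGCTAGGATCGATCGATCGTTAGC"
    print("estimated:", estimate_jaccard(minhash_sketch(s1), minhash_sketch(s2)))
    print("exact:    ", exact_jaccard(s1, s2))
```

The signature length stays fixed regardless of sequence length, which is what makes comparisons across very large collections cheap; the estimate becomes more accurate as the number of hash functions grows.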