Jack Deslippe specializes in the support of materials science applications and users at NERSC. He is engaged in evaluating and improving the suitability of these applications for potential N8 architectures. He additionally works on bringing dynamic web content to users through MyNERSC, the MOTD system, Completed Jobs Pages, ALS Science Gateway Projects and the NERSC mobile site, m.nersc.gov.
Jack is a PI on a SciDAC project (http://excited-state-scidac.org/) and is one of the lead developers of the BerkeleyGW package for computing the excited state properties of materials.
Jack is the NERSC PI on the Berkeley Lab Directed Research project that is delivering real-time data analysis to ALS scientists through ESnet and NERSC resources. He is the developer of the ALS analysis and simulation web portal at NERSC.
He received a Ph.D. from UC Berkeley in physics in 2011. His research centered on materials physics and nano-science: scaling many-body Green's function computational methods for the study of the optical properties of materials with large and complex structures.
Douglas Doerfler, Brian Austin, Brandon Cook, Jack Deslippe, Krishna Kandalla, Peter Mendygral, "Evaluating the Networking Characteristics of the Cray XC-40 Intel Knights Landing Based Cori Supercomputer at NERSC", Concurrency and Computation: Practice and Experience, Volume 30, Issue 1, September 12, 2017,
Yun (Helen) He, Brandon Cook, Jack Deslippe, Brian Friesen, Richard Gerber, Rebecca Hartman-Baker, Alice Koniges, Thorsten Kurth, Stephen Leak, WooSun Yang, Zhengji Zhao, Eddie Baron, Peter Hauschildt, "Preparing NERSC users for Cori, a Cray XC40 system with Intel Many Integrated Cores", Concurrency and Computation: Practice and Experience, August 2017, 30, doi: 10.1002/cpe.4291
The newest NERSC supercomputer Cori is a Cray XC40 system consisting of 2,388 Intel Xeon Haswell nodes and 9,688 Intel Xeon‐Phi “Knights Landing” (KNL) nodes. Compared to the Xeon‐based clusters NERSC users are familiar with, optimal performance on Cori requires consideration of KNL mode settings; process, thread, and memory affinity; fine‐grain parallelization; vectorization; and use of the high‐bandwidth MCDRAM memory. This paper describes our efforts preparing NERSC users for KNL through the NERSC Exascale Science Application Program, Web documentation, and user training. We discuss how we configured the Cori system for usability and productivity, addressing programming concerns, batch system configurations, and default KNL cluster and memory modes. System usage data, job completion analysis, programming and running jobs issues, and a few successful user stories on KNL are presented.
Taylor A. Barnes, Thorsten Kurth, Pierre Carrier, Nathan Wichmann, David Prendergast, Paul RC Kent, Jack Deslippe, "Improved treatment of exact exchange in Quantum ESPRESSO.", Computer Physics Communications, May 31, 2017,
MeiYue Shao, Lin Lin, Chao Yang, Fang Liu, Felipe H. Da Jornada, Jack Deslippe, Steven G. Louie, "Low rank approximation in G0W0 calculations.", Science China Mathematics, August 1, 2016,
SV Venkatakrishnan, K Aditya Mohan, Keith Beattie, Joaquin Correa, Eli Dart, Jack R Deslippe, Alexander Hexemer, Harinarayan Krishnan, Alastair A MacDowell, Stefano Marchesini, Simon J Patton, Talita Perciano, James A Sethian, Rune Stromsness, Brian L Tierney, Craig E Tull, Daniela Ushizima, Dilworth Y Parkinson, "Making Advanced Scientific Algorithms and Big Scientific Data Management More Accessible", Electronic Imaging, February 14, 2016, 2016 Is.:1,
Meiyue Shao, Felipe H. da Jornada, Chao Yang, Jack Deslippe, Steven G Louie, "Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem", Linear Algebra and its Applications, January 1, 2016, 488:148,
Michiel J van Setten, Fabio Caruso, Sahar Sharifzadeh, Xinguo Ren, Matthias Scheffler, Fang Liu, Johannes Lischner, Lin Lin, Jack R Deslippe, Steven G Louie, Chao Yang, Florian Weigend, Jeffrey B Neaton, Ferdinand Evers, Patrick Rinke, "GW100: Benchmarking G0W0 for molecular systems", Journal of Chemical Theory and Computation, October 22, 2015, 11:5665,
Fang Liu, Lin Lin, Derek Vigil-Fowler, Johannes Lischner, Alexander F. Kemper, Sahar Sharifzadeh, Felipe Homrich da Jornada, Jack Deslippe, Chao Yang, Jeffrey B. Neaton, Steven G. Louie, "Numerical integration for ab initio many-electron self energy calculations within the GW approximation.", Journal of Computational Physics, April 1, 2015,
Jack Deslippe, Brian Austin, Chris Daley, Woo-Sun Yang, "Lessons learned from optimizing science kernels for Intel's "Knights-Corner" architecture", CISE, April 1, 2015,
Manish Jain, Jack Deslippe, Georgy Samsonidze, M.L. Cohen, J.R. Chelikowsky, S.G. Louie, "Improved quasiparticle wavefunctions and mean field for G0W0 calculations: Diagonalization of the static-COHSEX operator", Physical Review B, September 26, 2014,
Johannes Lischner, Sahar Sharifzadeh, Jack Deslippe, J. Neaton, and S. G. Louie, "Effects of Self-consistency and Plasmon-pole Models on GW Calculations for Closed-shell Molecules", Physical Review B, September 17, 2014,
Kin Fai Mak, Felipe H. da Jornada, Keliang He, Jack Deslippe, Nicholas Petrone, James Hone, Jie Shan, Steven G. Louie, and Tony F. Heinz, "Tuning Many-Body Interactions in Graphene: The Effects of Doping on Excitons and Carrier Lifetimes", Physical Review Letters, May 20, 2014, 112:207401,
Sangkook Choi, Jack Deslippe, R.B. Capaz, S.G. Louie, "An explicit formula for optical oscillator strength of excitons in semiconducting single-walled carbon nanotubes: family behavior", Nano Letters, January 9, 2013,
Jack Deslippe, Georgy Samsonidze, Manish Jain, Marvin L Cohen, Steven G Louie, "Coulomb-hole summations and energies for GW calculations with limited number of empty orbitals: a modified static remainder approach", Physical Review B (arXiv preprint arXiv:1208.0266), 2013,
Jack Deslippe, Georgy Samsonidze, David Strubbe, Manish Jain, Marvin L. Cohen, Steven G. Louie, "BerkeleyGW: A Massively Parallel Computer Package for the Calculation of the Quasiparticle and Optical Properties of Materials", Comput. Phys. Comm., 2012,
Johannes Lischner, Jack Deslippe, Manish Jain, Steven G Louie, "First-Principles Calculations of Quasiparticle Excitations of Open-Shell Condensed Matter Systems", Physical Review Letters, 2012, 109:36406,
Kaihui Liu, Jack Deslippe, Fajun Xiao, Rodrigo B Capaz, Xiaoping Hong, Shaul Aloni, Alex Zettl, Wenlong Wang, Xuedong Bai, Steven G Louie, others, "An atlas of carbon nanotube optical transitions", Nature Nanotechnology, 2012, 7:325--329,
Georgy Samsonidze, Manish Jain, Jack Deslippe, Marvin L Cohen, Steven G Louie, "Simple Approximate Physical Orbitals for GW Quasiparticle Calculations", Physical Review Letters, 2011, 107:186404,
David A Siegel, Cheol-Hwan Park, Choongyu Hwang, Jack Deslippe, Alexei V Fedorov, Steven G Louie, Alessandra Lanzara, "Many-body interactions in quasi-freestanding graphene", Proceedings of the National Academy of Sciences, 2011, 108:11365--113,
Li Yang, Jack Deslippe, Cheol-Hwan Park, Marvin L Cohen, Steven G Louie, "Excitonic effects on the optical response of graphene and bilayer graphene", Physical Review Letters, 2009, 103:186802,
Jack Deslippe, Mario Dipoppa, David Prendergast, Marcus VO Moutinho, Rodrigo B Capaz, Steven G Louie, "Electron-Hole Interaction in Carbon Nanotubes: Novel Screening and Exciton Excitation Spectra", Nano Lett, 2009, 9:1330--1334,
Jack Deslippe, Catalin D Spataru, David Prendergast, Steven G Louie, "Bound excitons in metallic single-walled carbon nanotubes", Nano Letters, 2007, 7:1626--1630,
Feng Wang, David J Cho, Brian Kessler, Jack Deslippe, P James Schuck, Steven G Louie, Alex Zettl, Tony F Heinz, Y Ron Shen, "Observation of excitons in one-dimensional metallic single-walled carbon nanotubes", Physical Review Letters, 2007, 99:227401,
Jack Deslippe, R Tedstrom, Murray S Daw, D Chrzan, T Neeraj, M Mills, "Dynamic scaling in a simple one-dimensional model of dislocation activity", Philosophical Magazine, 2004, 84:2445--2454,
Jianjun Dong, Jack Deslippe, Otto F Sankey, Emmanuel Soignard, Paul F McMillan, "Theoretical study of the ternary spinel nitride system Si3N4-Ge3N4", Physical Review B, 2003, 67:094104,
R. Gayatri, K. Gott, J. Deslippe, "Comparing Managed Memory and ATS with and without Prefetching on NVIDIA Volta GPUs", 2019 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), 2019, 41-46, doi: 10.1109/PMBS49563.2019.00010
One of the major differences in many-core versus multicore architectures is the presence of two different memory spaces: a host space and a device space. In the case of NVIDIA GPUs, the device is supplied with data from the host via one of the multiple memory management API calls provided by the CUDA framework, such as cudaMallocManaged and cudaMemcpy. Modern systems, such as the Summit supercomputer, have the capability to avoid the use of CUDA calls for memory management and access the same data on GPU and CPU. This is done via the Address Translation Services (ATS) technology that gives a unified virtual address space for data allocated with malloc and new if there is an NVLink connection between the two memory spaces. In this paper, we perform a deep analysis of the performance achieved when using two types of unified virtual memory addressing: UVM and managed memory.
C. Yang, R. Gayatri, T. Kurth, P. Basu, Z. Ronaghi, A. Adetokunbo, B. Friesen, B. Cook, D. Doerfler, L. Oliker, J. Deslippe, and S. Williams, "An Empirical Roofline Methodology for Quantitatively Assessing Performance Portability", IEEE International Workshop on Performance, Portability and Productivity in HPC (P3HPC'18),
B. Austin, C. Daley, D. Doerfler, J. Deslippe, B. Cook, B. Friesen, T. Kurth, C. Yang, and N. Wright, "A Metric for Evaluating Supercomputer Performance in the Era of Extreme Heterogeneity", 9th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS'18),
B Friesen, MMA Patwary, B Austin, N Satish, Z Slepian, N Sundaram, D Bard, DJ Eisenstein, J Deslippe, P Dubey, Prabhat, "Galactos: Computing the Anisotropic 3-Point Correlation Function for 2 Billion Galaxies", November 2017, doi: 10.1145/3126908.3126927
The nature of dark energy and the complete theory of gravity are two central questions currently facing cosmology. A vital tool for addressing them is the 3-point correlation function (3PCF), which probes deviations from a spatially random distribution of galaxies. However, the 3PCF's formidable computational expense has prevented its application to astronomical surveys comprising millions to billions of galaxies. We present Galactos, a high-performance implementation of a novel, O(N^2) algorithm that uses a load-balanced k-d tree and spherical harmonic expansions to compute the anisotropic 3PCF. Our implementation is optimized for the Intel Xeon Phi architecture, exploiting SIMD parallelism, instruction and thread concurrency, and significant L1 and L2 cache reuse, reaching 39% of peak performance on a single node. Galactos scales to the full Cori system, achieving 9.8 PF (peak) and 5.06 PF (sustained) across 9636 nodes, making the 3PCF easily computable for all galaxies in the observable universe.
Thorsten Kurth, William Arndt, Taylor Barnes, Brandon Cook, Jack Deslippe, Doug Doerfler, Brian Friesen, Yun (Helen) He, Tuomas Koskela, Mathieu Lobet, Tareq Malas, Leonid Oliker, Andrey Ovsyannikov, Samuel Williams, Woo-Sun Yang, Zhengji Zhao, "Analyzing Performance of Selected NESAP Applications on the Cori HPC System", High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, Volume 10524, June 22, 2017,
Yun (Helen) He, Brandon Cook, Jack Deslippe, Brian Friesen, Richard Gerber, Rebecca Hartman-Baker, Alice Koniges, Thorsten Kurth, Stephen Leak, WooSun Yang, Zhengji Zhao, Eddie Baron, Peter Hauschildt, "Preparing NERSC users for Cori, a Cray XC40 system with Intel Many Integrated Cores", Cray User Group 2017, Redmond, WA. Best Paper First Runner-Up., May 12, 2017,
- Download File: pap161s2-file1.pdf (pdf: 2.8 MB)
Jialin Liu, Quincey Koziol, Houjun Tang, François Tessier, Wahid Bhimji, Brandon Cook, Brian Austin, Suren Byna, Bhupender Thakur, Glenn K. Lockwood, Jack Deslippe, Prabhat, "Understanding the IO Performance Gap Between Cori KNL and Haswell", Proceedings of the 2017 Cray User Group, Redmond, WA, May 10, 2017,
The Cori system at NERSC has two compute partitions with different CPU architectures: a 2,004 node Haswell partition and a 9,688 node KNL partition, which ranked as the 5th most powerful and fastest supercomputer on the November 2016 Top 500 list. The compute partitions share a common storage configuration, and understanding the IO performance gap between them is important, impacting not only NERSC/LBNL users and other national labs, but also the relevant hardware vendors and software developers. In this paper, we have analyzed performance of single core and single node IO comprehensively on the Haswell and KNL partitions, and have discovered the major bottlenecks, which include CPU frequencies and memory copy performance. We have also extended our performance tests to multi-node IO and revealed the IO cost difference caused by network latency, buffer size, and communication cost. Overall, we have developed a strong understanding of the IO gap between Haswell and KNL nodes, and the lessons learned from this exploration will guide us in designing optimal IO solutions in the many-core era.
Koskela TS, Deslippe J, Friesen B, Raman K, "Fusion PIC code performance analysis on the Cori KNL system", May 2017,
We study the attainable performance of Particle-In-Cell codes on the Cori KNL system by analyzing a miniature particle push application based on the fusion PIC code XGC1. We start from the most basic building blocks of a PIC code and build up the complexity to identify the kernels that cost the most in performance and focus optimization efforts there. Particle push kernels operate at high AI and are not likely to be memory bandwidth or even cache bandwidth bound on KNL. Therefore, we see only minor benefits from the high bandwidth memory available on KNL, and achieving good vectorization is shown to be the most beneficial optimization path with theoretical yield of up to 8x speedup on KNL. In practice we are able to obtain up to a 4x gain from vectorization due to limitations set by the data layout and memory latency.
T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL-1006681, doi: 10.1109/PMBS.2016.010
Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24
Tareq Malas, Thorsten Kurth, Jack Deslippe, "Optimization of the sparse matrix-vector products of an IDR Krylov iterative solver in EMGeo for the Intel KNL manycore processor", Springer Lecture Notes in Computer Science, October 6, 2016,
Jack Deslippe, Felipe H. da Jornada, Derek Vigil-Fowler, Taylor Barnes, Nathan Wichmann, Karthik Raman, Ruchira Sasanka, Steven G. Louie, "Optimizing Excited-State Electronic-Structure Codes for Intel Knights Landing: A Case Study on the BerkeleyGW Software.", Springer Lecture Notes in Computer Science (ISC 2016), Springer International Publishing, October 6, 2016, 402,
Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, "OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms", Lecture Notes in Computer Science, Springer, 2016, 9903:17-31, doi: 10.1007/978-3-319-45550-1_2
Dilworth Y. Parkinson, Keith Beattie, Xian Chen, Joaquin Correa, Eli Dart, Benedikt J. Daurer, Jack R. Deslippe, Alexander Hexemer, Harinarayan Krishnan, Alastair A. MacDowell, Filipe R. N. C. Maia, Stefano Marchesini, Howard A. Padmore, Simon J. Patton, Talita Perciano, James A. Sethian, David Shapiro, Rune Stromsness, Nobumichi Tamura, Brian L. Tierney, Craig E. Tull, Daniela Ushizima, "Real-time data-intensive computing.", AIP Conference Proceedings, July 2016, 1741,
Jack Deslippe, Abdelilah Essiari, Simon J. Patton, Taghrid Samak, Craig E. Tull, Alexander Hexemer, Dinesh Kumar, Dilworth Parkinson, Polite Stewart, "Workflow management for real-time analysis of lightsource experiments", Proceedings of the 9th Workshop on Workflows in Support of Large-Scale Science (SC14), November 16, 2014, 31-40,
Justin Blair, Richard S. Canon, Jack Deslippe, Abdelilah Essiari, Alexander Hexemer, Alastair A. MacDowell, Dilworth Y. Parkinson, Simon J. Patton, Lavanya Ramakrishnan, Nobumichi Tamura, Brian L. Tierney, Craig E. Tull, "High performance data management and analysis for tomography", Proc. SPIE 9212, Developments in X-Ray Tomography IX, September 12, 2014,
Jack Deslippe, Zhengji Zhao, "Comparing Compiler and Library Performance in Material Science Applications on Edison", Paper. Proceedings of the Cray User Group 2013, 2013,
Megan Bowling, Zhengji Zhao and Jack Deslippe, "The Effects of Compiler Optimizations on Materials Science and Chemistry Applications at NERSC", Paper presented at the Cray User Group meeting, April 29-May 3, 2012, Stuttgart, Germany, May 3, 2012,
Jack Deslippe, Steven G Louie, "Excitons and many-electron effects in the optical response of carbon nanotubes and other one-dimensional nanostructures", Proceedings of SPIE, the International Society for Optical Engineering, 2008, 68920U--1,
Jack Deslippe, Doug Doerfler, Brandon Cook, Tareq Malas, Samuel Williams, Sudip Dosanjh, "Optimizing Science Applications for the Cori, Knights Landing, System at NERSC", Advances in Parallel Computing, Volume 30: New Frontiers in High Performance Computing and Big Data, (January 1, 2017)
Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)
Jack Deslippe, S.G. Louie, "Ab initio Theories of the Structural, Electronic, and Optical Properties of Semiconductors: Bulk Systems to Nanostructures.", Comprehensive Semiconductor Science and Technology., (Elsevier: 2011) Pages: 42-76
Brandon Cook, Jack Deslippe, Jonathan Madsen, Kevin Gott, Muaaz Awan, Enabling 800 Projects for GPU-Accelerated Science on Perlmutter at NERSC, GTC 2020, 2020,
The National Energy Research Scientific Computing Center (NERSC) is the mission HPC center for the U.S. Department of Energy Office of Science and supports the needs of 800+ projects and 7,000+ scientists with advanced HPC and data capabilities. NERSC’s newest system, Perlmutter, is an upcoming Cray system with heterogeneous nodes including AMD CPUs and NVIDIA Volta-Next GPUs. It will be the first NERSC flagship system with GPUs. Preparing our diverse user base for the new system is a critical part of making the system successful in enabling science at scale. The NERSC Exascale Science Application Program is responsible for preparing the simulation, data, and machine learning workloads to take advantage of the new architecture. We'll outline our strategy to enable our users to take advantage of the new architecture in a performance-portable way and discuss early outcomes. We'll highlight our use of tools and performance models to evaluate application readiness for Perlmutter and how we effectively frame the conversation about GPU optimization with our wide user base. In addition, we'll highlight a number of activities we are undertaking in order to make Perlmutter a more productive system when it arrives through compiler, library, and tool development. We'll also cover outcomes from a series of case studies that demonstrate our strategy to enable users to take advantage of the new architecture. We'll discuss the programming model used to port codes to GPUs, the strategy used to optimize code bottlenecks, and the GPU vs. CPU speedup achieved so far. The codes will include Tomopy (tomographic reconstruction), Exabiome (genomics de novo assembly), and AMReX (Adaptive Mesh Refinement software framework).
Thorsten Kurth, Joshua Romero, Everett Phillips, Massimiliano Fatica, Brandon Cook, Rahul Gayatri, Zhengji Zhao, and Jack Deslippe, Porting Quantum ESPRESSO Hybrid Functional DFT to GPUs Using CUDA Fortran, Cray User Group Meeting, Montreal, Canada, May 5, 2019,
Kevin Gott, Charles Lena, Ariel Biller, Josh Neitzel, Kai-Hsin Liou, Jack Deslippe, James R Chelikowsky, Scaling and optimization results of the real-space DFT solver PARSEC on Haswell and KNL systems, Intel Xeon Phi Users Group (IXPUG), 2017, 2018,
Richard A. Gerber, Jack Deslippe, Manycore for the Masses Part 2, Intel HPC DevCon, November 11, 2017,
- Download File: Many-Cores-For-The-Masses-Part-2-v2.pdf (pdf: 12 MB)
Yun (Helen) He, Jack Deslippe, Enabling Applications for Cori KNL: NESAP, September 21, 2017,
- Download File: 02-EnablingUsers-NESAP-NUG2017.pdf (pdf: 2.7 MB)
R Gerber, J Deslippe, D Doerfler, Many Cores for the Masses: Lessons Learned from Application Readiness Efforts at NERSC for the Knights Landing based Cori System, Intel HPC Developers Conference, November 12, 2016,
- Download File: NESAP-HPC-DevCon.pdf (pdf: 9.2 MB)
Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi, OpenMP Parallelization and Optimization of Graph-based Machine Learning Algorithms, IWOMP 2016, October 6, 2016,
- Download File: OpenMP-Parallelization-and-Optimization-of-Graph-based-Machine-Learning-Algorithms.pdf (pdf: 10 MB)
Jack Deslippe, NERSC, Preparing Applications for Future NERSC Architectures, February 6, 2014,
- Download File: Application-Readiness-NUG2014.pdf (pdf: 8.6 MB)
Jack Deslippe, Building Applications on Edison, October 10, 2013,
- Download File: Building-Applications-on-Edison.pdf (pdf: 3.1 MB)