
Sudip Dosanjh

Sudip Dosanjh, Ph.D.
NERSC Division Director
Phone: +1 (510) 495-2488*
Fax: +1 (510) 486-6459
1 Cyclotron Road
Mailstop: 59R4010A
Berkeley, CA 94720 USA

* Contact for appointments: Zaida McCunney | Phone: +1 (510) 486-6247 | Email: ZSMcCunney@lbl.gov

Biographical Sketch

Dr. Sudip Dosanjh is Director of the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. NERSC's mission is to accelerate scientific discovery at the U.S. Department of Energy's Office of Science through high performance computing and extreme data analysis. NERSC deploys leading-edge computational and data resources for more than 4,500 users from a broad range of disciplines. NERSC will be partnering with computer companies to develop and deploy pre-exascale and exascale systems during the next decade.

Previously, Dr. Dosanjh headed extreme-scale computing at Sandia National Laboratories. He was co-director of the Los Alamos/Sandia Alliance for Computing at the Extreme-Scale from 2008 to 2012. He also served for several years on the U.S. Department of Energy's Exascale Initiative Steering Committee.

Dr. Dosanjh had a key role in establishing co-design as a methodology for reaching exascale computing. He has numerous publications on exascale computing, co-design, computer architectures, massively parallel computing and computational science.

He earned his bachelor’s degree in engineering physics (1982) and his master’s degree (1984) and Ph.D. (1986) in mechanical engineering, all from the University of California, Berkeley.

Select Recent Publications

Journal Articles

C. S. Daley, D. Ghoshal, G. K. Lockwood, S. Dosanjh, L. Ramakrishnan, N. J. Wright, "Performance characterization of scientific workflows for the optimal use of Burst Buffers", Future Generation Computer Systems, December 28, 2017, doi: 10.1016/j.future.2017.12.022

Scientific discoveries are increasingly dependent upon the analysis of large volumes of data from observations and simulations of complex phenomena. Scientists compose these complex analyses as workflows and execute them on large-scale HPC systems. These workflow structures contrast with the monolithic single simulations that have often been the primary use case on HPC systems. Simultaneously, new storage paradigms such as Burst Buffers are becoming available on HPC platforms. In this paper, we analyze the performance characteristics of a Burst Buffer and two representative scientific workflows with the aim of optimizing the usage of a Burst Buffer, extending our previous analyses (Daley et al., 2016). Our key contributions are (a) developing a performance analysis methodology pertinent to Burst Buffers, (b) improving the use of a Burst Buffer in workflows with bandwidth-sensitive and metadata-sensitive I/O workloads, and (c) highlighting the key data management challenges when incorporating a Burst Buffer in the studied scientific workflows.
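As a purely illustrative aside (not code from the paper), the minimal Python sketch below shows one way to separate the bandwidth-sensitive and metadata-sensitive I/O behaviors mentioned above when comparing a Burst Buffer mount point against a parallel file system: stream one large file to estimate write bandwidth, then create and delete many tiny files to estimate metadata throughput. The target directory, transfer size, and file count are placeholder assumptions.

import os
import sys
import time
import tempfile

def bandwidth_test(target_dir, size_mb=256):
    """Write one large file and return the achieved write bandwidth in MB/s."""
    block = os.urandom(1024 * 1024)  # 1 MiB of data per write
    path = os.path.join(target_dir, "large.dat")
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    os.remove(path)
    return size_mb / elapsed

def metadata_test(target_dir, n_files=2000):
    """Create and remove many tiny files and return metadata operations per second."""
    start = time.perf_counter()
    for i in range(n_files):
        path = os.path.join(target_dir, f"tiny_{i}.dat")
        with open(path, "wb") as f:
            f.write(b"x")
        os.remove(path)
    elapsed = time.perf_counter() - start
    return (2 * n_files) / elapsed  # one create plus one remove per file

if __name__ == "__main__":
    # First argument: directory to probe (e.g. a Burst Buffer mount point);
    # defaults to a temporary directory so the sketch runs anywhere.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    print(f"target directory : {target}")
    print(f"write bandwidth  : {bandwidth_test(target):8.1f} MB/s")
    print(f"metadata rate    : {metadata_test(target):8.1f} ops/s")

Running the same probe against a Burst Buffer allocation and against the scratch file system gives a rough sense of which of the two workload classes the faster tier actually helps.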

S.S. Dosanjh, R.F. Barrett, D.W. Doerfler, S.D. Hammond, K.S. Hemmert, M.A. Heroux, P.T. Lin, K.T. Pedretti, A.F. Rodrigues, T.G. Trucano, J.P. Luitjens, "Exascale Design Space Exploration and Co-Design", Future Generation Computer Systems, Volume 30, Pages 46-58, January 2014,

J. Dongarra et al., "The International Exascale Software Project Roadmap", International Journal of High Performance Computing Applications, 25:1, 2011,

K. Alvin, B. Barrett, R. Brightwell, S. Dosanjh, A. Geist, S. Hemmert, M. Heroux, D. Kothe, R. Murphy, J. Nichols, R. Oldfield, A. Rodrigues, J. Vetter, "On the Path to Exascale", International Journal of Distributed Systems and Technologies, 1(2):1–22, May 22, 2010,

J. Tomkins, R. Brightwell, W. Camp, S. Dosanjh, S. Kelly, P. Lin, C. Vaughan, J. Levesque, V. Tipparaju, "The Red Storm Architecture and Early Experiences with Multi-Core Processors", International Journal of Distributed Systems and Technologies, Vol. 1, Issue 2, pp. 74-93, April 19, 2010, doi: 10.4018/jdst.2010040105

A. Geist, S. Dosanjh, "IESP Exascale Challenge: Co-Design of Architectures and Algorithms", International Journal of High Performance Computing Applications, Vol. 23, No. 4, pp. 401–402, September 18, 2009,

Conference Papers

C.S. Daley, D. Ghoshal, G.K. Lockwood, S. Dosanjh, L. Ramakrishnan, N.J. Wright, "Performance Characterization of Scientific Workflows for the Optimal Use of Burst Buffers", Workflows in Support of Large-Scale Science (WORKS-2016), CEUR-WS.org, 2016, 1800:69-73,

C.S. Daley, L. Ramakrishnan, S. Dosanjh, N.J. Wright, "Analyses of Scientific Workflows for Effective Use of Future Architectures", The 6th International Workshop on Big Data Analytics: Challenges and Opportunities (BDAC-15), 2015,

R. Barrett, S. Dosanjh, et al., "Towards Codesign in High Performance Computing Systems", IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, November 5, 2012,

D. Doerfler, S. Dosanjh, J. Morrison, M. Vigil, "Production Petascale Computing", Cray Users Group Meeting, Fairbanks, Alaska, 2011,

S. Hu, R. Murphy, S. Dosanjh, K. Olukotun, S. Poole, "Hardware/Software Co-Design for High Performance Computing", Proceedings of CODES+ISSS’10, October 24, 2010,

A. Rodrigues, S. Dosanjh, S. Hemmert, "Co-Design for High Performance Computing", Proceedings of the International Conference on Numerical Analysis and Applied Mathematics, Rhodes, Greece, September 18, 2010,

J. Ang, D. Doerfler, S. Dosanjh, K. Koch, J. Morrison, M. Vigil, "The Alliance for Computing at the Extreme Scale", Proceedings of the Cray Users Group Meeting, Edinburgh, Scotland, May 24, 2010,

Book Chapters

Jack Deslippe, Doug Doerfler, Brandon Cook, Tareq Malas, Samuel Williams, Sudip Dosanjh, "Optimizing Science Applications for the Cori, Knights Landing, System at NERSC", Advances in Parallel Computing, Volume 30: New Frontiers in High Performance Computing and Big Data, (January 1, 2017)

N.J. Wright, S.S. Dosanjh, A.K. Andrews, K. Antypas, B. Draney, R.S. Canon, S. Cholia, C.S. Daley, K.M. Fagnan, R.A. Gerber, L. Gerhardt, L. Pezzaglia, Prabhat, K.H. Schafer, J. Srinivasan, "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82, (June 2015) doi: 10.3233/978-1-61499-583-8-82

Extreme data science is becoming increasingly important at the U.S. Department of Energy's National Energy Research Scientific Computing Center (NERSC). Many petabytes of data are transferred from experimental facilities to NERSC each year. Applications of importance include high-energy physics, materials science, genomics, and climate modeling, with an increasing emphasis on large-scale simulations and data analysis. In response to the emerging data-intensive workloads of its users, NERSC made a number of critical design choices to enhance the usability of its pre-exascale supercomputer, Cori, which is scheduled to be delivered in 2016. These data enhancements include a data partition, a layer of NVRAM for accelerating I/O, user defined images and a customizable gateway for accelerating connections to remote experimental facilities.

Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen, David Skinner, Nicholas J. Wright, "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)

Richard A. Barrett, Shekhar Borkar, Sudip S. Dosanjh, Simon D. Hammond, Michael A. Heroux, X. Sharon Hu, Justin Luitjens, Steven G. Parker, John Shalf, Li Tang, "On the Role of Co-design in High Performance Computing", Transition of HPC Towards Exascale Computing, E.H. D'Hollander et al. (Eds.), IOS Press, 2013, (November 1, 2013) doi: 10.3233/978-1-61499-324-7-141

J. Ang, R. Brightwell, S. Dosanjh, et al., "Exascale Computing and the Role of Co-Design", (2011)

John Shalf, S. Dosanjh, John Morrison, "Exascale Computing Technology Challenges", VECPAR, (2010), Pages 1-25

High Performance Computing architectures are expected to change dramatically in the next decade as power and cooling constraints limit increases in microprocessor clock speeds. Consequently, computer companies are dramatically increasing on-chip parallelism to improve performance. The traditional doubling of clock speeds every 18-24 months is being replaced by a doubling of cores or other parallelism mechanisms. During the next decade the amount of parallelism on a single microprocessor will rival the number of nodes in early massively parallel supercomputers that were built in the 1980s. Applications and algorithms will need to change and adapt as node architectures evolve. In particular, they will need to manage locality to achieve performance. A key element of the strategy as we move forward is the co-design of applications, architectures and programming environments. There is an unprecedented opportunity for application and algorithm developers to influence the direction of future architectures so that they meet DOE mission needs. This article will describe the technology challenges on the road to exascale, their underlying causes, and their effect on the future of HPC system design.
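To make the locality point above concrete (this example is illustrative only, not from the article, and assumes NumPy is available), the short Python sketch below traverses the same array in contiguous row-major order and in strided column-major order; on most machines the strided traversal is noticeably slower, which is exactly the kind of data-movement cost applications must manage as on-node parallelism grows.

import time
import numpy as np

n = 4000
a = np.zeros((n, n))  # NumPy stores this in row-major (C) order

def sum_rows(x):
    """Row-major traversal: consecutive elements are contiguous in memory."""
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i, :].sum()
    return total

def sum_cols(x):
    """Column-major traversal: each slice strides across the whole array."""
    total = 0.0
    for j in range(x.shape[1]):
        total += x[:, j].sum()
    return total

for name, fn in [("row-major", sum_rows), ("column-major", sum_cols)]:
    start = time.perf_counter()
    fn(a)
    print(f"{name:12s} traversal: {time.perf_counter() - start:.3f} s")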

Reports

Rolf Riesen, Sudip Dosanjh, Larry Kaplan, "The ExaChallenge Symposium", IBM Research Paper, August 26, 2013,

R. Stevens, A. White, S. Dosanjh, et al., "Scientific Grand Challenges: Architectures and Technology for Extreme-Scale Computing Report", 2011,

R. Leland and S. Dosanjh, "Computing at Exascale: A Value Proposition", Sandia National Laboratories Report, November 16, 2010,

Others

Manuel Vigil, Douglas Doerfler, Sudip Dosanjh, John Morrison, 2010 Defense Programs Award of Excellence for Significant Contributions to the Stockpile Stewardship Program, Successful Deployment of Cielo Petascale Supercomputer, National Nuclear Security Administration, April 2011,