
Doug Doerfler

HPC Architecture and Performance Engineer
Advanced Technologies Group
National Energy Research Scientific Computing Center
Lawrence Berkeley National Laboratory
1 Cyclotron Road, Mailstop: 59-4010A
Berkeley, CA 94720 US

Biographical Sketch

Doug is a member of the Advanced Technologies Group (ATG) and specializes in high-performance computing architectures, benchmarking, and performance analysis. Prior to joining Lawrence Berkeley National Laboratory, he was at Sandia National Laboratories in Albuquerque, New Mexico. He earned his bachelor's and master's degrees in electrical engineering from Kansas State University.


Journal Articles

Douglas Doerfler, Brian Austin, Brandon Cook, Jack Deslippe, Krishna Kandalla, Peter Mendygral, "Evaluating the Networking Characteristics of the Cray XC-40 Intel Knights Landing Based Cori Supercomputer at NERSC", Concurrency and Computation: Practice and Experience, Volume 30, Issue 1, September 12, 2017,

Richard F. Barrett, Paul Crozier, Douglas W. Doerfler, Michael A. Heroux, Paul Lin, Heidi K. Thornquist, Timothy G. Trucano, Courtenay T. Vaughan, "Assessing the Role of Mini-Applications in Predicting Key Performance Characteristics of Scientific and Engineering Applications", Journal of Parallel and Distributed Computing, Volume 75, Pages 107-122, January 2015,

S.S. Dosanjh, R.F. Barrett, D.W. Doerfler, S.D. Hammond, K.S. Hemmert, M.A. Heroux, P.T. Lin, K.T. Pedretti, A.F. Rodrigues, T.G. Trucano, J.P. Luitjens, "Exascale Design Space Exploration and Co-Design", Future Generation Computer Systems, Volume 30, Pages 46-58, January 2014,

Mahesh Rajan, Courtenay T. Vaughan, Doug W. Doerfler, Richard F. Barrett, Kevin T. Pedretti, Karl S. Hemmert, "Application-driven Analysis of Two Generations of Capability Computing Platforms: Purple and Cielo", Concurrency and Computation: Practice and Experience, Volume 24, Issue 18, March 2012,

Mahesh Rajan, Douglas Doerfler, Courtenay T. Vaughan, Marcus Epperson, Jeff Ogden, "Application Performance on the Tri-Lab Linux Capacity Cluster - TLCC", International Journal of Distributed Systems and Technologies, Volume 1, Issue 2, April 2010,

Conference Papers

C. Yang, R. Gayatri, T. Kurth, P. Basu, Z. Ronaghi, A. Adetokunbo, B. Friesen, B. Cook, D. Doerfler, L. Oliker, J. Deslippe, and S. Williams, "An Empirical Roofline Methodology for Quantitatively Assessing Performance Portability", IEEE International Workshop on Performance, Portability and Productivity in HPC (P3HPC'18), November 2018,

B. Austin, C. Daley, D. Doerfler, J. Deslippe, B. Cook, B. Friesen, T. Kurth, C. Yang, and N. Wright, "A Metric for Evaluating Supercomputer Performance in the Era of Extreme Heterogeneity", 9th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS'18), November 2018,

Tyler Allen, Christopher S. Daley, Douglas Doerfler, Brian Austin, Nicholas J. Wright, "Performance and Energy Usage of Workloads on KNL and Haswell Architectures", High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation. PMBS 2017. Lecture Notes in Computer Science, Volume 10724, December 23, 2017,

Thorsten Kurth, William Arndt, Taylor Barnes, Brandon Cook, Jack Deslippe, Doug Doerfler, Brian Friesen, Yun (Helen) He, Tuomas Koskela, Mathieu Lobet, Tareq Malas, Leonid Oliker, Andrey Ovsyannikov, Samuel Williams, Woo-Sun Yang, Zhengji Zhao, "Analyzing Performance of Selected NESAP Applications on the Cori HPC System", High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, Volume 10524, June 22, 2017,

T. Barnes, B. Cook, J. Deslippe, D. Doerfler, B. Friesen, Y.H. He, T. Kurth, T. Koskela, M. Lobet, T. Malas, L. Oliker, A. Ovsyannikov, A. Sarje, J.-L. Vay, H. Vincenti, S. Williams, P. Carrier, N. Wichmann, M. Wagner, P. Kent, C. Kerr, J. Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", PMBS 2016: 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. Supercomputing Conference, Salt Lake City, UT, USA, IEEE, November 13, 2016, LBNL-1006681, doi: 10.1109/PMBS.2016.010

Carleton DeTar, Douglas Doerfler, Steven Gottlieb, Ashish Jha, Dhiraj Kalamkar, Ruizi Li, Doug Toussaint, "MILC staggered conjugate gradient performance on Intel KNL", 34th International Symposium on Lattice Field Theory (Lattice 2016), Southampton, UK, November 3, 2016,

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, Volume 9945, October 6, 2016, doi: 10.1007/978-3-319-46079-6_24

Mahesh Rajan, Doug Doerfler, Mike Tupek, Si Hammond, "An Investigation of Compiler Vectorization on Current and Next-generation Intel Processors using Benchmarks and Sandia’s SIERRA Applications", Cray User Group (CUG) 2015, April 2015,

M. J. Cordery, B. Austin, H. J. Wasserman, C. S. Daley, N. J. Wright, S. D. Hammond, D. Doerfler, "Analysis of Cray XC30 Performance using Trinity-NERSC-8 benchmarks and comparison with Cray XE6 and IBM BG/Q", High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation (PMBS 2013). Lecture Notes in Computer Science, Volume 8551, October 1, 2014,

Mahesh Rajan, Douglas W. Doerfler, Richard Frederick Barrett, Joel O. Stevenson, Anthony Michael Agelastos, Ryan Phillip Shaw, Harold Edward Meyer, "Experiences with Sandia National Laboratories HPC applications and MPI Performance", MVAPICH Users Group Meeting, August 2014,

Richard F. Barrett, Simon D. Hammond, Courtenay T. Vaughan, Doug W. Doerfler, Michael A. Heroux, Justin P. Luitjens, Duncan Roweth, "Navigating An Evolutionary Fast Path to Exascale", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12), November 2012,

Mahesh Rajan, Douglas W. Doerfler, Paul T. Lin, Simon D. Hammond, Richard F. Barrett, Courtenay T. Vaughan, "Unprecedented Scalability and Performance of the New NNSA Tri-Lab Linux Capacity Cluster 2", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12), November 2012,

Kevin Pedretti, Ron Brightwell, Doug Doerfler, K. Scott Hemmert, James H. Laros, III, "The Impact of Injection Bandwidth Performance on Application Scalability", EuroMPI 2011. Lecture Notes in Computer Science, Volume 6960, September 2011,

D. Doerfler, S. Dosanjh, J. Morrison, M. Vigil, "Production Petascale Computing", Cray Users Group Meeting, Fairbanks, Alaska, 2011,

Courtenay T. Vaughan, Mahesh Rajan, Douglas W. Doerfler, Richard F. Barrett, Kevin Pedretti, "Investigating the Impact of the Cielo Cray XE6 Architecture on Scientific Application Codes", IPDPS 2011 International Workshop on Large-Scale Parallel Processing (LSPP'11), May 2011,

Douglas Doerfler, Mahesh Rajan, Cindy Nuss, Cornell Wright, Tom Spelce, "Application-Driven Acceptance of Cielo, an XE6 Petascale Capability Platform", Cray User Group (CUG) 2011, May 2011,

Mahesh Rajan, Douglas Doerfler, "HPC application performance and scaling: understanding trends and future challenges with application benchmarks on past, present and future Tri-Lab computing systems", 8th International Conference of Numerical Analysis and Applied Mathematics, September 17, 2010,

J. Ang, D. Doerfler, S. Dosanjh, K. Koch, J. Morrison, M. Vigil, "The Alliance for Computing at the Extreme Scale", Proceedings of the Cray Users Group Meeting, Edinburgh, Scotland, May 24, 2010,

Courtenay Vaughan, Douglas Doerfler, "Analyzing Multicore Characteristics for a Suite of Applications on an XT5 System", Cray User Group (CUG) 2010, May 2010,

Mahesh Rajan, Douglas W Doerfler, Courtenay T Vaughan, "Red Storm/Cray XT4: A Superior Architecture for Scalability", Cray User Group (CUG) 2009, May 2009,

Brian J. Martin, Andrew J. Leiker, James H. Laros, III, Douglas W. Doerfler, "Performance Analysis of the SiCortex SC092", The 10th LCI International Conference on High-Performance Clustered Computing, March 2009,

Mahesh Rajan, Courtenay T Vaughan, Robert W Leland, Douglas W Doerfler, Robert E Benner, Jr., "Investigating the balance between capacity and capability workloads across large scale computing platforms", 9th LCI International Conference on High-Performance Computing, April 2008,

Douglas Doerfler, David Hensinger, Brent Leback, Douglas Miles, "Tuning C++ Applications for the Latest Generation x64 Processors with PGI Compilers and Tools", Cray User Group (CUG) 2007, May 2007,

Ron Brightwell, Douglas Doerfler, "Measuring MPI Send and Receive Overhead and Application Availability in High Performance Network Interfaces", EuroPVM/MPI 2006. Lecture Notes in Computer Science. Volume 4192, September 2006,

Ron Brightwell, Douglas Doerfler, Keith D Underwood, "A Preliminary Analysis of the InfiniPath and XD1 Network Interfaces", Proceedings 20th IEEE International Parallel & Distributed Processing Symposium: Workshop on Communication Architecture for Clusters, April 2006,

Douglas W. Doerfler, Courtenay T. Vaughan, "Characterizing Compiler Performance for the AMD Opteron Processor on a Parallel Platform", Cray User Group (CUG) 2005, May 2005,

Ron B. Brightwell, Douglas W. Doerfler, Keith D. Underwood, "A Comparison of 4X Infiniband and Quadrics Elan-4 Technologies", 2004 IEEE International Conference on Cluster Computing (Cluster 2004), September 2004,

Book Chapters

Jack Deslippe, Doug Doerfler, Brandon Cook, Tareq Malas, Samuel Williams, Sudip Dosanjh, "Optimizing Science Applications for the Cori, Knights Landing, System at NERSC", Advances in Parallel Computing, Volume 30: New Frontiers in High Performance Computing and Big Data, (January 1, 2017)

Presentations


Douglas Doerfler, Steven Gottlieb, Carleton DeTar, Doug Toussaint, Karthik Raman, Improving the Performance of the MILC Code on Intel Knights Landing, An Overview, Intel Xeon Phi User Group Meeting 2017 Fall Meeting, September 26, 2017,

R Gerber, J Deslippe, D Doerfler, Many Cores for the Masses: Lessons Learned from Application Readiness Efforts at NERSC for the Knights Landing based Cori System, Intel HPC Developers Conference, November 12, 2016,

R Li, C DeTar, D Doerfler, S Gottlieb, A Jha, D Kalamkar, D Toussaint, Porting the MIMD Lattice Computation (MILC) Code to the Intel Xeon Phi Knights Landing Processor, ISC High Performance 2016 International Workshops: Application Performance on Intel Xeon Phi – Being Prepared for KNL & Beyond, June 23, 2016,

D Doerfler, Understanding Application Data Movement Characteristics using Intel’s VTune Amplifier and Software Development Emulator Tools, Intel Xeon Phi Users Group (IXPUG) 2015, Annual Meeting, September 30, 2015,

Douglas Doerfler, First Experiences with 64-bit ARM Moonshot, HP-CAST 23, November 2014,

Douglas Doerfler, Dr. Tom Bradicich, An Evaluation of 64-bit ARM for use in High-Performance Modeling and Simulation Architecture, ARM TechCon 2014, October 2014,

Douglas Doerfler, The Role of Advanced Technology Systems in the ASC Platform Strategy, Salishan Conference on High-Speed Computing, April 2014,

Douglas Doerfler, Trinity: Next-Generation Supercomputer for the ASC Program, HPC User Forum, April 1, 2014,

Douglas W. Doerfler, Analyzing the Application Performance Impact of Using High-Speed Inter-Socket Communication Networks, Workshop on The Influence of I/O on Microprocessor Architecture (IOM-2009), February 2009,

Reports


Robert Leland, Mahesh Rajan, Michael A. Heroux, Douglas W. Doerfler, "Performance, Efficiency, and Effectiveness of Supercomputers", Sandia National Laboratories, Sandia Report SAND2016-3730, September 2016,

Posters


Richard Barrett, Paul Crozier, Doug Doerfler, Simon Hammond, Mike Heroux, Paul Lin, Tim Trucano, Courtenay Vaughan, Alan Williams, "Assessing the Predictive Capabilities of Mini-applications", The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), November 2012,

Richard F Barrett, Courtenay T Vaughan, Mahesh Rajan, Douglas W Doerfler, "From Red Storm to Cielo: Performance Analysis of ASC Simulation Programs Across an Evolution of Multicore Architectures", The ACM/IEEE Conference on High Performance Networking and Computing (SC10), November 2010,

Awards


Michael A. Heroux, Richard Frederick Barrett, James Michael Willenbring, Daniel W Barnette, David Beckingsale, James F Belak, Mike Boulton, Paul Crozier, Douglas W. Doerfler, Harold C. Edwards, Wayne Gaudin, Timothy C Germann, Simon David Hammond, Andy Herdman, Stephen Jarvis, Paul Lin, Justin Luitjens, Andrew Mallinson, Simon McIntosh-Smith, Susan M Mniszewski, Jamaludin Mohd-Yusof, David F Richards, Christopher Sewell, Sriram Swaminarayan, Heidi K. Thornquist, Christian Robert Trott, Courtenay T. Vaughan, Alan B. Williams, R&D 100 Award, Mantevo Suite 1.0, R&D Magazine, August 2013,

Douglas W. Doerfler, ASC Salutes, National Nuclear Security Administration Advanced Simulation & Computing Program Office, June 2011,

Manuel Vigil, Douglas Doerfler, Sudip Dosanjh, John Morrison, 2010 Defense Programs Award of Excellence for Significant Contributions to the Stockpile Stewardship Program, Successful Deployment of Cielo Petascale Supercomputer, National Nuclear Security Administration, April 2011,

William J. Camp, Robert A. Ballance, Linda R. Bonnefoy-Lev, Ronald B. Brightwell, Douglas W. Doerfler, James L. Handrock, Karen L. Jefferson, Suzanne M. Kelly, James H. Laros III, Robert W. Leland, Michael J. Levenhagen, John J. Naegle, John P. Noe, Kevin T. Pedretti, Mahesh Rajan, Leonard Stands, Judy E. Sturtevant, James L. Tomkins, Keith D. Underwood, John P. Van Dyke, Courtenay T. Vaughan, H. Lee Ward, David R. White, John D. Zepper, Lockheed Martin Nova Award, Red Storm Supercomputer Design and Development Team, Lockheed Martin Corporation, October 2006,