NERSC: Powering Scientific Discovery Since 1974

Wahid Bhimji

Data Architect
Data & Analytics Services
Phone: (510) 486-4710
Fax: (510) 486-6459
1 Cyclotron Road
Mail Stop: 59R4010A
Berkeley, CA 94720 US

Biographical Sketch

Wahid is a Big Data Architect in the Data and Analytics Services team at NERSC. His current interests include machine learning, databases, and data management. Wahid has worked for many years in scientific computing and data analysis in academia and government, and holds a Ph.D. in high-energy particle physics.


Journal Articles

T. Maier, D. Benjamin, W. Bhimji, Elmsheuser, P. van Gemmeren, D. Malon, N. Krumnack, "ATLAS I/O performance optimization in as-deployed environments", J. Phys. Conf. Ser., 2015, 664:042033, doi: 10.1088/1742-6596/664/4/042033

Michela Massimi, Wahid Bhimji, "Computer simulations and experiments: The case of the Higgs boson", Stud. Hist. Philos. Mod. Phys., 2015, 51:71-81, doi: 10.1016/j.shpsb.2015.06.003

Georges Aad et al. (ATLAS and CMS Collaborations), "Combined Measurement of the Higgs Boson Mass in pp collisions at sqrt(s) = 7 and 8 TeV with the ATLAS and CMS Experiments", Phys. Rev. Lett., 2015, 114:191803, doi: 10.1103/PhysRevLett.114.191803

Georges Aad et al. (ATLAS Collaboration), "Identification of Boosted, Hadronically Decaying W and Comparisons with ATLAS Data Taken at sqrt(s) = 8 TeV", Submitted to Eur. Phys. J. C, 2015

Mustafa Mustafa, Deborah Bard, Wahid Bhimji, Rami Al-Rfou, Zarija Lukić, "Creating Virtual Universes Using Generative Adversarial Networks", Submitted to Sci. Rep.

Conference Papers

Wahid Bhimji, Debbie Bard, Kaylan Burleigh, Chris Daley, Steve Farrell, Markus Fasel, Brian Friesen, Lisa Gerhardt, Jialin Liu, Peter Nugent, Dave Paul, Jeff Porter, Vakho Tsulaia, "Extreme I/O on HPC for HEP using the Burst Buffer at NERSC", Journal of Physics: Conference Series, December 1, 2017, 898:082015

Jialin Liu, Quincey Koziol, Houjun Tang, François Tessier, Wahid Bhimji, Brandon Cook, Brian Austin, Suren Byna, Bhupender Thakur, Glenn K. Lockwood, Jack Deslippe, Prabhat, "Understanding the IO Performance Gap Between Cori KNL and Haswell", Proceedings of the 2017 Cray User Group, Redmond, WA, May 10, 2017

The Cori system at NERSC has two compute partitions with different CPU architectures: a 2,004 node Haswell partition and a 9,688 node KNL partition, which ranked as the 5th most powerful supercomputer on the November 2016 Top 500 list. The compute partitions share a common storage configuration, and understanding the IO performance gap between them is important not only to NERSC/LBNL users and other national labs, but also to the relevant hardware vendors and software developers. In this paper, we have comprehensively analyzed the performance of single-core and single-node IO on the Haswell and KNL partitions, and have identified the major bottlenecks, which include CPU frequencies and memory copy performance. We have also extended our performance tests to multi-node IO and revealed the IO cost differences caused by network latency, buffer size, and communication cost. Overall, we have developed a strong understanding of the IO gap between Haswell and KNL nodes, and the lessons learned from this exploration will guide us in designing optimal IO solutions in the many-core era.

Evan Racah, Seyoon Ko, Peter Sadowski, Wahid Bhimji, Craig Tull, Sang-Yun Oh, Pierre Baldi, Prabhat, "Revealing Fundamental Physics from the Daya Bay Neutrino Experiment using Deep Neural Networks", ICMLA, 2016

Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016

W. Bhimji, D. Bard, M. Romanus, D. Paul, A. Ovsyannikov, B. Friesen, M. Bryson, J. Correa, G. K. Lockwood, V. Tsulaia, S. Byna, S. Farrell, D. Gursoy, C. Daley, V. Beckner, B. Van Straalen, D. Trebotich, C. Tull, G. Weber, N. J. Wright, K. Antypas, Prabhat, "Accelerating Science with the NERSC Burst Buffer Early User Program", Cray User Group, May 11, 2016, LBNL-1005736

NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprising over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use cases of the Burst Buffer at NERSC are therefore also numerous and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge, this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads, and these lessons will therefore be of considerable benefit in shaping the developing use of Burst Buffers at HPC centers.

Mostofa Patwary, Nadathur Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey, "PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures", IPDPS 2016, April 5, 2016


Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, Shreyas Cholia, Helen (Yun) He, Douglas Jacobsen, Prabhat, Nicholas J. Wright, "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 12, 2016

Yun (Helen) He, Wahid Bhimji, "Cori: User Update", NERSC User Group Meeting, March 24, 2016


Glenn K. Lockwood, Damian Hazen, Quincey Koziol, Shane Canon, Katie Antypas, Jan Balewski, Nicholas Balthaser, Wahid Bhimji, James Botts, Jeff Broughton, Tina L. Butler, Gregory F. Butler, Ravi Cheema, Christopher Daley, Tina Declerck, Lisa Gerhardt, Wayne E. Hurlbert, Kristy A. Kallback-Rose, Stephen Leak, Jason Lee, Rei Lee, Jialin Liu, Kirill Lozinskiy, David Paul, Prabhat, Cory Snavely, Jay Srinivasan, Tavia Stone Gibbins, Nicholas J. Wright, "Storage 2020: A Vision for the Future of HPC Storage", October 20, 2017, LBNL-2001072

As the DOE Office of Science's mission computing facility, NERSC will follow this roadmap and deploy these new storage technologies to continue delivering storage resources that meet the needs of its broad user community. NERSC's diversity of workflows also encompasses significant portions of open science workloads, and the findings presented in this report are intended to be a blueprint for how the evolving storage landscape can be best utilized by the greater HPC community. Executing the strategy presented here will ensure that emerging I/O technologies will be both applicable to and effective in enabling scientific discovery through extreme-scale simulation and data analysis in the coming decade.


Debbie Bard, Wahid Bhimji, David Paul, Glenn K. Lockwood, Nicholas J Wright, Katie Antypas, Prabhat, Steve Farrell, Andrey Ovsyannikov, Melissa Romanus, Brian Van Straalen, David Trebotich, Guenter Weber, "Experiences with the Burst Buffer at NERSC", Supercomputing Conference, November 16, 2016

Annette Greiner, Evan Racah, Shane Canon, Jialin Liu, Yunjie Liu, Debbie Bard, Lisa Gerhardt, Rollin Thomas, Shreyas Cholia, Jeff Porter, Wahid Bhimji, Quincey Koziol, Prabhat, "Data-Intensive Supercomputing for Science", Berkeley Institute for Data Science (BIDS) Data Science Faire, May 3, 2016

Review of current DAS activities for a non-NERSC audience.