Rahulkumar Gayatri is an HPC Programming Model Architect working in the Programming Environments and Models (PEM) group at NERSC. He is one of the core developers of the Kokkos programming framework. He is also involved with the EXAALT and the HIP-LZ projects.
In his time as a NERSC staff member, Rahul optimized the SNAP potential in the LAMMPS MD package for next-generation architectures. As a core developer of the Kokkos framework, he is a primary developer of its OpenMPTarget backend. Rahul is also working on the HIP-LZ project, which aims to map constructs from the HIP programming framework onto Intel GPUs.
Rahulkumar Gayatri received his PhD in parallel programming models from the Barcelona Supercomputing Center in March 2015 under the supervision of Dr. Rosa Maria Badia and Eduard Ayguade. His thesis work was on the synchronization of multiple threads on a multi-core processor.
He later worked in the HPC group at Wipro Infotech, where he provided parallel programming solutions to clients. As part of this group, he worked with the MOOSE project, which simulates the behavior of a cell in a human brain. He parallelized the linear solvers that simulate the effect of chemical and electrical stimuli given to a cell.
Research interests: High Performance Computing, GPU computing, and parallel programming frameworks.
- Christian R Trott et al., "Kokkos 3: Programming model extensions for the exascale era" IEEE Transactions on Parallel and Distributed Systems.
- Amanda S Dufek, Rahulkumar Gayatri, Neil Mehta, Douglas Doerfler, Brandon Cook, Yasaman Ghadar, Carleton DeTar "Case Study of Using Kokkos and SYCL as Performance-Portable Frameworks for Milc-Dslash Benchmark on NVIDIA, AMD and Intel GPUs" 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC)
- Rahulkumar Gayatri, Stan Moore, Evan Weinberg, Nicholas Lubbers, Sarah Anderson, Jack Deslippe, Danny Perez, Aidan P Thompson "Rapid exploration of optimization strategies on advanced architectures using TestSNAP and LAMMPS" arXiv preprint arXiv:2011.12875.
- Kien Nguyen-Cong, Jonathan T Willman, Stan G Moore, Anatoly B Belonoshko, Rahulkumar Gayatri, Evan Weinberg, Mitchell A Wood, Aidan P Thompson, Ivan I Oleynik "Billion atom molecular dynamics simulations of carbon at extreme conditions and experimental time and length scales" Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2021)
- Rahulkumar Gayatri, Kevin Gott, Jack Deslippe. Comparing Managed Memory and ATS with and without Prefetching on NVIDIA Volta GPUs. PMBS Workshop, SC 2019.
- Neil A Mehta, Rahulkumar Gayatri, Yasaman Ghadar, Christopher Knight, Jack Deslippe "Evaluating performance portability of OpenMP for SNAP on NVIDIA, Intel, and AMD GPUs using the roofline methodology" International Workshop on Accelerator Programming Using Directives, 2021.
- Verónica G Vergara Larrea, Reuben D Budiardja, Rahulkumar Gayatri, Christopher Daley, Oscar Hernandez, Wayne Joubert. Experiences in porting mini‐applications to OpenACC and OpenMP on heterogeneous systems. Concurrency and Computation: Practice and Experience (Journal)
- Rahulkumar Gayatri, Charlene Yang, Thorsten Kurth, Jack Deslippe: A case study for performance portability using OpenMP 4.5. WACCPD workshop, 2018.
- Charlene Yang, Rahulkumar Gayatri, Thorsten Kurth et al: An empirical roofline methodology for quantitatively assessing performance portability. P3HPC Workshop, SC 2018.
- Tuomas Koskela, Zakhar Matveev, Rahulkumar Gayatri et al: A novel multi-level integrated roofline model approach for performance characterization. ISC High Performance 2018
- Rahulkumar Gayatri, Rosa M. Badia, Eduard Ayguade, Mikel Lujan, Ian Watson: Transactional access to shared memory in StarSs, a task based programming model. Euro-Par 2012: 514-525
- Rahulkumar Gayatri, Rosa M. Badia, Eduard Ayguade: Loop level speculation in a task based programming model. HiPC 2013: 39-48
- Rahulkumar Gayatri, Rosa M. Badia, Eduard Ayguade: Analysis of the overheads incurred due to speculation in a task based programming model. 2015.
- TERAFLUX: Harnessing dataflow in next generation teradevices. Embedded Hardware Design: 976-990 (2014)
- Parallelizing Breadth First Search Using CELL Broadband Engine. HiPC (2008)