Improving AMReX Load Balancing Algorithms

Science/CS domains

Load balancing, algorithms, high performance computing (HPC), distributed computing, network graphs

Project description

NERSC is seeking enthusiastic summer interns to investigate ways to improve the overall performance of large-scale AMReX simulations by advancing load-balancing (LB) algorithms. Load balancing is critical for large-scale, massively parallel simulations. Current LB algorithms are generally simplistic because they must run at runtime and can use only the reduced data that users choose to collect and pass to them. However, as simulations become more complex and Moore’s Law draws to a close, achieving the best possible LB is an increasing priority for the large-scale performance of an HPC code and its future research possibilities.

This investigation seeks to specifically improve the LB capabilities of AMReX, the block-structured, GPU-enabled, mesh-and-particle framework used by a wide variety of large-scale simulation codes and frameworks.

LB improvements could have far-reaching, long-term impacts, given AMReX’s international range of users, scientific fields, and portability across machines.

3D core-collapse simulation using Chimera, based on AMReX

Chimera, a code used to create this model of a supernova core collapse, is just one of many scientific programs that rely on AMReX. - Credit: Oak Ridge National Laboratory (ORNL)

Project tasks

In this project, the selected summer interns will create computational tools to perform statistical analyses of LB data, investigate how to present the analysis for publication, and, time permitting, test the algorithms on AMReX applications. Where possible, interns will also be able to explore other potential improvements identified during the investigation.
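As a rough illustration of the kind of statistical tooling described above, the following minimal Python sketch summarizes how evenly work is spread across ranks. The function name `imbalance_summary`, the sample loads, and the choice of metrics (mean-over-max efficiency and coefficient of variation) are assumptions made for this example and are not taken from AMReX or from the project itself.

```python
from statistics import mean, pstdev


def imbalance_summary(rank_loads):
    """Summarize how evenly work is spread across ranks (hypothetical helper)."""
    if not rank_loads or max(rank_loads) <= 0:
        raise ValueError("rank_loads must contain at least one positive load")
    avg = mean(rank_loads)
    peak = max(rank_loads)
    return {
        "mean": avg,
        "max": peak,
        # Fraction of the slowest rank's time that the average rank is busy;
        # 1.0 means perfectly balanced.
        "efficiency": avg / peak,
        # Coefficient of variation: spread of the loads relative to the mean.
        "cv": pstdev(rank_loads) / avg,
    }


# Made-up per-rank loads (for example, seconds of work per step).
summary = imbalance_summary([10.0, 12.0, 8.0, 10.0])
```

Here the slowest rank carries 12 units against a mean of 10, so the efficiency is about 0.83; a perfectly balanced run would give 1.0.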

Specific areas of interest for this summer include, but are not limited to, the following:

  • Network graph strategies: Investigating network graph strategies to capture and include communication costs in load balancing calculations and/or map the simulation more efficiently to the available hardware.
  • Vector or multi-dimensional partition methods: Testing and improving algorithms that can track the cost of a discretization through more than a single “weight” value, so that work can be balanced at a finer granularity.
  • Code design for public contributions: Designing, implementing, and improving a modular, algorithm-focused repo for productive community development, testing, and sharing of load-balancing algorithms.
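To make the vector-weight idea in the list above more concrete, here is a minimal Python sketch of a greedy heuristic that balances items carrying more than one weight component. It is an illustration under stated assumptions, not AMReX's algorithm or API: the function name `greedy_vector_balance`, the two hypothetical components (cell cost, particle cost), and the sample data are all invented for this example.

```python
def greedy_vector_balance(weights, n_ranks):
    """Assign items with multi-component weights to ranks (illustrative only).

    weights: list of equal-length tuples, e.g. (cell_cost, particle_cost).
    Returns (assignment, loads), where assignment[i] is the rank of item i
    and loads[r][d] is the normalized load of component d on rank r.
    """
    n_dims = len(weights[0])
    # Normalize each component by its total so components are comparable.
    totals = [sum(w[d] for w in weights) or 1.0 for d in range(n_dims)]
    norm = [[w[d] / totals[d] for d in range(n_dims)] for w in weights]

    # Place the heaviest items first (largest single normalized component).
    order = sorted(range(len(weights)), key=lambda i: max(norm[i]), reverse=True)

    loads = [[0.0] * n_dims for _ in range(n_ranks)]
    assignment = [None] * len(weights)
    for i in order:
        # Choose the rank whose worst component stays smallest after adding i.
        best = min(
            range(n_ranks),
            key=lambda r: max(loads[r][d] + norm[i][d] for d in range(n_dims)),
        )
        for d in range(n_dims):
            loads[best][d] += norm[i][d]
        assignment[i] = best
    return assignment, loads


# Four hypothetical boxes: two dominated by cell work, two by particle work.
boxes = [(4.0, 0.0), (0.0, 4.0), (4.0, 0.0), (0.0, 4.0)]
assignment, loads = greedy_vector_balance(boxes, n_ranks=2)
```

With a single summed weight, every box in this example looks identical (total cost 4), so a scalar balancer could place both cell-heavy boxes on the same rank. Tracking the components separately lets the heuristic give each rank one box of each kind. Communication costs, the subject of the network graph bullet, are not modeled here.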

Desired skills/background

  • Experience with C++ and Python
  • Experience with algorithm development
  • Experience with statistics/statistical analysis
  • Experience with parallel codes and parallelization (MPI and/or OpenMP)
  • Experience performing scientific research
  • Experience with literature surveying and algorithm design

Apply to join this project

To apply or to ask a question about this project:

Email Kevin Gott.

Project mentors

Kevin Gott

Computer Systems Engineer 3

National Energy Research Scientific Computing Center (NERSC)

Science Engagement & Workflows Dept.

User Engagement Group

Rebecca Hartman-Baker

User Engagement Group Lead

National Energy Research Scientific Computing Center (NERSC)

Science Engagement & Workflows Dept.

User Engagement Group
