# Introduction to Scientific I/O

## Scientific I/O in HDF5

We will now show how HDF5 can be used to store the following three types of datasets commonly found in scientific computing:

• rectilinear 3D grids with balanced, cubic partitioning;
• rectilinear 3D grids with unbalanced, rectangular partitioning; and
• contiguous but size-varying arrays, such as those found in adaptive mesh refinement.

### Balanced 3D Grids

Assume we have a 3D grid of known dimensions grid_size[3] and a partitioning partition[3] that divides the grid into equally sized blocks distributed across MPI tasks. We can select a task's block in an HDF5 file using the following code:

```c
/* map the 1D MPI rank to a block in the 3D grid */
grid_index[0] = mpi_rank % partition[0];
grid_index[1] = (mpi_rank / partition[0]) % partition[1];
grid_index[2] = mpi_rank / (partition[0] * partition[1]);

/* calculate the size of the block (same for all tasks) */
block_size[0] = grid_size[0] / partition[0];
block_size[1] = grid_size[1] / partition[1];
block_size[2] = grid_size[2] / partition[2];

/* the file will hold the entire grid */
filespace = H5Screate_simple(3, grid_size, NULL);

/* while the hyperslab specifies only the block */
offset[0] = grid_index[0] * block_size[0];
offset[1] = grid_index[1] * block_size[1];
offset[2] = grid_index[2] * block_size[2];
H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, block_size, NULL);
```

### Unbalanced 3D Grids

Now suppose that not all tasks share the same block size. If we are only given the dimension of the block on each task as block_size[3], we will need to gather the block sizes from all tasks to be able to calculate the offsets. The offsets for a given block are simply the sum of block sizes that precede the block in each dimension. These can be calculated with the following code:

```c
/* gather the block sizes of all tasks */
MPI_Allgather(block_size, 3, MPI_INT, all_block_sizes, 3, MPI_INT,
              MPI_COMM_WORLD);

/* sum the sizes of the blocks that precede this one in each dimension */
offset[0] = offset[1] = offset[2] = 0;
for (i = 0; i < grid_index[0]; i++)
    offset[0] += all_block_sizes[3*get_rank_of_block(i, grid_index[1], grid_index[2]) + 0];
for (i = 0; i < grid_index[1]; i++)
    offset[1] += all_block_sizes[3*get_rank_of_block(grid_index[0], i, grid_index[2]) + 1];
for (i = 0; i < grid_index[2]; i++)
    offset[2] += all_block_sizes[3*get_rank_of_block(grid_index[0], grid_index[1], i) + 2];
```

Note that we can calculate the MPI rank of a given grid index using the formula:

`get_rank_of_block(i,j,k) = i + j * partition[0] + k * partition[0] * partition[1]`

### 1D AMR Arrays

Storing a flattened 1D AMR array is similar to the unbalanced 3D grid case. If each MPI task holds nelems elements in its local array, MPI_Scan can be used to find the task's offset into the array on disk as follows:

```c
/* the memory space holds this task's elements */
memspace = H5Screate_simple(1, &nelems, NULL);

/* the file space holds the elements of all tasks */
MPI_Allreduce(&nelems, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
filespace = H5Screate_simple(1, &sum, NULL);

/* MPI_Scan returns an inclusive prefix sum, so subtract this
   task's own count to get its starting offset */
MPI_Scan(&nelems, &offset, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
offset -= nelems;
H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset, NULL, &nelems, NULL);
```