Memory Usage Considerations on Edison
Edison compute nodes have 64 GB of physical memory (2.67 GB per core), but not all of that memory is available to user programs. Compute Node Linux (the kernel), the Lustre file system software, and message passing library buffers all consume memory, as does loading the executable into memory. The precise amount of memory available to an application therefore varies. Approximately 61 GB of memory can be allocated from within an MPI program using all 24 cores per node, i.e., about 2.5 GB per MPI task on average. If an application uses 12 MPI tasks per node, each MPI task can use about 5.0 GB of memory.
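The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 61 GB figure is the approximate value quoted above and varies from job to job, and the function name memory_per_task_gb is a hypothetical helper, not part of any NERSC tool.

```python
# Rough per-task memory estimate for one Edison compute node.
# ASSUMPTION: 61 GB usable per node is the approximate figure quoted in the
# text; the real value depends on the kernel, Lustre, and MPI buffer usage.
USABLE_GB_PER_NODE = 61.0
CORES_PER_NODE = 24


def memory_per_task_gb(tasks_per_node, usable_gb=USABLE_GB_PER_NODE):
    """Return the average memory (GB) available to each MPI task on a node."""
    if not 1 <= tasks_per_node <= CORES_PER_NODE:
        raise ValueError("tasks_per_node must be between 1 and 24")
    return usable_gb / tasks_per_node


if __name__ == "__main__":
    for tasks in (24, 12, 6):
        per_task = memory_per_task_gb(tasks)
        print(f"{tasks:2d} tasks per node: about {per_task:.1f} GB per task")
```

Halving the number of tasks per node doubles the average memory per task, which is why 12 tasks per node gives roughly 5 GB each.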
If you see the error message "OOM killer terminated this process." in your job output, your code has exhausted the memory available on the node. OOM stands for "out of memory". One simple remedy when your code runs into an OOM error is to use more nodes and fewer cores per node: launching fewer than 24 tasks per node increases the memory available to each MPI task. Note, though, that your account will be charged for all 24 cores on every node you use. Please refer to our website for instructions on running jobs on unpacked nodes.
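To see the trade-off between memory per task and charged cores, the following Python sketch estimates how many nodes an unpacked job would need. It assumes roughly 61 GB of usable memory per node and that every node is charged as 24 cores, as described above; plan_unpacked_job is a hypothetical helper written for this illustration only.

```python
import math

# ASSUMPTIONS: about 61 GB usable per node (approximate, varies by job)
# and 24 cores charged per node regardless of how many are used.
CORES_PER_NODE = 24
USABLE_GB_PER_NODE = 61.0


def plan_unpacked_job(total_tasks, gb_per_task):
    """Return (tasks_per_node, nodes, charged_cores) for a job whose MPI
    tasks each need about gb_per_task GB of memory."""
    if total_tasks < 1 or gb_per_task <= 0:
        raise ValueError("total_tasks and gb_per_task must be positive")
    # Largest number of tasks that fits in one node's usable memory.
    tasks_per_node = min(CORES_PER_NODE, int(USABLE_GB_PER_NODE // gb_per_task))
    if tasks_per_node < 1:
        raise ValueError("a single task does not fit in one node's memory")
    nodes = math.ceil(total_tasks / tasks_per_node)
    return tasks_per_node, nodes, nodes * CORES_PER_NODE


if __name__ == "__main__":
    tasks_per_node, nodes, charged = plan_unpacked_job(total_tasks=96, gb_per_task=4.0)
    print(f"{tasks_per_node} tasks per node on {nodes} nodes, charged for {charged} cores")
```

In this example, 96 tasks that would fit on 4 fully packed nodes instead need 7 nodes at 15 tasks per node, so the job is charged for 168 cores rather than 96.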
You can change MPI buffer sizes by setting certain MPICH environment variables. See the man page for intro_mpi for more details.