Memory Usage Considerations on Edison
Edison compute nodes have 64 GB of physical memory (2.67 GB per core), but not all of that memory is available to user programs. Compute Node Linux (the kernel), the Lustre file system software, and message passing library buffers all consume memory, as does loading the executable itself. The precise amount of memory available to an application therefore varies. Approximately 61 GB of memory can be allocated from within an MPI program using all 24 cores per node, i.e., about 2.5 GB per MPI task on average. If an application uses 12 MPI tasks per node, then each MPI task can use about 5 GB of memory.
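The per-task figures above follow from dividing the roughly 61 GB of usable memory by the number of MPI tasks placed on each node. The following is a minimal Python sketch of that arithmetic, assuming the 61 GB figure holds for your job (it varies in practice). The function and constant names are illustrative only and are not part of any NERSC tool.

```python
# Approximate usable memory per Edison node, as quoted above.
# This is an assumption: the real value varies from job to job.
USABLE_GB_PER_NODE = 61.0
CORES_PER_NODE = 24


def memory_per_task_gb(tasks_per_node, usable_gb=USABLE_GB_PER_NODE):
    """Return the average memory (in GB) available to each MPI task on a node."""
    if not 1 <= tasks_per_node <= CORES_PER_NODE:
        raise ValueError("tasks_per_node must be between 1 and 24")
    return usable_gb / tasks_per_node


# Show how the per-task share grows as fewer tasks are placed on a node.
for tasks in (24, 12, 6):
    share = memory_per_task_gb(tasks)
    print(f"{tasks:2d} tasks per node: about {share:.1f} GB per task")
```

Treat the result as an upper bound on the average: the memory actually available to each task depends on the executable size and library buffers described above.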
If you see the error message "OOM killer terminated this process." in your job output, your code has exhausted the memory available on the node (OOM stands for "out of memory"). A simple remedy is to use more nodes and fewer cores per node: launching fewer than 24 MPI tasks per node increases the memory available to each task. Note that your account will be charged for all 24 cores on every node you use, regardless of how many cores you actually run on. Please refer to our website for instructions on running jobs on "unpacked" nodes.
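To estimate the cost of running on unpacked nodes, you can work out how many tasks fit on a node for a given per-task memory requirement, and from that the number of nodes and charged cores. The sketch below assumes roughly 61 GB of usable memory and 24 cores per node; the function name and the returned dictionary are hypothetical and serve only to illustrate the calculation.

```python
import math

USABLE_GB_PER_NODE = 61.0  # assumed approximate usable memory per node
CORES_PER_NODE = 24        # all 24 cores are charged on every node used


def plan_unpacked_run(total_tasks, gb_per_task,
                      usable_gb=USABLE_GB_PER_NODE,
                      cores_per_node=CORES_PER_NODE):
    """Estimate tasks per node, node count, and charged cores for a job."""
    if total_tasks < 1 or gb_per_task <= 0:
        raise ValueError("total_tasks must be >= 1 and gb_per_task must be > 0")

    # Number of tasks whose memory fits on one node, capped by the core count.
    tasks_per_node = min(cores_per_node, int(usable_gb // gb_per_task))
    if tasks_per_node < 1:
        raise ValueError("a single task needs more memory than one node provides")
    tasks_per_node = min(tasks_per_node, total_tasks)

    nodes = math.ceil(total_tasks / tasks_per_node)
    return {
        "tasks_per_node": tasks_per_node,
        "nodes": nodes,
        "charged_cores": nodes * cores_per_node,  # idle cores are still charged
    }


# Example: 96 MPI tasks that each need about 4 GB.
print(plan_unpacked_run(96, 4.0))
```

In this example the job needs 7 nodes instead of the 4 it would occupy fully packed, so it is charged for 168 cores rather than 96.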
Because message passing library buffers also consume node memory, you can adjust the MPI buffer sizes by setting certain MPICH environment variables. See the "intro_mpi" man page for details.