NERSC: Powering Scientific Discovery Since 1974

Memory Considerations

Overview

Carver login nodes each have 48GB of physical memory. Most compute nodes have 24GB; however, 80 compute nodes have 48GB. Not all of this memory is available to user processes. Some memory is reserved for the Linux kernel. Furthermore, since Carver nodes have no disk, the "root" file system (including /tmp) is kept in memory ("ramdisk"). The kernel and root file system combined occupy about 4GB of memory. Therefore users should try to use no more than 20GB on most compute nodes, or 44GB on the large-memory compute nodes.

There are also two "extra-large" memory nodes; each node has four 8-core Intel X7550 ("Nehalem EX") 2.0 GHz processors (32 cores total) and 1TB memory.  These nodes are available through the queue "reg_xlmem". Please refer to the "Extra-large Memory Nodes" page. 

Memory Limits

Carver compute nodes have no disk for swapping virtual memory. If a user job tries to use more physical memory than is available, it can cause severe problems for the operating system, possibly leading to system crashes and/or hangs. Therefore, per-process memory limits are enforced on all login and compute nodes.

Type of Node         Soft Limit   Hard Limit
Login Node           2GB          2GB
24GB Compute Node    2.5GB        20GB
48GB Compute Node    5.5GB        44GB
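
To see which limits apply in the current shell, something like the following sketch may help. It assumes the per-process limits are exposed through the shell's virtual memory limit (`ulimit -v`, reported in kilobytes), which this page does not state. The helper name `fmt_limit` is hypothetical.

```shell
#!/bin/bash
# Sketch: print the soft and hard per-process virtual memory limits.
# Assumption: the limits are enforced through the shell's "ulimit -v" setting.

# fmt_limit (hypothetical helper): convert a ulimit value in KB to GB.
fmt_limit() {
    if [ "$1" = "unlimited" ]; then
        echo "unlimited"
    else
        awk -v kb="$1" 'BEGIN { printf "%.2fGB\n", kb / 1024 / 1024 }'
    fi
}

echo "soft limit: $(fmt_limit "$(ulimit -S -v)")"
echo "hard limit: $(fmt_limit "$(ulimit -H -v)")"
```

If the assumption holds, the two values printed on a compute node should correspond to the soft and hard limits listed for that node type.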

The above compute node "soft" limits were chosen to allow typical MPI programs running fully "packed" (i.e., 8 processes per node) safely to access the maximum amount of memory. There are cases where it is desirable to run fewer processes per node than the number of cores. These include:

  •     A single process (possibly multithreaded) that needs access to the entire hard limit.
  •     An MPI application where each process needs access to more than the default soft limit.
  •     A "mixed-model" application where each MPI process is multithreaded.

In the above cases, it will be necessary to override the default soft limit. This may be done with the PBS resource pvmem. This resource requires an integer value, so it is sometimes necessary to specify it in megabytes instead of gigabytes. The following table shows appropriate values for pvmem depending on the number of processes per node (PBS resource ppn):

ppn   24GB Compute Node       48GB Compute Node
1     pvmem=20GB              pvmem=44GB
2     pvmem=10GB              pvmem=22GB
3     pvmem=6826MB            pvmem=15018MB
4     pvmem=5GB               pvmem=11GB
5     pvmem=4GB               pvmem=9011MB
6     pvmem=3413MB            pvmem=7509MB
7     pvmem=2925MB            pvmem=6436MB
8     not needed (default)    not needed (default)

Note:  The product ppn*pvmem must be no greater than 20GB for 24GB nodes, or no greater than 44GB for the 48GB nodes. Jobs that specify total memory sizes greater than these values will be queued but will never run.
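
The pvmem values in the table can be reproduced by dividing the node's hard limit by ppn and rounding down to a whole number of megabytes. The sketch below shows that arithmetic. It assumes 1GB = 1024MB, which matches the megabyte values in the table. The helper name `pvmem_for` is hypothetical and is not a PBS command.

```shell
#!/bin/bash
# Sketch: largest integer pvmem value for a given ppn and per-node hard limit.
# Assumption: 1GB = 1024MB, consistent with the megabyte values in the table.

# pvmem_for (hypothetical helper): $1 = ppn, $2 = node hard limit in GB (20 or 44).
pvmem_for() {
    local ppn=$1 limit_gb=$2
    local limit_mb=$(( limit_gb * 1024 ))
    local per_proc_mb=$(( limit_mb / ppn ))   # integer division rounds down
    if [ $(( per_proc_mb % 1024 )) -eq 0 ]; then
        # A whole number of gigabytes can be written in GB.
        echo "pvmem=$(( per_proc_mb / 1024 ))GB"
    else
        echo "pvmem=${per_proc_mb}MB"
    fi
}

pvmem_for 3 20   # 24GB node, 3 processes per node: prints pvmem=6826MB
pvmem_for 4 44   # 48GB node, 4 processes per node: prints pvmem=11GB
```

Rounding down keeps ppn*pvmem at or below the hard limit, so the job remains runnable.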

For example, a job that requires 8 MPI processes, each with access to 10GB of memory, needs four 8-core nodes, because only 2 such processes fit within the 20GB limit of a 24GB node:

#PBS -l nodes=4:ppn=2
#PBS -l pvmem=10GB
#PBS -l walltime=00:30:00

cd $PBS_O_WORKDIR
mpirun -np 8 ./a.out

Interactively the command is:

qsub -I -V -q interactive -l nodes=4:ppn=2 -l pvmem=10GB -l walltime=00:30:00 
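
The node count in this example follows from two steps: find how many processes fit within one node's hard limit, then round the number of nodes up. A minimal sketch of that calculation is shown below, assuming 8 cores per compute node and 1GB = 1024MB. The helper name `nodes_needed` is hypothetical.

```shell
#!/bin/bash
# Sketch: derive a "nodes=N:ppn=P" request from the process count and
# per-process memory. Assumes 8 cores per compute node and 1GB = 1024MB.

# nodes_needed (hypothetical helper):
#   $1 = total MPI processes
#   $2 = memory per process in MB
#   $3 = node hard limit in GB (20 or 44)
nodes_needed() {
    local nprocs=$1 mem_mb=$2
    local limit_mb=$(( $3 * 1024 ))
    local ppn=$(( limit_mb / mem_mb ))          # processes that fit on one node
    if [ "$ppn" -gt 8 ]; then
        ppn=8                                   # at most one process per core
    fi
    if [ "$ppn" -lt 1 ]; then
        echo "per-process memory exceeds the node limit" >&2
        return 1
    fi
    local nodes=$(( (nprocs + ppn - 1) / ppn )) # round up
    echo "nodes=${nodes}:ppn=${ppn}"
}

nodes_needed 8 10240 20   # 8 processes at 10GB each: prints nodes=4:ppn=2
```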

Large-Memory Nodes

160 Carver compute nodes have 48 GB of memory, rather than the 24 GB found on most nodes. To request these large-memory nodes, use the "bigmem" option when requesting nodes:

#PBS -l nodes=4:ppn=8:bigmem
#PBS -q regular
#PBS -l walltime=00:10:00

cd $PBS_O_WORKDIR
mpirun -np 32 ./my_big_executable

In this script, the user is requesting 4 nodes that each contain 48 GB of memory. Note that it might take longer for such a job to start, as the batch system must wait for the desired nodes to become available.
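
Whether a job needs the "bigmem" option can be estimated from the product ppn*pvmem: up to 20GB fits on a 24GB node, up to 44GB needs a 48GB node, and anything larger will never run. The following sketch applies that rule, assuming 1GB = 1024MB. The helper name `node_class` and its output labels are hypothetical.

```shell
#!/bin/bash
# Sketch: classify a request by its total per-node memory (ppn * pvmem).
# Assumption: 1GB = 1024MB.

# node_class (hypothetical helper): $1 = ppn, $2 = pvmem in MB.
node_class() {
    local total_mb=$(( $1 * $2 ))
    if [ "$total_mb" -le $(( 20 * 1024 )) ]; then
        echo "standard"      # fits on a 24GB node
    elif [ "$total_mb" -le $(( 44 * 1024 )) ]; then
        echo "bigmem"        # needs a 48GB node
    else
        echo "too large"     # exceeds the 44GB hard limit
    fi
}

node_class 8 5632   # 8 processes at 5.5GB each: prints bigmem
```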

Extra-Large Memory Nodes

Please refer to the Extra-Large Memory Node page. 

Serial Jobs

Jobs submitted to the serial queue should specify memory requirements for efficient job scheduling.  These jobs will run on 12-core, 48GB nodes.  By default, a serial job will be limited to 3.5GB, but this can be increased up to 44GB through the use of the "pvmem" directive.  The following job requests a single core, and 10GB of memory, for 12 hours:

#PBS -q serial
#PBS -l pvmem=10GB
#PBS -l walltime=12:00:00

cd $PBS_O_WORKDIR
./a.out

Note that the argument to pvmem must be an integer.  Therefore it is sometimes necessary to use different units.  For example, if you want 7.5GB, you should specify:

#PBS -l pvmem=7500MB

Note:  A serial job that specifies a pvmem value greater than 44GB will be queued, but will never run.
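
Because such a job is accepted by the batch system but never starts, it can be useful to check the value before submitting. The sketch below validates a pvmem string against the two rules on this page: the number must be an integer, and the total must not exceed 44GB. It assumes 1GB = 1024MB and accepts only the GB and MB suffixes. The helper name `serial_pvmem_ok` is hypothetical.

```shell
#!/bin/bash
# Sketch: check a serial-queue pvmem value before submitting.
# Assumptions: 1GB = 1024MB; only the GB and MB suffixes are handled.

# serial_pvmem_ok (hypothetical helper): $1 = pvmem value such as 10GB or 7500MB.
# Returns 0 if acceptable, 1 if above 44GB, 2 if malformed.
serial_pvmem_ok() {
    local value=$1 number unit mb
    number=${value%??}               # everything except the last two characters
    unit=${value#"$number"}          # the last two characters
    case "$number" in
        ''|*[!0-9]*)
            echo "pvmem must be an integer with a unit: $value" >&2
            return 2 ;;
    esac
    case "$unit" in
        GB) mb=$(( number * 1024 )) ;;
        MB) mb=$number ;;
        *)  echo "unrecognized unit in: $value" >&2
            return 2 ;;
    esac
    [ "$mb" -le $(( 44 * 1024 )) ]
}

if serial_pvmem_ok 7500MB; then
    echo "7500MB is within the serial limit"
fi
```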