If your job exceeds the memory request you gave in your PBS job script, you may receive a message like the following:
This message indicates that your job is using more memory than you requested in your job script's #PBS -lmem directive or on your qsub command line.
When your job exceeds its memory allocation on any one of its nodes, the message above is produced and the kernel will likely kill one of the processes in your job. Which process is killed depends on the memory pressure, on which process is attempting to allocate more memory at the time, and potentially on other factors. This will likely cause your job to fail.
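As a sketch of where the memory request is set, a request can be given either in the job script or on the qsub command line. The 16GB figure and the script name below are illustrative only, not recommendations:

```shell
#!/bin/bash
# Illustrative job script: request 16GB of memory for the whole job.
#PBS -lmem=16GB

# ... the rest of your job commands go here ...
```

Equivalently, the same request can be made when submitting a hypothetical script: `qsub -lmem=16GB myjob.sh`.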
How is memory allocated to my job?
When you submit a PBS job, you tell PBS the maximum amount of memory your job will use across all of its nodes. This total memory request is divided by the number of nodes your job requests and allocated evenly to each node.
For example, if you request 128 CPUs and 128GB of RAM in the normal queue, your job will use 8 nodes (there are 16 CPU cores in each node in the normal queue on Raijin) and 16GB of RAM will be allocated on each of the 8 nodes.
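The arithmetic in the example above can be sketched as follows; the variable names are made up for illustration:

```shell
# Per-node memory allocation for the example request above.
cpus=128          # CPUs requested
mem_gb=128        # total memory requested, in GB
cores_per_node=16 # cores per node in the normal queue
nodes=$((cpus / cores_per_node))
mem_per_node=$((mem_gb / nodes))
echo "${nodes} nodes, ${mem_per_node}GB per node"
```

This prints `8 nodes, 16GB per node`, matching the worked example.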
How do I read the memory request exceeded message?
The first line of the message tells you which job exceeded its memory allocation, and on which node. This can help you determine which of the nodes in a multi-node job is exceeding memory if there is a memory imbalance between nodes.
The subsequent lines list every process running as part of your job on that node. The process name and process ID help identify which process in your job is using significant memory or is out of balance with the others. The RSS value is the "resident set size", the actively used memory of that process, in bytes. The vmem value is the virtual memory address space of the process; this may be significantly larger than your actual memory usage, as some processes allocate virtual address space that they never actually use.
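To inspect the same quantities for your own processes while a job is running, the widely supported ps output keywords rss (resident set size) and vsz (virtual memory size) can be used; note that ps reports these in kilobytes, whereas the exceeded-memory message reports bytes:

```shell
# List your own processes with their RSS and virtual memory size (in KB).
# rss and vsz are widely supported ps output keywords.
ps -u "$(id -un)" -o pid,comm,rss,vsz
```

Running this on a login node (outside a job) shows your shell and editors, which is a quick way to get a feel for RSS versus vmem before interpreting the message.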
Why does the job summary at the end of my output not show the same usage as the exceeded message?
The memory usage in the job summary at the end of your job output is generated by sampling the job's overall memory usage at regular intervals. If your memory usage spikes after the last sample and the kernel then kills your job for exceeding its memory allocation, the job summary will show less than your final memory usage. The usage shown in the exceeded-memory message is current for the indicated node at the moment your job exceeded its allocation.
What should I do?
As a first step you can increase your memory request to accommodate your actual memory usage. Remember that if you ask for significantly more memory than you need, your job may start more slowly, since more free memory must be available before it can run. This is especially important if you request more than 64GB per node in the normal queue on Raijin, as only 72 nodes have more memory than this. If you do not need that much memory you will unnecessarily delay the start of your job, as well as the jobs of other users who do need it.
If your memory usage looks higher than expected, you may wish to look into ways to reduce it, either through configuration or input file options, or by changing the code if you are developing your own. Note that for applications that let you tune memory usage through such options, using more memory often does not improve performance; you can experiment with different values to find the most efficient settings for your jobs.