Properly sized system runs out of memory on Red Hat variants
Nodes crash with out-of-memory errors, or system monitoring shows excessive memory usage, even though the server has adequate RAM and Aerospike memory usage has been constrained using the memory-size parameter. This affects Aerospike running on Red Hat, Oracle Linux and CentOS. Migrations are not ongoing. Aerospike logs show that memory usage is not excessive, yet external tools show that the Aerospike process (asd) is consuming the memory. /var/log/messages contains entries such as the following:
var/log/messages-20160327:Mar 23 20:04:15 rtdb23.datacenter.myhost.com kernel: asd: page allocation failure: order:0, mode:0x20
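To gauge how often the kernel is logging these failures, the system logs can be searched for the message shown above. The following is a minimal sketch only: the function name count_asd_alloc_failures is hypothetical, and it assumes the log lines follow the same format as the sample entry.

```shell
#!/bin/sh
# Hypothetical helper: count kernel page allocation failures attributed to asd.
# Usage: count_asd_alloc_failures FILE...
count_asd_alloc_failures() {
    # -h drops file-name prefixes; wc -l totals the matches across all files.
    # tr strips the padding that some wc implementations add.
    grep -h 'kernel: asd: page allocation failure' "$@" 2>/dev/null | wc -l | tr -d ' '
}
```

For example, `count_asd_alloc_failures /var/log/messages*` would print the total number of matching entries across the current and rotated logs. Reading these files typically requires root privileges.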
There is a known page allocation issue within Linux kernels prior to 2.6.32-358.el6 as discussed in the following article:
The best long-term solution is to update the OS to kernel-2.6.32-358.el6 or higher. In the short term, the following system parameters can be adjusted to work around the issue.
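Whether a node is already running a kernel at or above 2.6.32-358 can be checked by comparing the output of uname -r against that version. The sketch below is one possible way to do this and makes several assumptions: the function name kernel_is_fixed is hypothetical, sort -V (GNU coreutils) is available, and only the numeric part of the release string is compared, so distribution-specific backports are not taken into account.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel release is >= 2.6.32-358.
# Usage: kernel_is_fixed [RELEASE]   (defaults to the running kernel)
kernel_is_fixed() {
    required='2.6.32-358'
    release=${1:-$(uname -r)}
    # Keep only the leading "version-build" numbers, e.g.
    # 2.6.32-279.el6.x86_64 -> 2.6.32-279
    current=$(printf '%s\n' "$release" | sed 's/^\([0-9.]*-[0-9]*\).*/\1/')
    # With version sort, the required release sorts first (or ties)
    # only when the current release is the same or newer.
    lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}
```

A call such as `kernel_is_fixed && echo "kernel is at or above 2.6.32-358"` reports on the running kernel; passing a release string as the first argument checks that string instead.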
These parameters can be tuned dynamically with sysctl -w and set permanently by adding the corresponding entries to /etc/sysctl.conf.
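The two steps, a runtime change with sysctl -w and a permanent entry in /etc/sysctl.conf, could be scripted roughly as follows. This is a sketch under stated assumptions: the function names apply_sysctl and persist_sysctl are hypothetical, and the parameter name and value are supplied by the caller because the specific parameters are not listed here.

```shell
#!/bin/sh
# Hypothetical helper: change a kernel parameter at runtime (requires root).
# The change is lost on reboot unless it is also persisted.
apply_sysctl() {
    sysctl -w "$1=$2"
}

# Hypothetical helper: write "name = value" to a sysctl configuration file,
# replacing any existing entry for the same parameter.
# Usage: persist_sysctl NAME VALUE [FILE]   (FILE defaults to /etc/sysctl.conf)
persist_sysctl() {
    name=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}
    # Escape dots so the parameter name is matched literally by grep.
    escaped=$(printf '%s' "$name" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Copy every line except an existing entry for this parameter.
    grep -v "^[[:space:]]*${escaped}[[:space:]]*=" "$conf" > "$tmp" 2>/dev/null
    printf '%s = %s\n' "$name" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its ownership and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Typical usage would be `apply_sysctl <name> <value>` followed by `persist_sysctl <name> <value>`, both run as root, with the placeholders replaced by the actual parameter and setting. Running `sysctl -p` afterwards reloads /etc/sysctl.conf and confirms that the persisted entries parse correctly.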
- Further information on page allocation failures can be found here:
MEMORY LEAK ASD CLD EXCESSIVE PAGE ALLOCATION