Properly sized system runs out of memory on Red Hat variants


Problem Description

Nodes crash with out-of-memory errors, or system monitoring shows excessive memory usage, even when the server has adequate RAM and Aerospike memory usage has been constrained using the memory-size parameter. This affects Aerospike running on Red Hat, Oracle Linux and CentOS. Migrations are not ongoing. The Aerospike logs show that memory usage is not excessive, yet external tools show that the Aerospike process (asd) is consuming the memory. /var/log/messages contains entries such as the following:

var/log/messages-20160327:Mar 23 20:04:15 rtdb23.datacenter.myhost.com kernel: asd: page allocation failure: order:0, mode:0x20
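One way to confirm the symptom is to count such entries in the system log. The following is a minimal sketch, assuming the log is a plain-text file readable by the current user; the function name count_page_alloc_failures is hypothetical and only used for illustration.

```shell
#!/bin/sh
# Count kernel messages reporting a page allocation failure for asd.
# The function name is illustrative, not part of any Aerospike tooling.
count_page_alloc_failures() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so "|| true" keeps the function from failing.
    grep -c 'asd: page allocation failure' "$1" || true
}

# Only scan the live log if it exists and is readable on this host.
if [ -r /var/log/messages ]; then
    count_page_alloc_failures /var/log/messages
fi
```

A non-zero count on an affected node, combined with Aerospike logs that report normal memory usage, is consistent with the problem described here.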

Explanation

There is a known page allocation issue within Linux kernels prior to 2.6.32-358.el6 as discussed in the following article:

https://access.redhat.com/solutions/90883
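To check whether a node is running a kernel older than 2.6.32-358.el6, the release string reported by uname -r can be compared against that version. The sketch below assumes an el6-style release string (for example 2.6.32-279.el6.x86_64) and GNU sort with the -V option; the function name kernel_has_fix is hypothetical.

```shell
#!/bin/sh
# Return success if the given kernel release (default: the running kernel)
# is 2.6.32-358 or newer. Assumes a "major.minor.patch-build" style release.
kernel_has_fix() {
    release="${1:-$(uname -r)}"
    # Keep only the leading "x.y.z-build" part, e.g. 2.6.32-279.
    base=$(printf '%s\n' "$release" | sed -E 's/^([0-9]+\.[0-9]+\.[0-9]+-[0-9]+).*/\1/')
    # Version-sort the two strings; if the threshold sorts first (or they
    # are equal), the kernel is at or above the threshold.
    lowest=$(printf '%s\n%s\n' "2.6.32-358" "$base" | sort -V | head -n 1)
    [ "$lowest" = "2.6.32-358" ]
}

if kernel_has_fix; then
    echo "kernel $(uname -r) is 2.6.32-358 or newer"
else
    echo "kernel $(uname -r) is older than 2.6.32-358"
fi
```

The comparison only looks at the version numbers, so it is a quick screening aid rather than proof that a particular vendor build contains the fix.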

Solution

The best long-term solution is to update the OS to kernel-2.6.32-358.el6 or higher. In the short term, the following system parameters can be adjusted to work around the issue.

vm.min_free_kbytes

vm.zone_reclaim_mode

These parameters can be tuned dynamically with sysctl -w and set permanently by adding the corresponding entries to /etc/sysctl.conf.
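The two steps could be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the numeric values in the example are placeholders only, since this article does not prescribe specific values. The current settings can be inspected first with sysctl vm.min_free_kbytes vm.zone_reclaim_mode.

```shell
#!/bin/sh
# Sketch only: the values passed in are chosen by the operator and are
# placeholders here, not values recommended by this article.

# Print /etc/sysctl.conf entries for the two parameters.
render_vm_sysctl_conf() {
    min_free_kbytes="$1"
    zone_reclaim_mode="$2"
    printf 'vm.min_free_kbytes = %s\n' "$min_free_kbytes"
    printf 'vm.zone_reclaim_mode = %s\n' "$zone_reclaim_mode"
}

# Apply the values to the running kernel (requires root; not called here).
apply_vm_settings() {
    sysctl -w "vm.min_free_kbytes=$1"
    sysctl -w "vm.zone_reclaim_mode=$2"
}

# Example (as root), with placeholder values:
#   apply_vm_settings 1048576 0
#   render_vm_sysctl_conf 1048576 0 >> /etc/sysctl.conf
#   sysctl -p    # reload /etc/sysctl.conf
```

Keeping the runtime change and the persistent entry separate makes it possible to try a value with sysctl -w first and only write it to /etc/sysctl.conf once it has proven stable.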

Notes

  • Further information on page allocation failures can be found here:

Keywords

MEMORY LEAK ASD CLD EXCESSIVE PAGE ALLOCATION

Timestamp

4/22/16