Page Allocation Failure


#1

Symptoms

General cluster instability; client connections time out.

The following messages appear in the kernel log (dmesg):

cld: page allocation failure. order:1, mode:0x20

or

asd: page allocation failure. order:1, mode:0x20
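
The order field in these messages is the base-2 logarithm of the number of physically contiguous pages requested, so order:1 means two adjacent pages. The sketch below pulls the order out of a log line and converts it to a size. The function names are illustrative, and the 4 KB default page size is an assumption (it is typical on x86_64; getconf PAGESIZE reports the real value).

```shell
# Print the allocation order from kernel log lines read on stdin.
# Accepts either "." or ":" after "failure", since the punctuation
# differs between kernel versions.
failure_order() {
    sed -n 's/.*page allocation failure[.:] order:\([0-9][0-9]*\).*/\1/p'
}

# order_to_kb ORDER [PAGE_KB]
# Size in kB of one contiguous block of the given order (2^ORDER pages).
order_to_kb() {
    echo $(( (1 << $1) * ${2:-4} ))
}
```

For example, dmesg | failure_order | sort -n | uniq -c summarizes which orders are failing, and order_to_kb 1 reports 8 for the messages shown above.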

Example trace from 2.x:

Call Trace:
[<ffffffff8112c207>] ? __alloc_pages_nodemask+0x757/0x8d0
[<ffffffffa0228ff9>] ? bond_3ad_xmit_xor+0x1c9/0x220 [bonding]
[<ffffffff81166ab2>] ? kmem_getpages+0x62/0x170
[<ffffffff811676ca>] ? fallback_alloc+0x1ba/0x270
[<ffffffff8116711f>] ? cache_grow+0x2cf/0x320
[<ffffffff81167449>] ? ____cache_alloc_node+0x99/0x160
[<ffffffff811683cb>] ? kmem_cache_alloc+0x11b/0x190
[<ffffffff81439d58>] ? sk_prot_alloc+0x48/0x1c0
[<ffffffff8143ae32>] ? sk_clone+0x22/0x2e0
[<ffffffff81489d66>] ? inet_csk_clone+0x16/0xd0
[<ffffffff814a2c73>] ? tcp_create_openreq_child+0x23/0x450
[<ffffffff814a046d>] ? tcp_v4_syn_recv_sock+0x4d/0x310
[<ffffffff814a2a16>] ? tcp_check_req+0x226/0x460
[<ffffffff81498436>] ? tcp_rcv_state_process+0x126/0xa10
[<ffffffff8149ff0b>] ? tcp_v4_do_rcv+0x35b/0x430
[<ffffffff81438e35>] ? release_sock+0x65/0xe0
[<ffffffff8148a70f>] ? inet_csk_accept+0x8f/0x240
[<ffffffff811826b8>] ? alloc_file+0x98/0xe0
[<ffffffff814b08b4>] ? inet_accept+0x34/0xe0
[<ffffffff814369f5>] ? sys_accept4+0x155/0x2b0
[<ffffffff810dc8f7>] ? audit_syscall_entry+0x1d7/0x200
[<ffffffff81436b60>] ? sys_accept+0x10/0x20
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

Details

This failure occurs when the OS cannot reclaim memory fast enough to satisfy an allocation that cannot wait.

All Linux systems attempt to use all available physical memory, largely by building a filesystem buffer cache, which is an I/O cache that improves system performance. This memory is allocated for caching but is not strictly in use: the kernel can reclaim it when a process needs it.

Some allocations cannot “wait for reclaim”. “Wait for reclaim” refers to the process of reclaiming cache memory that is “not in use” so that it can be allocated to a process. This is supposed to be transparent, but in practice many code paths cannot wait for the memory to become available, for example while handling a network interrupt. The kernel tries to allocate the memory, and if it is not instantly available as one physically contiguous chunk (an atomic allocation), the allocation fails and a page allocation failure is logged.
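
Whether a contiguous request can succeed depends on how fragmented free memory is, which /proc/buddyinfo exposes: each row lists, per zone, the number of free blocks at order 0, 1, 2, and so on. A minimal sketch, assuming that standard row layout (the function name is illustrative), that counts free blocks large enough for a given order:

```shell
# free_blocks_at_least ORDER
# Reads buddyinfo-format rows on stdin, e.g.
#   Node 0, zone   Normal   10 5 2 1 0 0 0 0 0 0 0
# Fields 1-4 are the node/zone labels; field 5 is the order-0 count,
# field 6 the order-1 count, and so on.
free_blocks_at_least() {
    awk -v min="$1" '
        $1 == "Node" { for (i = 5 + min; i <= NF; i++) sum += $i }
        END { print sum + 0 }
    '
}
```

Typical use is free_blocks_at_least 1 < /proc/buddyinfo. A result near zero while the failures are occurring suggests fragmentation rather than an outright shortage of memory.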

Reserving a certain amount of memory with min_free_kbytes keeps that memory instantly available and reduces memory pressure when new processes need to start, run, and finish under high memory load with a full buffer cache.

Some Links:

http://stackoverflow.com/questions/21374491/vm-min-free-kbytes-why-keep-minimum-reserved-memory

http://activedoc.opensuse.org/book/opensuse-system-analysis-and-tuning-guide/chapter-15-tuning-the-memory-management-subsystem

http://www.linuxatemyram.com

Some more details:

Atomic allocations are requests for memory that must be satisfied without giving up control (i.e. the current thread cannot be suspended). This happens most often in interrupt routines, but it applies to all cases where memory is needed while holding an essential lock. These allocations must be immediate, because the caller cannot afford to wait for the swapper to free up memory.

Here is the memory allocation process in short:

  • __alloc_pages first iterates over each memory zone, looking for the first one that contains eligible free pages.

  • __alloc_pages then wakes up the kswapd task […to…] tap into the reserve memory pools maintained for each zone.

  • If the memory allocation still does not succeed, __alloc_pages will either give up […] In this process __alloc_pages executes a cond_resched(), which may cause a sleep, which is why this branch is forbidden for GFP_ATOMIC allocations.

Tools

To check memory used:

  • top, then check the total and used figures on the Mem: line and the cached figure on the Swap: line.

  • free -m
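
The figures from free can also be read directly from /proc/meminfo. As a rough sketch (the function name is illustrative), adding MemFree, Buffers, and Cached approximates the memory that is free or reclaimable. It overestimates somewhat, because part of Cached, such as shared memory, cannot be dropped.

```shell
# reclaimable_kb
# Reads meminfo-format lines on stdin and prints MemFree + Buffers + Cached
# in kB. The exact match on field 1 keeps SwapCached out of the sum.
reclaimable_kb() {
    awk '
        $1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
        END { print sum + 0 }
    '
}
```

Typical use is reclaimable_kb < /proc/meminfo.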

To free up cached memory:

sync; echo 3 | sudo tee /proc/sys/vm/drop_caches

To reserve memory for such allocation:

/proc/sys/vm/min_free_kbytes

This controls the amount of memory that is kept free for use by special reserves, including “atomic” allocations (those which cannot wait for reclaim).

To check the current value:

/sbin/sysctl vm.min_free_kbytes
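
When judging a value, it can help to express the reserve as a share of total RAM, since a reserve that is too large leaves less memory for applications and can itself lead to out-of-memory conditions. A small sketch; the function name is illustrative, and both arguments are in kB, as reported by sysctl -n vm.min_free_kbytes and the MemTotal line of /proc/meminfo:

```shell
# reserve_percent MIN_FREE_KB MEMTOTAL_KB
# Prints the reserve as a percentage of total memory, to two decimals.
reserve_percent() {
    awk -v r="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * r / t }'
}
```

For example: reserve_percent "$(sysctl -n vm.min_free_kbytes)" "$(awk '$1 == "MemTotal:" { print $2 }' /proc/meminfo)"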

To make the change persistent, open /etc/sysctl.conf, comment out any existing reference to vm.min_free_kbytes, and then add the desired value to the end of the file.

For example, for 256 MB:

vm.min_free_kbytes = 262144

After saving the change, run sudo sysctl -p to apply it.
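
The sysctl.conf edit can also be scripted. The sketch below is one way to do it, not an official procedure: the function name is illustrative, it takes the reserve in MB, comments out any active vm.min_free_kbytes line, and appends the new value. The file path defaults to /etc/sysctl.conf but can be overridden, which makes it easy to try on a copy first.

```shell
# set_min_free_mb MB [CONF_FILE]
set_min_free_mb() {
    kb=$(( $1 * 1024 ))               # the setting is in kB: 256 MB -> 262144
    conf=${2:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Comment out existing active settings, then append the new one.
    if sed 's/^[[:space:]]*vm\.min_free_kbytes/# &/' "$conf" > "$tmp"; then
        printf 'vm.min_free_kbytes = %s\n' "$kb" >> "$tmp"
        cat "$tmp" > "$conf"          # overwrite in place, keeping ownership and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Run it as root, for example set_min_free_mb 256, and then apply the file with sysctl -p.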

Other parameters that may be helpful: vm.swappiness and vm.vfs_cache_pressure.

