How To Tune the Linux OS for Aerospike Servers - Best Practices

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Context

When running Aerospike, overall performance and responsiveness can be improved by tuning some of the kernel parameters. This article describes what can be tuned and suggests some optimum values.

Method

The following parameters can be used to tune the operating system for Aerospike usage.

min_free_kbytes

The min_free_kbytes parameter determines how much RAM the kernel always keeps free, rather than allowing it to be consumed by the read/write page cache. Setting this parameter appropriately improves overall system stability and allows Aerospike to work more efficiently.

The default for min_free_kbytes, while suitable for most applications, is too low for a low-latency database such as Aerospike. To take full advantage of system caching while still allowing Aerospike to perform memory allocations without delay, it is advisable to set this parameter to at least 1.1GB. This gives at least 1GB for Aerospike key allocation in shared memory and at least another 100MB of RAM for all other purposes, including small OS allocations. As the parameter is expressed in kilobytes, this corresponds to a value of 1153434.

Specific details on how to set the parameter are documented in the Tuning Kernel Memory for Performance article.
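As a minimal sketch of the arithmetic above, the following shell snippet derives the 1153434 figure from 1.1GB and compares it with the value currently in use. The function name check_min_free_kbytes and the variable RECOMMENDED_KB are hypothetical names chosen for illustration. The snippet only reads the live value and does not change it.

```shell
# 1.1 GB expressed in kB, rounded up: 1.1 * 1024 * 1024 = 1153433.6 -> 1153434.
# Integer arithmetic only: multiply by 11, add 9, divide by 10 to round up.
RECOMMENDED_KB=$(( (11 * 1024 * 1024 + 9) / 10 ))

# Report whether a given min_free_kbytes value meets the recommended floor.
# (Hypothetical helper, not part of any Aerospike tooling.)
check_min_free_kbytes() {
    if [ "$1" -ge "$RECOMMENDED_KB" ]; then
        echo "ok"
    else
        echo "too low"
    fi
}

# Read the live value only when the file is available (read-only check).
if [ -r /proc/sys/vm/min_free_kbytes ]; then
    current=$(cat /proc/sys/vm/min_free_kbytes)
    echo "min_free_kbytes=$current: $(check_min_free_kbytes "$current")"
fi
```

The check is deliberately separate from any change to the setting, so it can be run safely on a production node before deciding whether tuning is needed.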

swappiness

Aerospike advises either reducing swappiness to 0 or not using swap at all. For low-latency operations, using swap to any extent will drastically reduce performance. Reducing swappiness also makes the kernel prefer reclaiming page cache, both clean and dirty, over swapping out application memory.

Another parameter aimed at reclaiming memory is zone_reclaim (the zone_reclaim_mode kernel setting). Aerospike advises disabling it, as it causes aggressive reclaims and memory scans. On most modern Linux distributions, zone_reclaim is disabled by default.

Details on how to set swappiness to 0 and ensure that zone_reclaim is disabled are documented in the Tuning Kernel Memory for Performance article.
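The current state of both settings can be inspected without changing anything. The sketch below assumes that zone_reclaim corresponds to the kernel's vm.zone_reclaim_mode setting and that both values are exposed under /proc/sys/vm. The helper names read_vm_param and report_param are hypothetical.

```shell
# Print the value of a vm.* kernel parameter, or nothing if it is unavailable
# (zone_reclaim_mode, for example, is only present on NUMA-enabled kernels).
read_vm_param() {
    file="/proc/sys/vm/$1"
    if [ -r "$file" ]; then
        cat "$file"
    fi
}

# Compare an observed value with the value recommended in this article.
report_param() {
    name=$1
    actual=$2
    expected=$3
    if [ -z "$actual" ]; then
        echo "$name: not available"
    elif [ "$actual" = "$expected" ]; then
        echo "$name: ok ($actual)"
    else
        echo "$name: $actual (expected $expected)"
    fi
}

report_param swappiness "$(read_vm_param swappiness)" 0
report_param zone_reclaim_mode "$(read_vm_param zone_reclaim_mode)" 0
```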

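AWS only - ENA (Elastic Network Adapter)

AWS provides enhanced networking through the Elastic Network Adapter (ENA), which uses the ena driver. Some older instance types provide enhanced networking through the Intel 82599 Virtual Function interface instead, which uses the ixgbevf driver. Installing and using the appropriate driver drastically improves the networking speed of AWS instances.

Enhanced networking is enabled if the relevant module is in use. This can be checked as follows, where the driver line should report ena or ixgbevf:

$ ethtool -i eth0 | grep -E '^driver: (ena|ixgbevf)'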

If the module is not being used, it can be installed or upgraded by following this manual.
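The interface is not always named eth0, so the check can be repeated for every interface. The following is a rough sketch, assuming that ethtool is installed, that the ENA driver reports itself as ena, and that the older Intel Virtual Function driver reports itself as ixgbevf. The helper names driver_from_ethtool and classify_driver are hypothetical.

```shell
# Extract the driver name from `ethtool -i` output supplied on stdin.
driver_from_ethtool() {
    awk -F': ' '$1 == "driver" { print $2 }'
}

# Map a driver name to a short description (assumed driver names, see above).
classify_driver() {
    case "$1" in
        ena) echo "ENA" ;;
        ixgbevf) echo "Intel 82599 VF" ;;
        *) echo "no enhanced networking driver" ;;
    esac
}

# Report the driver for every interface except loopback, if ethtool exists.
if command -v ethtool >/dev/null 2>&1; then
    for path in /sys/class/net/*; do
        iface=${path##*/}
        [ "$iface" = "lo" ] && continue
        driver=$(ethtool -i "$iface" 2>/dev/null | driver_from_ethtool) || driver=""
        echo "$iface: ${driver:-unknown} ($(classify_driver "$driver"))"
    done
fi
```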

THP - transparent huge pages

The Linux kernel has a feature called Transparent Huge Pages (THP), intended to improve overall system responsiveness and allocation speed. Unfortunately, for high-throughput, low-latency databases that perform many small allocations, it can be counterproductive. With THP enabled, the system can run out of RAM, with symptoms similar to a memory leak. Another issue is latency caused by page locking during THP defragmentation.

In order to disable THP, refer to the Disabling Transparent Huge Pages for Aerospike article.
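Before making any changes, the current THP mode can be read from sysfs. The sketch below assumes the usual location /sys/kernel/mm/transparent_hugepage/enabled, where the kernel lists the available modes and marks the active one in square brackets (for example "[always] madvise never"). The helper name thp_active_setting is hypothetical.

```shell
# Extract the active mode, i.e. the word in square brackets, from a THP
# sysfs line such as "[always] madvise never".
thp_active_setting() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

THP_FILE=/sys/kernel/mm/transparent_hugepage/enabled

# Read-only check: report the active mode if the file is present.
if [ -r "$THP_FILE" ]; then
    echo "THP mode: $(thp_active_setting "$(cat "$THP_FILE")")"
fi
```

A reported mode of never indicates that THP is disabled. Any other mode suggests that the steps in the referenced article still need to be applied.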

IRQ Balancing

Some network drivers automatically balance interrupts from network card queues across CPU cores. If they do not, multiple queues for a network interface can end up on a single CPU core. When this happens, hardware interrupt (%irq) or software interrupt (%soft) handling can consume up to 100% of that core, which is visible in the output of the mpstat command. In that case, IRQ balancing should be enabled.
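Where mpstat is not installed, a rough picture of the same imbalance can be obtained from /proc/stat. The following sketch assumes the standard per-CPU line layout, in which the sixth and seventh counters after the CPU name are irq and softirq time. Unlike mpstat, it reports shares accumulated since boot rather than over a sampling interval, so it indicates only long-running imbalance. The helper name irq_share_per_cpu is hypothetical.

```shell
# Print the share of time each CPU has spent in irq and softirq handling,
# given /proc/stat-formatted text on stdin. Values are cumulative since boot.
irq_share_per_cpu() {
    awk '/^cpu[0-9]+ / {
        total = 0
        # Fields 2..9: user nice system idle iowait irq softirq steal
        for (i = 2; i <= 9; i++) total += $i
        if (total > 0)
            printf "%s irq=%.1f%% softirq=%.1f%%\n", $1, 100 * $7 / total, 100 * $8 / total
    }'
}

if [ -r /proc/stat ]; then
    irq_share_per_cpu < /proc/stat
fi
```

A single core with a much higher irq or softirq share than the others suggests that interrupt handling is concentrated on that core.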

On newer kernels, installing the irqbalance package should be enough for correct balancing to happen. On older kernels, this may not work as expected.

To check and manually reassign IRQ balancing for network card queues, follow this manual.

Note that if the /proc/interrupts file shows only one queue for a network interface, the network card driver should be checked, as more queues should be configured. The network card vendor can advise on this.

The following article discusses how to read /proc/interrupts and what the values inside it mean.
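The number of queues for an interface can be estimated by counting its interrupt lines. The sketch below assumes that the driver labels each queue interrupt with a name beginning with the interface name (for example eth0-Tx-Rx-0) in the last column of /proc/interrupts. Some drivers use different labels, in which case the count will be 0 and the file needs to be inspected by hand. The helper name count_queues is hypothetical.

```shell
# Count the interrupt lines whose label (last column) starts with the given
# interface name. Expects /proc/interrupts-formatted text on stdin.
count_queues() {
    awk -v ifc="$1" 'index($NF, ifc) == 1 { n++ } END { print n + 0 }'
}

# eth0 is used here as in the example earlier in this article.
if [ -r /proc/interrupts ]; then
    echo "eth0 interrupt lines: $(count_queues eth0 < /proc/interrupts)"
fi
```

A count of 1 on a multi-core server matches the single-queue situation described above.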

Notes

The above steps change system-wide kernel parameters. While these changes should be safe, it is advisable to make them in a development environment first, to ensure there are no unforeseen effects for the particular use case and workload.

Keywords

ENA THP IRQ BALANCING MIN_FREE_KBYTES SWAPPINESS ENHANCED NETWORKING TRANSPARENT HUGE PAGES

Timestamp

June 24 2019