The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.
FAQ - How does Aerospike defragmentation behave with respect to write queues
Detail
For each device associated with an Aerospike namespace there is a write queue. The function of the write queue is to allow data written to streaming write buffers (SWBs) to be held so that it can be flushed asynchronously to disk (except when leveraging the commit-to-device feature). The write queues exist in memory and so must be bounded to avoid excessive consumption and, potentially, OOM kills. The max-write-cache configuration parameter provides this boundary. Note that some traffic, such as replica writes and migration writes, is not bounded by max-write-cache.
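As an illustration, the relevant parameters live in the namespace's storage-engine block of aerospike.conf. This is a hypothetical fragment with example values, not a recommendation:

```
# Hypothetical aerospike.conf fragment (illustrative values only).
namespace test {
    replication-factor 2
    storage-engine device {
        device /dev/sda
        write-block-size 1M    # size of each streaming write buffer (SWB)
        max-write-cache 64M    # per-device bound on pending SWBs
        defrag-sleep 1000      # microseconds to sleep after each defragmented wblock
    }
}
```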
If the load on the system is very high, or if devices are underspecified or failing, the write queues can pile up. How does defragmentation behave in this circumstance?
Answer
Prior to Aerospike 5.1, each device's write queue was viewed in isolation. When the max-write-cache limit on a queue was reached, the server would report queue too deep in the server logs and client writes would fail with Error 18 - device overload. Writes coming from replica writes, migration, or defragmentation would still be added to the queues regardless of how full they were. This can become problematic when activities that generate large-scale deletes are carried out. Such activities, for example truncation or migration, imply a heavy point load in terms of defragmentation. Outbound migration involves entire partitions being dropped in one fell swoop when partition ownership changes, and the impact on the defragmentation queue, and subsequently the write queues, can be pronounced. Left unchecked this can cause OOM kills unless defragmentation is throttled using defrag-sleep.
From Aerospike 5.1, new functionality was introduced to reduce the impact of defragmentation on the write queues and to make the write queues more resilient to point loading. The change (tracked under [AER-6234] - (STORAGE) Added throttling to prevent defrag from overwhelming the write queue) is twofold:
- max-write-cache used to be a configuration that applied per device. So in a 10-device namespace, as soon as 1 device crossed the configured max-write-cache value, the namespace would start returning queue too deep / device overload errors. In the new implementation, for a 10-device namespace, the threshold is across all devices. This means that queue too deep errors will only occur if the total number of pending SWBs exceeds (10 x max-write-cache). This allows the system to cope with either 1 device with that many pending SWBs, or a situation where all devices are lagging by max-write-cache.
- In the new implementation, defragmentation is allowed to continue until the total write queue is 100 blocks above the configured limit of (number of devices x max-write-cache). At that point, the system stops defrag writes. Replica writes and migration writes (if any) will still continue. The system checks at defrag-sleep intervals whether it is still above (100 + number of devices x max-write-cache). As soon as it is back below this threshold, defrag writes are resumed, but client writes will still be rejected until the queue is back down below (number of devices x max-write-cache).
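The threshold arithmetic above can be sketched as follows. This is a minimal Python illustration, not the server's actual implementation; the `queue_state` helper is hypothetical, it counts the queue in wblocks (the per-device allowance being max-write-cache divided by write-block-size), and the exact boundary comparisons are an assumption:

```python
# Illustrative sketch (NOT Aerospike source) of the 5.1+ namespace-wide
# write-queue accounting. Units are wblocks; the per-device allowance is
# max-write-cache / write-block-size.

DEFRAG_HEADROOM = 100  # extra wblocks defrag may add beyond the client limit


def queue_state(pending_total, num_devices, max_write_cache, write_block_size):
    """Classify the namespace-wide write queue.

    Returns (accept_client_writes, allow_defrag_writes).
    Replica and migration writes are always allowed (not modelled here).
    """
    per_device_limit = max_write_cache // write_block_size
    client_limit = num_devices * per_device_limit      # e.g. 10 devices x 64 wblocks
    defrag_limit = client_limit + DEFRAG_HEADROOM      # 100 blocks of extra headroom

    accept_client_writes = pending_total <= client_limit  # else: device overload
    allow_defrag_writes = pending_total <= defrag_limit   # checked at defrag-sleep intervals
    return accept_client_writes, allow_defrag_writes


# 10 devices, 64 MiB cache, 1 MiB blocks -> client limit 640, defrag limit 740
print(queue_state(500, 10, 64 * 1024**2, 1024**2))  # (True, True)
print(queue_state(700, 10, 64 * 1024**2, 1024**2))  # (False, True): clients rejected, defrag continues
print(queue_state(800, 10, 64 * 1024**2, 1024**2))  # (False, False): defrag paused too
```

Note the hysteresis: once defrag is paused, client writes remain rejected until the queue drops all the way back below the lower (client) limit, not just below the defrag limit.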
Notes
- Understanding defragmentation
- Understanding device write queue impact on Aerospike memory consumption
- Why do I see write fail - queue too deep
- The write queue size can be checked on a per-device basis:
  - under the storage-engine.device[ix].write_q statistic;
  - in the log file:

```
{namespace_name} /dev/sda: used-bytes 296160983424 free-wblocks 885103 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)
```
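If you are scraping logs, the write-q and defrag-q depths can be pulled out of such a line with a regular expression. A small Python sketch, assuming the line format shown above:

```python
# Sketch: extract device name, write-q and defrag-q from a device health log line.
import re

line = ("{test} /dev/sda: used-bytes 296160983424 free-wblocks 885103 "
        "write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) "
        "defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)")

m = re.search(r"(\S+): .*?\bwrite-q (\d+)\b.*?\bdefrag-q (\d+)", line)
device, write_q, defrag_q = m.group(1), int(m.group(2)), int(m.group(3))
print(device, write_q, defrag_q)  # /dev/sda 0 0
```

A sustained non-zero write-q here is the per-device symptom that precedes the queue too deep warnings discussed above.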
Keywords
WRITE_QUEUE WRITE_Q TOO DEEP DEVICE OVERLOAD OOM MIGRATE REPLICA WRITE MEMORY
Timestamp
November 2020