Device Overload



Aerospike does not flush swb (streaming write buffers) to disk synchronously. A buffer is flushed when it is full (or when triggered by other tuning parameters, which we set aside for now). Write transactions are, however, committed to memory on the master and replica(s) before the server returns to the client.

When a storage device is not keeping up, Aerospike holds pending swb in a cache sized by max-write-cache. It keeps accepting writes until that cache is full and only then returns device overload errors. For this reason you may not see much direct latency impact before the errors appear.
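As a rough illustration of the behaviour described above, here is a toy model (not Aerospike code) that estimates how long the cache can absorb a burst. It assumes constant rates, and the function name, the rates, and the 512-block cache (a 64M cache with 128KB blocks) are illustrative assumptions only.

```python
def seconds_until_cache_full(incoming_blocks_per_s, flushed_blocks_per_s, cache_blocks):
    """Toy model: time until the write cache fills, or None if the device keeps up.

    All three arguments are hypothetical inputs; real workloads are not
    constant-rate, so treat the result as an order-of-magnitude estimate.
    """
    backlog_growth = incoming_blocks_per_s - flushed_blocks_per_s
    if backlog_growth <= 0:
        # The device flushes at least as fast as blocks fill: no backlog builds up.
        return None
    return cache_blocks / backlog_growth


# Assumed numbers: 100 blocks/s filled, 60 blocks/s flushed, 512 blocks of cache.
print(seconds_until_cache_full(100, 60, 512))
# A device that keeps up never fills the cache in this model.
print(seconds_until_cache_full(40, 60, 512))
```

In this model, doubling the cache only doubles the time before overload; it does not help if the device is persistently slower than the write load.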

You can dynamically increase this cache from its default (64M) to a larger multiple of the write-block-size (128KB by default for SSD devices).

For example:

asinfo -v 'set-config:context=namespace;id=test;max-write-cache=128M'

This increases the number of swb in the cache from 512 to 1024 (assuming a 128KB block size).
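The arithmetic behind those numbers is simply the cache size divided by the block size. A minimal sketch follows; the helper names are hypothetical and the size parser only handles the K/M/G suffixes used in this post.

```python
def parse_size(text):
    """Convert a size string such as '128K', '128KB' or '64M' to bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().upper()
    if text.endswith("B"):
        text = text[:-1]  # accept '128KB' as well as '128K'
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def swb_in_cache(max_write_cache, write_block_size):
    """Number of streaming write buffers that fit in the write cache."""
    cache = parse_size(max_write_cache)
    block = parse_size(write_block_size)
    if cache % block != 0:
        raise ValueError("max-write-cache should be a multiple of write-block-size")
    return cache // block


print(swb_in_cache("64M", "128KB"))   # 512
print(swb_in_cache("128M", "128KB"))  # 1024
```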

Some links with a bit more information on these configuration parameters:

http://www.aerospike.com/docs/reference/configuration/#max-write-cache
http://www.aerospike.com/docs/reference/configuration/#write-block-size

You can also check the w-q stat in the logs to see how many of the cached swb are in use:

device /dev/sdc: used 296160983424, contig-free 110637M (885103 wblocks), swb-free 16,
w-q 0 w-tot 12659541 (43.3/s), defrag-q 0 defrag-tot 11936852 (39.1/s)

Details at: http://www.aerospike.com/docs/reference/serverlogmessages/
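To watch w-q over time, the counters can be pulled out of a log line like the one above. This is a sketch that assumes the line has exactly the format shown here (joined onto one line); the function name is hypothetical, and other server versions may log these fields differently.

```python
import re

# Counter names as they appear in the sample log line in this post.
COUNTERS = ("swb-free", "w-q", "w-tot", "defrag-q", "defrag-tot")


def parse_device_line(line):
    """Return the integer counters found in a 'device ...' log line."""
    fields = {}
    for name in COUNTERS:
        # The lookbehind stops a short name matching inside a longer one.
        match = re.search(r"(?<![\w-])" + re.escape(name) + r"\s+(\d+)", line)
        if match:
            fields[name] = int(match.group(1))
    return fields


sample = (
    "device /dev/sdc: used 296160983424, contig-free 110637M (885103 wblocks), "
    "swb-free 16, w-q 0 w-tot 12659541 (43.3/s), "
    "defrag-q 0 defrag-tot 11936852 (39.1/s)"
)
stats = parse_device_line(sample)
print(stats["w-q"])  # 0
```

A w-q value approaching the number of swb that fit in the cache (512 with the defaults discussed above) would suggest the device is close to returning overload errors.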


Device overload when map size is too big