We have an Aerospike cluster of 8 nodes (v 18.104.22.168). Some of its key stats are as follows:
- Total Number of Records - 1.2B
- Total Disk Usage - 740GB
- Total Disk Capacity - 2TB
I’m trying to back up all the data using the asbackup tool, run from a separate backup node. Once the backup starts, the disks on the Aerospike server nodes are saturated: iostat shows 100% util, with a read throughput of ~200 MB/s on each node. Since it’s a complete backup, I would expect each Aerospike node to send a similar amount of data to the backup node. However, each node is sending only ~6 MB/s over the network, so the total backup throughput is only around 50 MB/s. Is such a large gap between disk read throughput and network throughput expected? If so, how can it be explained?
I see similar behaviour during scan jobs. Over the course of a scan, each Aerospike node ends up reading far more data from disk (read throughput x total scan time) than its disk usage, or even its total disk size.
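
To put numbers on the gap, here is a rough back-of-the-envelope sketch in Python using only the figures above. It assumes the rates are sustained for the whole backup, that 1 GB = 1000 MB, and that roughly the full 740GB of disk usage has to cross the network (the real backup size may differ, for example because of replication or the backup file format). The function name `backup_estimate` is just something I made up for illustration.

```python
# Figures quoted in the post; everything else is an assumption for illustration.
NODES = 8
DISK_READ_MB_S_PER_NODE = 200   # iostat read throughput per node
NET_SEND_MB_S_PER_NODE = 6      # network send rate per node
DISK_USAGE_GB = 740             # total data on disk across the cluster
DISK_CAPACITY_GB = 2000         # total disk capacity (2TB)


def backup_estimate(nodes, disk_mb_s, net_mb_s, usage_gb):
    """Return (amplification, duration_s, total_disk_read_gb).

    Assumes sustained rates, 1 GB = 1000 MB, and that about `usage_gb`
    of data is sent to the backup node in total.
    """
    # How many bytes are read from disk per byte sent over the network.
    amplification = disk_mb_s / net_mb_s
    # Aggregate network rate towards the backup node.
    cluster_net_mb_s = nodes * net_mb_s
    # Time needed to ship `usage_gb` at that aggregate rate.
    duration_s = usage_gb * 1000 / cluster_net_mb_s
    # Read throughput x total time, summed over all nodes.
    total_disk_read_gb = nodes * disk_mb_s * duration_s / 1000
    return amplification, duration_s, total_disk_read_gb


amplification, duration_s, total_disk_read_gb = backup_estimate(
    NODES, DISK_READ_MB_S_PER_NODE, NET_SEND_MB_S_PER_NODE, DISK_USAGE_GB
)
print(f"disk/network ratio: {amplification:.1f}x")
print(f"estimated duration: {duration_s / 3600:.1f} h")
print(f"estimated disk reads: {total_disk_read_gb / 1000:.1f} TB "
      f"(capacity: {DISK_CAPACITY_GB / 1000:.1f} TB)")
```

Under these assumptions the nodes read about 33 bytes from disk for every byte they send, and the cluster would read many times its total disk capacity before the backup finishes. That is the discrepancy I’m asking about.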