Horizontal scaling (v3.16.0.6)


#1

We are using aerospike v3.16.0.6 in our production system. We have a cluster of 2 aerospike nodes with replication factor 2. If my understanding is right, it should be possible to horizontally scale disk space on aerospike by adding more nodes while keeping the replication factor the same (2). Would addition of a 3rd node mean the data held by each node is now 2/3? Would addition of a 4th node make it 2/4?

Also, what are the implications for memory usage? Would that reduce by the same factor? I am trying to understand if horizontal scaling applies to both disk and memory size.

Many thanks in advance. Regards. Deepak.


#2

Hi @deepakp. There are 4096 partitions for each namespace. If you have a replication factor of 2 and 2 nodes in the cluster, each of those nodes has 2K master partitions and 2K replica partitions. You’ll never have the master and replica of the same partition ID on the same node. If you scale to 4 nodes, each node will have 1K master partitions and 1K replica partitions. For more see the architecture overview page on data distribution.
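The arithmetic above can be sketched as follows. This is only back-of-the-envelope math for an evenly balanced cluster, not the actual Aerospike partition-map algorithm:

```python
# 4096 partitions per namespace; with replication factor RF, each
# partition exists as 1 master copy and RF-1 replica copies, spread
# evenly across the nodes (master and replica never on the same node).
PARTITIONS = 4096

def partitions_per_node(nodes, replication_factor=2):
    masters = PARTITIONS // nodes
    replicas = PARTITIONS * (replication_factor - 1) // nodes
    return masters, replicas

print(partitions_per_node(2))  # (2048, 2048) -> 2K master + 2K replica
print(partitions_per_node(4))  # (1024, 1024) -> 1K master + 1K replica
```

So going from 2 to 4 nodes halves the partition count (and therefore the data share) held by each node, while every partition still has 2 full copies in the cluster.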

Aerospike scales well both horizontally, by adding more nodes, and vertically by adding more resources per-node such as drives. See the following FAQ:

Each node indexes only the records in the partitions assigned to it. Having double the nodes will reduce the memory usage of the primary index on each node to half what it was before. You can also scale vertically and double the DRAM per-node. See the Capacity Planning Guide.
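As a rough sizing sketch, assuming the 64 bytes of primary-index DRAM that Aerospike uses per record copy (the record count and sizes below are made-up numbers for illustration):

```python
# Per-node primary index DRAM = records * RF * 64 bytes / node count.
# Doubling the nodes halves the per-node index footprint.
INDEX_BYTES_PER_RECORD = 64

def index_dram_per_node_gib(records, replication_factor, nodes):
    total = records * replication_factor * INDEX_BYTES_PER_RECORD
    return total / nodes / 2**30

# Example: 1 billion records at replication factor 2.
print(round(index_dram_per_node_gib(1_000_000_000, 2, 2), 1))  # 59.6 GiB/node
print(round(index_dram_per_node_gib(1_000_000_000, 2, 4), 1))  # 29.8 GiB/node
```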


#3

Excellent. Thank you.


#4

No problem. You can scale your writes by adding drives to the node (vertically), or by adding nodes with the same number of drives (horizontally). There’s also the impact of partitioning.

To be super clear, there’s a write queue per-device, and in Linux a device is a partition, not a drive. To mitigate temporary peaks the streaming write buffers go into the write queue, whose size is controlled by max-write-cache. Partitioning the drive and assigning those devices gives you more of that cache if you wish, and also more write threads associated with the extra write queues. This doesn’t change the characteristics of the drive - partitioning doesn’t add IOPS, but it can have an impact on smoothing out write spikes.
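As a sketch, splitting one physical drive into two partitions and listing each as its own device looks something like this in aerospike.conf (device paths and sizes here are hypothetical placeholders; max-write-cache is per device):

```
namespace test {
    replication-factor 2
    memory-size 8G

    storage-engine device {
        # Two partitions of the same physical drive - each gets its
        # own write queue and write thread.
        device /dev/nvme0n1p1
        device /dev/nvme0n1p2
        write-block-size 128K
        max-write-cache 64M   # write-queue cap, per device
    }
}
```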

You don’t really want to see the write-q value increasing in your logs - it’s a sign the drives aren’t keeping up sometimes, so a consideration for capacity planning and scaling. Each device has its own line:

{namespace} /dev/sda: used-bytes 296160983424 free-wblocks 885103 write-q 0
write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1)
defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)
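If you want to watch for that, a minimal sketch of pulling write-q out of such a log line (the parsing helper below is hypothetical, not part of any Aerospike tooling):

```python
import re

# Example storage line, as logged per device (abbreviated).
LINE = ("{namespace} /dev/sda: used-bytes 296160983424 "
        "free-wblocks 885103 write-q 0 write (12659541,43.3) "
        "shadow-write-q 0")

def write_q(line):
    # Negative lookbehind so we don't match "shadow-write-q".
    m = re.search(r"(?<!shadow-)write-q (\d+)", line)
    return int(m.group(1)) if m else None

print(write_q(LINE))  # 0 - anything persistently above 0 means the
                      # device is falling behind the write load
```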

How to configure number of threads writing to Disk