The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.
Synopsis
Customer often ask the process behind a client connecting to a particular node.
Resolution
Data Distribution via a Hash (Aerospike Smart Partitions)
In order to understand how a client knows about the nodes of a cluster we need to understand how data and traffic are distributed.
Aerospike uses a very random hash (the RIPEMD 160) to make sure that both data volume and traffic are evenly distributed. This occurs automatically and without the need for manual intervention.
To determine where a record should go, the record key (of any size) is hashed into a 20-byte fixed length string using RIPEMD160, and the first 12 bits form a partition ID which determines which of the partitions should contain this record. The partitions are distributed equally among the nodes in the cluster, so if there are N nodes in the cluster, each node stores approximately 1/N of the data.
“Partitions” are buckets of records that have been grouped together for the purpose of distribution.
In order to evenly divide data between different nodes, all data is mapped to one of 4096 partitions, based on the hash (digest) value. In turn each of these partitions is mapped to the different nodes. If the number of nodes change, the partitions will get remapped and transferred to the appropriate location in a process called “migration.”
Partition Map Every record in an Aerospike cluster will be mapped to one of 4096 partitions. The basis for this mapping is the hash of the key. 4 bytes of the hash are used to determine to which partition any given key belongs. These partitions are then divided among the nodes in the cluster in a partition map.
Because data is distributed evenly (and randomly) across nodes, there are no hot-spots or bottlenecks where one node handles significantly more requests than another node. To determine where a record should go, the record key (of any size) is hashed into a 20-byte fixed length string using RIPEMD160, and the first 12 bits form a partition ID which determines which of the partitions should contain this record. The partitions are distributed equally among the nodes in the cluster, so if there are N nodes in the cluster, each node stores approximately 1/N of the data.
There is no need for manual sharding. The nodes in the cluster coordinate among themselves to divide the partitions. The client detects cluster changes and sends requests to the correct node. When nodes are added or removed, the cluster automatically re-balances. All of the nodes in the cluster are peers – there is no single database master node that can fail and take the whole database down.
When the database creates a record, a hash of the record key is used to assign the record to a partition. Hashing is deterministic – that is, the hashing process always maps a given record to the same partition. Data records stay in the same partition for their entire life. Partitions may move from one server to another, but partitions would not normally split or reassign a record to another partition.
Info Protocol
Aerospike Client APIs will track cluster state changes. (addition, removal of nodes) At any given instant, the client uses the info protocol to communicate periodically with the cluster and maintain a list of nodes that form the cluster. It also uses the Aerospike Smart Partitions™ algorithm to determine which node stores a particular partition of data. Any changes to the cluster size are tracked automatically by the Client, and such changes are entirely transparent to the Application. In practice, this means that transactions will not fail during the transition, and the Application does not need to be restarted during node arrival and departure.