How does the Aerospike client find a node


Synopsis

Customers often ask about the process by which a client connects to a particular node.

Resolution

Data Distribution via a Hash (Aerospike Smart Partitions)

In order to understand how a client knows about the nodes of a cluster, we need to understand how data and traffic are distributed.

Aerospike uses a hash with very uniform output (RIPEMD-160) to make sure that both data volume and traffic are evenly distributed. This happens automatically, without any need for manual intervention.

To determine where a record should go, the record key (of any size) is hashed into a 20-byte fixed-length string using RIPEMD-160, and the first 12 bits of the digest form a partition ID, which determines which partition should contain the record. The partitions are distributed evenly among the nodes in the cluster, so if there are N nodes in the cluster, each node stores approximately 1/N of the data.

“Partitions” are buckets of records that have been grouped together for the purpose of distribution.

In order to evenly divide data between different nodes, all data is mapped to one of 4096 partitions, based on the hash (digest) value. In turn, each of these partitions is mapped to one of the nodes. If the number of nodes changes, the partitions are remapped and transferred to the appropriate locations in a process called “migration.”
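To make this two-level mapping concrete, here is a minimal, self-contained sketch in Java. The round-robin assignment below is purely illustrative: Aerospike derives the real assignment from hashes of the node IDs, not round-robin, but the effect of spreading 4096 partitions across N nodes, and of remapping when N changes, is the same.

```java
import java.util.List;

public class PartitionMapSketch {
    static final int N_PARTITIONS = 4096;

    // Simplified stand-in for the partition map: partition ID -> node name.
    // NOTE: round-robin is used here only for illustration; Aerospike
    // derives the real assignment from hashes of the node IDs.
    static String nodeForPartition(int partitionId, List<String> nodes) {
        return nodes.get(partitionId % nodes.size());
    }

    public static void main(String[] args) {
        List<String> threeNodes = List.of("node-A", "node-B", "node-C");
        List<String> fourNodes  = List.of("node-A", "node-B", "node-C", "node-D");

        int partitionId = 2742; // any value in [0, 4095]

        // When the node count changes, some partitions map to new owners;
        // those partitions are transferred during "migration".
        System.out.println(nodeForPartition(partitionId, threeNodes)); // node-A
        System.out.println(nodeForPartition(partitionId, fourNodes));  // node-C
    }
}
```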

Partition Map

Every record in an Aerospike cluster is mapped to one of 4096 partitions. The basis for this mapping is the hash (digest) of the key: the first 12 bits of the digest determine to which partition any given key belongs. These partitions are then divided among the nodes in the cluster in a partition map.

Because data is distributed evenly (and randomly) across nodes, there are no hot spots or bottlenecks where one node handles significantly more requests than another node.

There is no need for manual sharding. The nodes in the cluster coordinate among themselves to divide the partitions. The client detects cluster changes and sends requests to the correct node. When nodes are added or removed, the cluster automatically re-balances. All of the nodes in the cluster are peers – there is no single database master node that can fail and take the whole database down.

When the database creates a record, a hash of the record key is used to assign the record to a partition. Hashing is deterministic – that is, the hashing process always maps a given record to the same partition. Records stay in the same partition for their entire life. Partitions may move from one server to another, but a partition does not normally split, and a record is not reassigned to another partition.
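A quick way to observe this determinism with the Java client (test, demo, and user-1 below are placeholder names; the digest is computed from the set name and user key):

```java
import java.util.Arrays;
import com.aerospike.client.Key;

public class DeterministicHash {
    public static void main(String[] args) {
        // The same set name and user key always yield the same 20-byte
        // digest, and therefore the same partition, for life.
        Key first  = new Key("test", "demo", "user-1");
        Key second = new Key("test", "demo", "user-1");
        System.out.println(Arrays.equals(first.digest, second.digest)); // true
    }
}
```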

Info Protocol

Aerospike client APIs track cluster state changes (the addition and removal of nodes). The client uses the info protocol to communicate periodically with the cluster and maintain a list of the nodes that form it. It also uses the Aerospike Smart Partitions™ algorithm to determine which node stores a particular partition of data. Any change to the cluster size is tracked automatically by the client, and such changes are entirely transparent to the application. In practice, this means that transactions will not fail during the transition, and the application does not need to be restarted when nodes arrive or depart.
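As a minimal sketch of what this looks like from the application side, assuming the Java client and a cluster reachable at 127.0.0.1:3000 (the seed address is a placeholder):

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Info;
import com.aerospike.client.cluster.Node;

public class ClusterWatch {
    public static void main(String[] args) {
        // Connecting to one seed node is enough: the client discovers the
        // rest of the cluster via the info protocol and keeps the node
        // list current through periodic cluster tending.
        try (AerospikeClient client = new AerospikeClient("127.0.0.1", 3000)) {
            for (Node node : client.getNodes()) {
                // "partition-generation" increments whenever this node's view
                // of the partition map changes (e.g. a node joins or leaves).
                String gen = Info.request(node, "partition-generation");
                System.out.println(node.getName() + " partition-generation=" + gen);
            }
        }
    }
}
```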


As we know, the first 12 bits form the partition ID. So I wonder why the article said that “4 bytes of the hash are used to determine to which partition any given key belongs.”

Yes, @Kai_Guo is correct. Thanks for letting us know about this error; we will see that it is corrected.


In fact, we can say there are two mappings between a record and a node. The first is from record key to partition ID via a very uniform hash (RIPEMD-160), while the second, from partitions to nodes, is the one that puzzles me.

In the Aerospike documentation (Data Distribution), it says:

Data is distributed evenly across nodes in a cluster using the Aerospike Smart Partitions™ algorithm.

In my view, Smart Partitions seems to work like this: if there are 3 nodes in the cluster, then maybe node 1 has partitions 0, 3, 6, …, node 2 has partitions 1, 4, 7, …, and node 3 has partitions 2, 5, 8, …. In this way, partitions are distributed relatively evenly.

Kai,

Your understanding is correct. Part of the record key hash (RIPEMD-160) determines the partition ID where the record should be written/read. The partition map is then looked up to determine which node is master and which is replica for that partition ID, so that the read/write operation can be performed on the appropriate node and partition.

The record key hash is calculated for every read/write operation. The partition map can change with every cluster state change (a node being added to or removed from the cluster), and hence the partition map is consulted to find out which node a given partition ID belongs to.

-samir
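To make the two-step lookup above concrete, here is a conceptual sketch in plain Java. The node names and map contents are hypothetical; the real client builds and refreshes this table from the info protocol rather than hard-coding it.

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionMapLookup {
    public static void main(String[] args) {
        // Hypothetical fragment of a partition map: partition ID -> master node.
        // The real client rebuilds this table on every cluster state change.
        Map<Integer, String> partitionToMaster = new HashMap<>();
        partitionToMaster.put(0, "node-A");
        partitionToMaster.put(1, "node-B");
        partitionToMaster.put(2, "node-C");

        int partitionId = 1; // step 1: derived from the record key's digest
        String master = partitionToMaster.get(partitionId); // step 2: map lookup
        System.out.println("reads/writes for partition " + partitionId
                + " go to " + master);
    }
}
```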

How can I find out which nodes have a given key-value and which do not?

For example, for a three-node cluster with replication factor two, how can I find which two nodes are hosting the key I have just inserted? Thanks, -Vikrant

You can get the partitionId from the record’s digest.
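For example, with the Java client (a minimal sketch: test, demo, and myRecordKey are placeholder names, and the arithmetic reproduces the 12-bit partition calculation discussed earlier in this thread):

```java
import com.aerospike.client.Key;

public class DigestToPartition {
    static final int N_PARTITIONS = 4096;

    public static void main(String[] args) {
        // Constructing a Key computes its 20-byte RIPEMD-160 digest.
        Key key = new Key("test", "demo", "myRecordKey");
        byte[] digest = key.digest;

        // The partition ID is the little-endian value of the first two
        // digest bytes, reduced modulo 4096 (i.e. the low 12 bits).
        int partitionId = ((digest[0] & 0xFF) | ((digest[1] & 0xFF) << 8)) % N_PARTITIONS;
        System.out.println("partitionId = " + partitionId);
    }
}
```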

And you can then find the node from the partition map.

Why do you need to figure out which nodes have the copies of the key? The Aerospike client does that for you and goes to the correct node in a single hop.

Since this may become a separate thread, can you re-ask your question under a different category?