How to remove rackaware configuration when upgrading to the new cluster protocol (3.13)

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Context

You may want to remove the rackaware configuration when upgrading to the new cluster protocol introduced in version 3.13, if it is no longer required. Be aware, though, that doing so changes the node-ids and therefore reorders the succession list. As a result, additional migrations will be needed once the change is complete.

Method

  1. remove the cluster {} stanza and ensure that no rack-id is configured for any namespace in the configuration files
  2. dynamically reassign all nodes to rack-id 0, thereby turning off rack-awareness. This changes the configuration dynamically, but the change does not take effect until the cluster rebalances:
$ asadm> asinfo -v "set-config:context=namespace;id=XXXX;rack-id=0"
  3. trigger the cluster to apply the new configuration (turning off rack-awareness) by triggering a rebalance. To do this, choose a node and restart it. This triggers a rebalance and moves all nodes to rack-id 0. If running version 3.14.1.1 or above, use the recluster command instead:
$ asadm> asinfo -v 'recluster:'
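
As a minimal sketch of steps 2 and 3, the helper below only builds the info command strings shown above, one `set-config` per namespace followed by `recluster:`. The function name and the namespace names `test` and `bar` are hypothetical placeholders, and the sketch assumes a server version (3.14.1.1 or above) where `recluster:` is available. It does not contact the cluster; the printed commands would still need to be run against every node.

```python
def rack_reset_commands(namespaces, use_recluster=True):
    """Build the info commands from steps 2 and 3, in the order they are run.

    namespaces: namespace names taken from your own configuration
                (the names used below are placeholders).
    use_recluster: append 'recluster:' (server 3.14.1.1 or above); when False,
                   a node restart is assumed to trigger the rebalance instead.
    """
    commands = [
        f"set-config:context=namespace;id={ns};rack-id=0" for ns in namespaces
    ]
    if use_recluster:
        commands.append("recluster:")
    return commands


# Print the commands in the form used in the steps above.
for cmd in rack_reset_commands(["test", "bar"]):
    print(f'asinfo -v "{cmd}"')
```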

Notes

As of version 4.0, the rack-aware feature is available only in Aerospike Server Enterprise Edition.

This procedure causes a large number of migrations each time a node that has not yet been restarted is restarted, until all nodes have been restarted. The first time a node that was previously configured as rackaware is restarted, its node-id changes. The node is then inserted at a different place in the succession list, which causes migrations.

If downtime is acceptable, an alternative is to restart all nodes at once as step 3 (assuming that you are not using a RAM-only storage engine), which completes the change.

A slightly more complicated, and potentially dangerous, approach is to adjust the network interface MAC address so that the node-id stays the same. In a non-rackaware configuration, the node-id is built from the network card's MAC address and the network port.

For example, for a MAC of ff:ee:dd:cc:bb:aa and fabric port 3001, the node-id would be bb9aabbccddeeff (bb9 is the hex representation of 3001, followed by the bytes of the MAC address in reverse order).
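
The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch of the layout described above (port in hex, then the MAC bytes reversed); the function name is hypothetical and the code is not taken from the server.

```python
def node_id_from_mac(mac: str, port: int) -> str:
    """Compose a node-id as described above: hex(port) + reversed MAC bytes."""
    octets = mac.lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"expected a MAC like ff:ee:dd:cc:bb:aa, got {mac!r}")
    for o in octets:
        int(o, 16)  # raises ValueError if an octet is not valid hex
    return format(port, "x") + "".join(reversed(octets))


print(node_id_from_mac("ff:ee:dd:cc:bb:aa", 3001))  # bb9aabbccddeeff
```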

With this in mind, you could check the current node-id of a node and adjust the network interface MAC address so that the node-id remains the same after rack-aware is disabled. Unfortunately, this means you would have to stop Aerospike, restart networking, and then start Aerospike again. Other issues with this approach include potential duplicate MAC addresses on the network and a dirty ARP cache. This method is therefore not advised.
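
To illustrate the reverse calculation, the sketch below splits a node-id back into the MAC address and port that would produce it, assuming the layout from the example (the last 12 hex digits are the reversed MAC bytes, and the leading digits are the port). The function name is hypothetical, and the sketch only does the arithmetic; it does not change any network settings.

```python
def mac_and_port_for_node_id(node_id: str):
    """Split a node-id into (mac, port), assuming hex(port) + reversed MAC bytes."""
    node_id = node_id.lower()
    if len(node_id) <= 12:
        raise ValueError(f"node-id too short to contain a port and a MAC: {node_id!r}")
    port = int(node_id[:-12], 16)   # leading hex digits are the port
    mac_hex = node_id[-12:]         # trailing 12 hex digits are the MAC bytes
    int(mac_hex, 16)                # raises ValueError if not valid hex
    octets = [mac_hex[i:i + 2] for i in range(0, 12, 2)]
    return ":".join(reversed(octets)), port


print(mac_and_port_for_node_id("bb9aabbccddeeff"))  # ('ff:ee:dd:cc:bb:aa', 3001)
```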

Keywords

RACKAWARE REMOVE

Timestamp

1/18/18