How to split accidentally merged clusters
An operational accident that causes two separate clusters to merge can, in most cases, be reversed with relative ease.
If cluster-name is configured on the clusters and the clients specify the name of the cluster they connect to, this problem will not happen.
This procedure works directly only if the namespaces defined in the two clusters have unique names. If the namespaces share the same name across clusters, the data will merge automatically through migrations. At that stage, splitting the cluster means that you have to either recover the data from a backup or retain the merged data in both clusters. For this scenario (where namespace names were the same across clusters), the following extra steps retain a copy of all data in both clusters after the split:
- Plan capacity, as both clusters will now hold more data.
- Configure rack-id so that nodes from cluster 1 are in rack 1 and nodes from cluster 2 are in rack 2. This ensures that all the data from the merged namespace is available on both clusters once the split happens. Make the change in the aerospike.conf files first, then apply it dynamically, so that a node that restarts ends up in the correct rack.
- Wait for migrations to complete.
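The rack-id change and the migration check in the steps above can be scripted. The following is a minimal Python sketch, not an official tool: it only builds command strings and parses a statistics string, and it never contacts a node. It assumes the namespace-level `set-config` form with `rack-id`, a follow-up `recluster:` command, and the `migrate_partitions_remaining` statistic. The namespace name `test` and the IP addresses are hypothetical placeholders.

```python
def rack_id_commands(namespace, rack_by_node):
    """Return one asinfo command per node that sets its rack-id dynamically."""
    commands = []
    for node_ip, rack_id in sorted(rack_by_node.items()):
        commands.append(
            f'asinfo -h {node_ip} -v '
            f'"set-config:context=namespace;id={namespace};rack-id={rack_id}"'
        )
    return commands


def migrations_complete(statistics):
    """Parse a 'key=value;key=value' statistics string.

    Returns True only when migrate_partitions_remaining is present and zero.
    A missing statistic is treated as "not complete".
    """
    stats = dict(
        item.split("=", 1) for item in statistics.split(";") if "=" in item
    )
    return int(stats.get("migrate_partitions_remaining", "-1")) == 0


# Hypothetical node addresses for the two original clusters.
cluster1 = ["10.0.0.1", "10.0.0.2"]
cluster2 = ["10.0.1.1", "10.0.1.2"]

# Nodes from cluster 1 go to rack 1, nodes from cluster 2 go to rack 2.
rack_by_node = {**{ip: 1 for ip in cluster1}, **{ip: 2 for ip in cluster2}}

for command in rack_id_commands("test", rack_by_node):
    print(command)

# Assumption: a recluster is issued afterwards so the new rack-ids take effect.
print('asinfo -v "recluster:"')
```

The printed commands are meant to be reviewed before being run by hand. `migrations_complete` can be fed the output of a statistics query on each node to decide when it is safe to continue.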
If you configured rack-id this way, then once you have split the clusters using the steps below, you may want to set rack-id back to 0 everywhere, both in the configuration files and dynamically. Keeping it configured will not cause any issues, however, as long as any nodes added in the future also have rack-id configured.
Cluster splitting in all cases
In all cases, follow this process to split the cluster:
- Identify IPs of both clusters
- Connect to asadm on any node that belongs to CLUSTER 1
- Execute the following to set the cluster name (replacing IP1,… with the list of IPs for CLUSTER 1):
Admin> asinfo -v "set-config:context=service;cluster-name=CLUSTER1" WITH IP1,IP2,IP3
- Exit asadm and execute the following to check if clusters have split:
asadm -e "info cluster"
- The output of the step above should list only the nodes of CLUSTER 1. If it does not, wait and repeat step 4 to check again. The cluster split can take a few seconds.
- At this stage, each client will be connected to one side or the other. It may be necessary to restart all the clients to force them to connect to the relevant clusters.
- Set the cluster-name configuration parameter to CLUSTER1 in the aerospike.conf file on all nodes of CLUSTER 1.
- Although not necessary, it is good practice to set the cluster name on CLUSTER 2 as well. Feel free to perform the steps above, up to step 7, for CLUSTER 2.
- When you next deploy your client code, note that you can set the cluster name within the client policy. This ensures that the client only ever connects to the intended cluster, by confirming its name.
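The bookkeeping in the steps above can be sketched in Python. This is an illustrative sketch under assumptions, not an official tool: the helper names are invented for this example, the IP addresses are hypothetical, port 3000 is assumed as the service port, and `cluster_name` is assumed to be the Aerospike Python client's configuration key for the expected cluster name. The script only builds the asadm command and a client configuration; it does not contact any node.

```python
def set_cluster_name_command(cluster_name, node_ips):
    """Build the asadm command from the procedure for the given nodes."""
    return (
        f'asinfo -v "set-config:context=service;cluster-name={cluster_name}" '
        f'WITH {",".join(node_ips)}'
    )


def split_complete(visible_nodes, expected_nodes):
    """True when at least one node is visible and none is outside the expected set."""
    visible = set(visible_nodes)
    return bool(visible) and visible <= set(expected_nodes)


def client_config(cluster_name, node_ips, port=3000):
    """Build a client configuration that names the cluster the client expects."""
    return {
        "hosts": [(ip, port) for ip in node_ips],
        # Assumption: the client refuses nodes that report a different cluster name.
        "cluster_name": cluster_name,
    }


# Hypothetical addresses of the nodes that belong to CLUSTER 1.
cluster1_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

print(set_cluster_name_command("CLUSTER1", cluster1_ips))
print(client_config("CLUSTER1", cluster1_ips))
```

With the Python client, the resulting dictionary would be passed to `aerospike.client`. Other clients expose an equivalent setting on their client policy.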