Adding multiple nodes to a running cluster simultaneously


#1

Hello, I am in the process of upgrading a 6-node cluster. The plan is to add 6 new nodes, then remove the original 6. In the interest of speeding things up, is it possible, advisable, and safe to add more than one node at a time? Let’s say:

  1. Nodes 1-6 are the original nodes, and are up and running, no migrations at present
  2. I install and start Aerospike on node 7, and migrations start
  3. Before migrations have completed for node 7, I install and start Aerospike on node 8, then 9, etc.

Is this safe, or should I wait for migrations to complete on one node, before proceeding to the next?

I know that to prevent data loss I need to do this one at a time on the way down, when removing nodes.


#2

There isn’t an issue with adding multiple nodes simultaneously. If you have found documentation stating otherwise, please let us know where you found it so that we can correct it.

Removing multiple nodes simultaneously will result in data loss. When removing the old nodes, you will want to do it one node at a time.
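
The one-at-a-time removal can be scripted as a gate: take the next node out only when every remaining node reports no outstanding migrations. Below is a minimal Python sketch, not an official tool. It assumes each node's statistics arrive as a semicolon-separated `name=value` string (for example, the output of `asinfo -v statistics`) and that the relevant counter is named `migrate_partitions_remaining`; check the statistic name for your server version. The helper names are hypothetical.

```python
def parse_stats(raw):
    """Turn 'a=1;b=2' into {'a': '1', 'b': '2'}."""
    stats = {}
    for pair in raw.strip().split(";"):
        if "=" in pair:
            name, value = pair.split("=", 1)
            stats[name.strip()] = value.strip()
    return stats


def migrations_done(raw_stats_by_node, stat="migrate_partitions_remaining"):
    """Return True only if every node reports zero remaining migrations.

    The statistic name is an assumption (see the text above). A node that
    does not report it, or reports a non-numeric value, is treated as
    not done, so a missing value never green-lights a removal.
    """
    if not raw_stats_by_node:
        return False
    for raw in raw_stats_by_node.values():
        value = parse_stats(raw).get(stat)
        if value is None or not value.isdigit() or int(value) != 0:
            return False
    return True
```

How the raw strings are collected from each node is left out on purpose; the point is only that the check must pass on all nodes before the next one is stopped.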


#3

Thank you! No, I have not seen documentation stating that adding multiple nodes at once is not recommended; I just wanted to make sure (overly cautious, I guess, since I am doing this on a live production cluster). Thank you for confirming. I will keep this in mind for the next time I need to do it.


#4

You might want to clarify the statement below from the documentation, since it contradicts the answer above. You could mention that before version 3.13 it is safer to add nodes one by one, or something like that.

As a recommendation – do not add multiple nodes simultaneously to avoid corner cases where the new nodes form a cluster on their own before joining the main cluster, adding more partition versions that would cause the subsequent duplicate resolution to be heavier. We recommend, as best practice, to wait for a new node to successfully join the cluster before adding the next one. https://www.aerospike.com/docs/operations/manage/cluster_mng/adding_node
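
One way to act on that recommendation is to confirm, before starting the next node, that every node (old and new) reports the same cluster size and that it matches the number of nodes started so far. This is a minimal Python sketch under assumptions: the per-node sizes are taken to come from each node's `cluster_size` statistic, and the function name is hypothetical.

```python
def joined_main_cluster(reported_sizes, expected_size):
    """Check that all nodes agree on one cluster of the expected size.

    reported_sizes maps node name -> the cluster size that node reports.
    Returns True only when exactly expected_size nodes have reported and
    each of them sees a cluster of expected_size. New nodes that formed
    a cluster of their own would report a smaller size and fail the check.
    """
    if len(reported_sizes) != expected_size:
        return False
    return all(size == expected_size for size in reported_sizes.values())
```

For example, after starting node 7 on a 6-node cluster, the check passes only once all seven nodes report a size of 7.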


#5

Hm, the question in this discussion was whether or not it is safe to add multiple nodes simultaneously. I took ‘safe’ to mean without data loss, since that was also being mentioned, and in that sense it is ‘safe’.

Yes, there are some potential performance issues that could occur if the newly added nodes were to form a sub-cluster before joining the full cluster. This performance concern doesn’t affect safety. We have made efforts to make these events rare, but they will always be possible in an imperfect world.

Though, as you mention, before the new clustering protocol in 3.13 there was a period of time during which you would be running with a lower than configured replication-factor, and losing a node during this time could result in data loss. So prior to the protocol change in 3.13, it is less safe to add multiple nodes simultaneously for replication-factors > 2. Before the protocol change, there isn’t a way to make adding a single empty node safer for replication-factor 2.