Removal of paxos-single-replica-limit and fail_on_cluster_change


I revisited the latest version of Aerospike and have the following questions about it. Can anyone please clarify them?

  • The “paxos-single-replica-limit” configuration parameter seems to have been removed in server version 6.0. I would like to understand the reason behind this, and the behavior when the replication factor is 2 and the cluster is down to 1 node.
  • The “fail_on_cluster_change” flag has also been removed from the scan policy in the latest client. This flag used to let clients know that the cluster was unstable and that inconsistent data might be returned. Now that the flag is removed, how would clients know whether the data returned is consistent? Or is it fair to assume that the data will be consistent (even in AP mode)?

Good observations and good questions.

  • paxos-single-replica-limit: this was not a great configuration parameter to start with. It had to be set on all nodes at the same time and was not dynamic, so changing it meant stopping and restarting the whole cluster, which obviously got in the way of any scaling activity that required adjusting it. It also only supported dropping straight to a single replica copy, which could be too drastic for users running with 3 or more copies. Having said that, you can leverage the migrate-fill-delay configuration to prevent filling up other nodes (Enterprise edition only) and change the replication-factor dynamically if necessary (AP mode only).

  • fail_on_cluster_change: yes, you guessed it right. As of version 4.7 for scans (PI queries) and 6.0 for sindex queries, the client keeps track of which partitions have been processed and can retry or resume the missing parts after a cluster change. I wouldn’t equate this with ‘consistent’ in the SC vs. AP sense, though: a query, by nature, takes some time to process, and records can change while it is running, right before or after it passes over the partition they belong to.
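
To make the partition-tracking idea concrete, here is a toy sketch in Python. It is not the Aerospike client's code: `PartitionTracker`, `ClusterChanged`, `scan_with_resume` and the `read_partition` callback are hypothetical names invented for this illustration, and it tracks progress only per partition, which is a simplification. The only Aerospike fact it relies on is that a namespace is divided into 4096 partitions.

```python
from dataclasses import dataclass, field

N_PARTITIONS = 4096  # an Aerospike namespace is divided into 4096 partitions


class ClusterChanged(Exception):
    """Hypothetical error: a partition could not be read because the cluster changed."""


@dataclass
class PartitionTracker:
    """Remembers which partitions have already been fully processed."""
    total: int
    done: set = field(default_factory=set)

    def pending(self):
        # Partitions that still need to be read, in ascending order.
        return [p for p in range(self.total) if p not in self.done]

    def mark_done(self, pid):
        self.done.add(pid)

    def finished(self):
        return len(self.done) == self.total


def scan_with_resume(read_partition, total=N_PARTITIONS, max_rounds=5):
    """Read every partition once, retrying only the ones that failed.

    read_partition(pid) is a caller-supplied function (an assumption of this
    sketch) that returns a list of records or raises ClusterChanged.
    """
    tracker = PartitionTracker(total)
    records = []
    for _ in range(max_rounds):
        for pid in tracker.pending():
            try:
                batch = read_partition(pid)
            except ClusterChanged:
                continue  # leave the partition pending; retry it next round
            records.extend(batch)
            tracker.mark_done(pid)
        if tracker.finished():
            return records
    raise RuntimeError(f"{len(tracker.pending())} partitions still pending")
```

Partitions that were read successfully are never read again, so a cluster change costs a retry of the missing parts rather than a failed scan. As noted above, this gives you complete coverage of the partitions, not a point-in-time snapshot of the data.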

Thanks, Meher, for your reply. It answers my questions.