Namespace not cleaned up properly after migration

Igor_Kochergin · January 18, 2017, 10:35am

Hi. The problem occurred after we added a new, clean node to the cluster.

When replication finished, we got different replica and master object counts. The namespace is not balanced, and the number of master objects continues to grow.

We had the same problem on earlier releases, starting from 3.6.1.

Our earlier solution was to run ‘cluster dun all’ and ‘cluster undun all’: a new migration process started, and the problem disappeared after the migration ended.

But in the latest version we can't run these commands.

https://discuss.aerospike.com/t/asadm-cluster-dun-all-invalid-command-or-could-not-connect-to-node/3775

aerospike-server-community 3.11.0.2-1
aerospike-tools 3.11.0

TimF · January 19, 2017, 10:05pm

Dun was removed in 3.9.1 because it was no longer necessary due to the enhanced paxos algorithm. The cluster should auto-heal and auto-rebalance.

A few questions:

Are all your nodes on the same version (3.11.0.2)?
What is your paxos-recovery-policy set to? It should be auto-reset-master, which is the default in the version you’re using.
If you run asadm -e info, do all nodes agree on the size of the cluster, and do they all show cluster visibility as true?
Do you have any errors in your logs, particularly network errors?
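
To make the cluster-size check above concrete, here is a minimal Python sketch. It assumes you have copied each node's reported cluster size and cluster visibility out of the asadm -e info output by hand; the node names, the values, and the function name find_disagreeing_nodes are hypothetical, and the script does not talk to Aerospike itself.

```python
from collections import Counter


def find_disagreeing_nodes(reports):
    """Return {node: [reasons]} for nodes whose view differs from the majority.

    `reports` maps a node name to (cluster_size, cluster_visibility),
    as read manually from the `asadm -e info` output.
    """
    if not reports:
        return {}
    # Treat the most commonly reported cluster size as the reference value.
    sizes = Counter(size for size, _ in reports.values())
    majority_size, _ = sizes.most_common(1)[0]

    problems = {}
    for node, (size, visible) in reports.items():
        reasons = []
        if size != majority_size:
            reasons.append(f"cluster size {size}, majority reports {majority_size}")
        if not visible:
            reasons.append("cluster visibility is false")
        if reasons:
            problems[node] = reasons
    return problems


# Hypothetical values, for illustration only.
reports = {
    "node-a": (4, True),
    "node-b": (4, True),
    "node-c": (4, True),
    "node-d": (3, False),
}

for node, reasons in find_disagreeing_nodes(reports).items():
    print(node, "->", "; ".join(reasons))
```

Any node listed by this check would be the first place to look for network or configuration problems.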

I suspect you have communication problems between your nodes, either because of network issues or because Aerospike is misconfigured. Can you share some information about your deployment: (a) bare metal vs. cloud, (b) the number of NICs in the nodes, and (c) how those NICs are used?

I would also note that your stop-writes, high-water-mark-memory and high-water-mark-disk parameters are set oddly. These are typically 90%, 60% and 50% respectively; yours are 80%, 80% and 99%. Misconfiguring these has serious ramifications, so be aware of what they are before they bite you in production.
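
As a rough illustration of the comparison above, the following Python sketch flags thresholds that differ from the typical values. It assumes the namespace configuration is available as a semicolon-separated key=value string (the shape asinfo prints for get-config), and that the three parameters appear under the keys stop-writes-pct, high-water-memory-pct and high-water-disk-pct; check those key names against your own output. The sample string uses the values quoted in this thread plus a hypothetical replication-factor entry.

```python
# Typical values mentioned in this thread; the key names are an assumption.
TYPICAL = {
    "stop-writes-pct": 90,
    "high-water-memory-pct": 60,
    "high-water-disk-pct": 50,
}


def parse_config(raw):
    """Parse 'key=value;key=value' into a dict of strings."""
    items = (item.split("=", 1) for item in raw.strip().split(";") if "=" in item)
    return {key: value for key, value in items}


def unusual_thresholds(raw):
    """Return {key: (configured, typical)} for thresholds that differ."""
    config = parse_config(raw)
    found = {}
    for key, typical in TYPICAL.items():
        if key in config and int(config[key]) != typical:
            found[key] = (int(config[key]), typical)
    return found


# Values quoted in this thread; replication-factor is a made-up extra entry.
sample = (
    "stop-writes-pct=80;high-water-memory-pct=80;"
    "high-water-disk-pct=99;replication-factor=2"
)

for key, (configured, typical) in unusual_thresholds(sample).items():
    print(f"{key}: configured {configured}, typical {typical}")
```

A value that differs from the typical one is not necessarily wrong; the sketch only lists the settings worth a second look.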
