How to remove a node from a cluster


#1

Hi,

I have a cluster of 2 nodes. After a restart, one of the nodes (with ID BB96A380642A844) began logging the following errors:

Jul 02 2015 08:13:50 GMT: INFO (paxos): (paxos.c::2370) Cluster Integrity Check: Detected succession list discrepancy between node bb9f0f90642a844 and self bb96a380642a844
Jul 02 2015 08:13:50 GMT: INFO (paxos): (paxos.c::2415) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes: dun:nodes=bb9f0f90642a844

BB9F0F90642A844 is the ID of the 2nd node.
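
For anyone reading along, here is a minimal sketch of pulling the two node IDs out of a "succession list discrepancy" message with sed. The sample line is copied from the log excerpt in this post. The function name discrepancy_nodes is made up for illustration, and the pattern is an assumption based only on that one line, so it may need adjusting for other server versions.

```shell
# Sample line copied from the log excerpt in this post.
line='Jul 02 2015 08:13:50 GMT: INFO (paxos): (paxos.c::2370) Cluster Integrity Check: Detected succession list discrepancy between node bb9f0f90642a844 and self bb96a380642a844'

# Hypothetical helper: read log lines on stdin and print
# "<other-node-id> <self-node-id>" for each discrepancy message.
# Lines that do not match the pattern are skipped.
discrepancy_nodes() {
  sed -n 's/.*succession list discrepancy between node \([0-9a-fA-F]*\) and self \([0-9a-fA-F]*\).*/\1 \2/p'
}

printf '%s\n' "$line" | discrepancy_nodes
# prints: bb9f0f90642a844 bb96a380642a844
```

Piping the whole aerospike.log through the same function and then through sort -u would list every distinct pair of nodes that disagree.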

The node with ID BB9F0F90642A844 still shows the restarted node as if it were in the cluster. The asadm info command lists both nodes, but the Aerospike web console does not.

So the question is: how do I remove a node from the cluster manually? I found nothing in the documentation.

P.S. I ran “cluster dun all” on both nodes with no success after “cluster undun all”

Regards, Alexander


#2

Could you run:

If the cluster integrity fault is still happening, could you run:

asadm -e "asinfo -v 'dump-paxos:'"
asadm -e "asinfo -v 'dump-fabric:'"
asadm -e "asinfo -v 'dump-hb:'"

Then tar and gzip your aerospike.log and email the file to kevin at aerospike d.ot com.

This will give us insight into what is happening internally on the server to cause your situation. Could you also provide your aerospike.conf file and the version of Aerospike you are running?
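
The steps above could be wrapped into one small shell function, roughly as sketched below. It only uses the three asadm commands already listed, plus tar. The function name collect_diagnostics and its two arguments (the log file path and an output directory) are placeholders invented for this sketch, so pass whatever paths apply on your system.

```shell
# Hypothetical helper: run the three dump commands from this post, then
# archive the server log.
# Usage: collect_diagnostics LOG_FILE OUT_DIR
collect_diagnostics() {
  log_file=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1

  # Run the dumps first, then archive the log, in the order given above.
  # Whatever asadm prints is kept next to the archive.
  for cmd in dump-paxos dump-fabric dump-hb; do
    asadm -e "asinfo -v '${cmd}:'" > "$out_dir/$cmd.txt" 2>&1 || return 1
  done

  # Store only the file name in the archive, not its full path.
  tar -czf "$out_dir/aerospike-log.tar.gz" \
    -C "$(dirname "$log_file")" "$(basename "$log_file")"
}
```

A call might look like collect_diagnostics /var/log/aerospike/aerospike.log /tmp/as-diag, where the log path is an assumption and should be replaced with the one configured in your aerospike.conf.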

You should just need to run service aerospike stop to remove a node from the cluster. These messages are not expected.


#3

Hi,

In a last attempt to resolve the issue I restarted the aerospike daemon, and it helped (I hate to do this, as loading data from disk takes around 4 hours). I will keep you updated if it happens again (with logs, of course).

Regards, Alex


#4

Alright, keep us posted.