Can't connect to my cluster

Kicker · November 29, 2015, 5:37am

Hello,

I keep getting the same code 11 error when trying to connect to my cluster (only 1 node):

Nov 29 2015 05:30:15 GMT: INFO (paxos): (paxos.c::2367) Cluster Integrity Check: Detected succession list discrepancy between node bb969f9bd270008 and self bb92439d95b8a44
Nov 29 2015 05:30:15 GMT: INFO (paxos): (paxos.c::2412) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes:  dun:nodes=bb969f9bd270008
Nov 29 2015 05:30:19 GMT: WARNING (tsvc): (thr_tsvc.c::382) rejecting client transaction - initial partition balance unresolved
Nov 29 2015 05:30:20 GMT: INFO (paxos): (paxos.c::2367) Cluster Integrity Check: Detected succession list discrepancy between node bb969f9bd270008 and self bb92439d95b8a44
Nov 29 2015 05:30:20 GMT: INFO (paxos): (paxos.c::2412) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes:  dun:nodes=bb969f9bd270008

So I ran the command

asinfo -v dun:nodes=bb969f9bd270008

And now it still doesn’t work, except that the log shows this instead:

[Ignoring succession list mismatch with dunned node bb969f9bd270008 in different cluster]

I don’t know what to do now. The server is online, AMC works too, I just can’t perform any action on the database…

Thanks in advance.

samir · November 29, 2015, 6:37am

Kicker,

Which version of Aerospike Server are you using?

It looks like you have a 2-node cluster whose nodes are not forming a healthy cluster. Has anything in your network changed? Are you using mesh or multicast clustering mode?

Can you share the output of this command, run from both nodes in your cluster?

asadm

admin> info

-samir

Kicker · November 29, 2015, 12:59pm

Hi,

I’m using 3.5.15. You were right: I had a second single-node cluster on another machine in my network, but AMC didn’t show it as a node of the first one, so I didn’t think it could be the source of my problem. I turned it off and now it works again.

Thanks!

Kicker · November 30, 2015, 1:02am

Hi again,

Not being able to run 2 single-node clusters on the same network causes problems for my workflow. The first one, on a dedicated physical machine, is supposed to be available to everyone; the other one, on a VM (with its own internal IP address), is for testing purposes.

I’d like some help configuring them so that they coexist on the network without interfering with each other. Each time I tried to do it myself, I ended up either with conflicts like the one I just had, or with Aerospike not starting at all…

Thanks in advance.

Ranjit_Shinde · November 30, 2015, 4:35am

Hi, I was facing the same issue. After I changed the heartbeat port in /etc/aerospike/aerospike.conf, it works fine. The reason could be that other Aerospike instances are running in your network. From the log message “discrepancy between node bb969f9bd270008 and self bb92439d95b8a44”, I can say it is trying to sync with other nodes in the network.
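
For reference, here is a minimal sketch of that kind of change, assuming the multicast heartbeat stanza from the stock 3.x configuration file. The port 9919 is only an illustrative value; any port not used by the other instance should serve the same purpose.

```
network {
    # service, fabric and info stanzas left unchanged
    heartbeat {
        mode multicast
        address 239.1.99.222   # multicast group from the stock config
        port 9919              # assumed value; the stock config uses 9918
        interval 150
        timeout 10
    }
}
```

The heartbeat port is read at startup, so the server needs a restart after editing the file.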

samir · November 30, 2015, 7:25am

Kicker / Ranjit,

If you are running 2 Aerospike nodes in the same network with a multicast configuration, they are expected to talk to each other. Did you try mesh mode, where you can specify which nodes should be part of which cluster?
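
As a rough sketch, a mesh heartbeat stanza for a node that should stay a single-node cluster could look like the following. The address 192.168.1.10 is made up for illustration, and the exact seed directive may differ between server versions, so check the configuration reference for your release.

```
network {
    # service, fabric and info stanzas left unchanged
    heartbeat {
        mode mesh
        port 3002              # heartbeat port for this node
        # To join other nodes, list them here, one per line.
        # 192.168.1.10 is a hypothetical address:
        # mesh-seed-address-port 192.168.1.10 3002
        interval 150
        timeout 10
    }
}
```

With no seed pointing at the other machine, the two nodes have no reason to exchange heartbeats, so each stays in its own cluster.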

Aerospike computes the Node ID from the MAC address and port. If you are running 2 nodes on the same physical box with the same default ports, both nodes are probably running with the same node ID. You should run these nodes with different ports (3000-3004 for one and other ports for the second, for example 4000-4004). This way the computed node IDs will be distinct, and you should be able to run multiple nodes on the same physical machine or inside different VMs.
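
A sketch of the network stanza for the second node, assuming the usual assignment of service, fabric, heartbeat and info ports shifted into the 4000 range. The specific numbers are illustrative, not required values.

```
network {
    service {
        address any
        port 4000              # client port (default 3000)
    }
    heartbeat {
        mode mesh
        port 4002              # default 3002 in mesh mode
        interval 150
        timeout 10
    }
    fabric {
        port 4001              # default 3001
    }
    info {
        port 4003              # default 3003
    }
}
```

Clients and tools then have to target the new service port, for example asinfo -p 4000.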

Let me know if you need further help. -samir

Ranjit_Shinde · November 30, 2015, 9:07am

Yes, mesh mode configuration is working well; in that case I don’t need to change the heartbeat port.

Kicker · December 1, 2015, 10:53pm

Working well for me too.
