I constantly get the same code 11 error when trying to connect to the cluster (only 1 node). The server log shows:
Nov 29 2015 05:30:15 GMT: INFO (paxos): (paxos.c::2367) Cluster Integrity Check: Detected succession list discrepancy between node bb969f9bd270008 and self bb92439d95b8a44
Nov 29 2015 05:30:15 GMT: INFO (paxos): (paxos.c::2412) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes: dun:nodes=bb969f9bd270008
Nov 29 2015 05:30:19 GMT: WARNING (tsvc): (thr_tsvc.c::382) rejecting client transaction - initial partition balance unresolved
Nov 29 2015 05:30:20 GMT: INFO (paxos): (paxos.c::2367) Cluster Integrity Check: Detected succession list discrepancy between node bb969f9bd270008 and self bb92439d95b8a44
Nov 29 2015 05:30:20 GMT: INFO (paxos): (paxos.c::2412) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes: dun:nodes=bb969f9bd270008
So I ran the command
asinfo -v dun:nodes=bb969f9bd270008
And now it still doesn’t work, except that the log shows this instead:
[Ignoring succession list mismatch with dunned node bb969f9bd270008 in different cluster]
I don’t know what to do now. The server is online, AMC works too, I just can’t perform any action on the database…
It looks like you have a 2-node cluster, and the nodes don’t seem to be forming a healthy cluster. Has anything in your network changed? Are you using mesh or multicast clustering mode?
Can you share the output of the command, executed from both nodes in your cluster?
I’m using 3.5.15.
You were right, I had a second single-node cluster on another machine in my network, but it wasn’t recognized as a node of the first one in AMC, so I didn’t think it could be the source of my problem.
I turned it off and now it works again.
The fact that I can’t seem to put 2 single-node clusters on the same network causes some problems for my workflow.
The first one, on a dedicated physical machine, is supposed to be available to everyone, and the other one, on a VM (with its own internal IP address), is for testing purposes.
I’d like some help configuring those so that they coexist on the network without messing with each other.
Each time I tried to do it myself, I ended up either with some conflict issues like the one I just had, or Aerospike just not starting at all…
Hi,
I was facing the same issue. When I changed the heartbeat port in /etc/aerospike/aerospike.conf, it started working fine.
The reason could be that there are other Aerospike instances running in your network.
From the log message “discrepancy between node bb969f9bd270008 and self bb92439d95b8a44” I can say it is trying to sync with other nodes in the network.
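For reference, the heartbeat settings live in the network stanza of that file. A minimal sketch of the change, assuming multicast mode and the stock 239.1.99.222 group and 9918 port from the sample config; 9919 is just an example value I picked, not a required one:

```
network {
    # service, fabric and info sub-stanzas stay as they are (omitted here)
    heartbeat {
        mode multicast
        address 239.1.99.222   # multicast group (assumed stock value)
        port 9919              # example only: anything other than the port the other cluster uses
        interval 150
        timeout 10
    }
}
```

Nodes that should cluster together need the same address and port here; a node that should stay on its own gets a different pair. The server has to be restarted after editing the file.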
If you are running 2 Aerospike nodes in the same network with the multicast configuration, they are expected to talk to each other. Did you try the mesh mode configuration, in which you can specify which nodes should be part of which cluster?
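A mesh heartbeat stanza for a node that should stay a single-node cluster could look roughly like this. It is a sketch, assuming the mesh-seed-address-port directive is available in your server version; the IP address in the comment is made up:

```
network {
    # service, fabric and info sub-stanzas stay as they are (omitted here)
    heartbeat {
        mode mesh
        port 3002    # heartbeat port for this node
        # No seed is listed, so this node does not go looking for peers.
        # To join another node later, add a line such as:
        # mesh-seed-address-port 192.168.1.50 3002
        interval 150
        timeout 10
    }
}
```

The same stanza on the second machine, again without a seed line, should keep the two instances from discovering each other the way multicast does.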
Aerospike computes the node ID from the MAC address and port. If you are running 2 nodes on the same physical box with the same default ports, both nodes are probably running with the same node ID. You should run these nodes with different ports (for example, 3000-3004 for one and 4000-4004 for the other). This way the computed node IDs will be distinct, and you should be able to run multiple nodes on the same physical machine or inside different VMs.
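As an illustration, the network stanza for the second node might look like the sketch below. The 4000-range numbers come from the example above; assigning them to service, fabric, heartbeat and info in this order mirrors the usual 3000-3003 layout and is my assumption, as is the choice of mesh mode for the heartbeat:

```
network {
    service {
        address any
        port 4000    # client port for the second node
    }
    heartbeat {
        mode mesh
        port 4002
        interval 150
        timeout 10
    }
    fabric {
        port 4001
    }
    info {
        port 4003
    }
}
```

Clients for the second node would then connect to port 4000 instead of 3000.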