Cluster is having problems after server reboot


#1

We have 3 nodes in one cluster. After I rebooted one node for maintenance, Node1 is recognized as belonging to a different cluster, as shown below. We want to return Node1 to Cluster1. What should we do to fix this problem?

  • normal status

    • Cluster1
      • Node1
      • Node2
      • Node3
  • current status (with problem)

    • Cluster1
      • Node1
      • Node2
    • Cluster1
      • Node3

#2

Hi,

Could you supply the output of

asadm -e "info"

It seems all the nodes are meant to be in the same cluster, but for some reason they are not clustering. Depending on your version, you may be able to run the following, which often clears such issues.

asadm -e "cluster dun all; shell sleep 5; cluster undun all"

If that doesn’t clear the issue, then it may be a multicast issue. Are you using multicast?


#3

Hi,

Thanks for the information. Our server version is Aerospike Server CE 3.3.17. We ran the command you mentioned, but the problem was not solved.

“If that doesn’t clear the issue, then it may be a multicast issue. Are you using multicast?”

Yes, we are using multicast. Strangely, although we have defined “mode multicast” in “/etc/aerospike/aerospike.conf”, the cluster recognizes Node1 as “mode mesh”. What is the multicast issue? Is it solved in the latest version?

best regards,


#4

By “multicast issue” I meant that multicast traffic may be blocked at the network level.

To rule this out I would recommend using mode mesh. You can find instructions for using mesh here: http://www.aerospike.com/docs/operations/configure/network/heartbeat/#mesh-unicast-heartbeat.
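If you would like to check directly whether multicast datagrams pass between two of your hosts, a small probe like the one below may help. It is only a rough sketch and not an Aerospike tool: the group address and port are placeholder assumptions (substitute the values from your own heartbeat configuration), and the script name `mcast_check.py` is hypothetical.

```python
import socket
import struct
import sys

# Assumptions: placeholder values, replace with the multicast address and
# port from the heartbeat section of your own aerospike.conf.
GROUP = "239.1.99.222"
PORT = 9918


def membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure (group + local interface) used to join a group."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def listen(group: str, port: int, timeout: float = 30.0) -> None:
    """Join the group and print the first datagram received, or report a timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(timeout)
    try:
        data, sender = sock.recvfrom(1024)
        print(f"received {data!r} from {sender[0]}")
    except socket.timeout:
        print("nothing received; multicast may be blocked between these hosts")
    finally:
        sock.close()


def send(group: str, port: int, ttl: int = 1) -> None:
    """Send one probe datagram to the group (ttl=1 keeps it on the local subnet)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", ttl))
    sock.sendto(b"multicast-probe", (group, port))
    sock.close()


if __name__ == "__main__":
    # Runs only when a mode is given explicitly: "listen" or "send".
    if len(sys.argv) == 2 and sys.argv[1] == "listen":
        listen(GROUP, PORT)
    elif len(sys.argv) == 2 and sys.argv[1] == "send":
        send(GROUP, PORT)
    else:
        print("usage: mcast_check.py listen|send")
```

Run it with `listen` on one node and, within 30 seconds, with `send` on another. If the listener times out, multicast is probably being dropped somewhere between the two hosts. It is likely safest to try this while the Aerospike daemon is stopped on the listening node, or on a spare port, so the probe does not compete with the server for the heartbeat port.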


#5

Hi,

Thanks for your reply. I’ll look into the instructions for using mode mesh.

best regards,