Since your server is listening on port 6000 instead of the default 3000, add -p 6000 to your asinfo command.
Unrelated: cold-start-empty true leaves you vulnerable to losing all your data should the entire cluster restart after a cluster-wide fault. I hope you understand the implications of having that in your config file. What you are saying is: always ignore the data on the persistent storage medium when booting this node up. This is generally not recommended.
In your configuration, under network.service, set access-address to the appropriate client-reachable address. This setting is static, so you will need to restart each node after changing it.
I think your cluster keeps looking for this non-existent node at 172.17.42.1:6000.
Try: asinfo -v 'tip-clear:host-port-list=172.17.42.1:6000' -h nodeA -p 6000
Then see if asadm>info shows only the two good nodes. Also, it is a good idea to take an asbackup of your data before trying anything exotic!
Once you have a backup, and since the node ID seems to be 0 for this non-existent node, you can try: asinfo -v 'dun:nodes=0' -h nodeA -p 6000
172 (172.17.42.1) returned:
Invalid command or Could not connect to node 172.17.42.1
Does this mean the addresses that nodeB advertised to the cluster were nodeB-ip1, nodeB-ip2 and 172.17.42.1:6000?
But 172.17.42.1:6000 is an unknown IP that does not belong to nodeB.
According to the official doc (Info Command Reference | Aerospike Documentation), the command "asinfo -v service -h nodeB -p 6000" returns the list of IPs that nodeB advertised to the other cluster nodes.
asinfo -v service -h nodeB -p 6000
nodeB-ip1:6000;nodeB-ip2:6000;172.17.42.1:6000
It seems that nodeB advertised an IP, 172.17.42.1, that does not belong to it. So what we need to do is get rid of that IP. Is that right?
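For anyone who wants to run the same check across several nodes, here is a rough Python sketch that splits the semicolon-separated output of the service command and flags any entry that is not one of the node's own addresses. The function names and the known_hosts set are made up for illustration, and nodeB-ip1 / nodeB-ip2 are the same placeholders used above; it only parses the string and does not talk to Aerospike.

```python
def split_service_list(raw):
    """Split 'host:port;host:port' into a list of (host, port) tuples."""
    endpoints = []
    for item in raw.strip().split(";"):
        if not item:
            continue  # skip empty pieces, e.g. from a trailing ';'
        # Split on the last ':' so the port is always the final field.
        host, _, port = item.rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints


def unexpected_endpoints(raw, known_hosts):
    """Return advertised endpoints whose host is not in known_hosts."""
    return [(h, p) for h, p in split_service_list(raw) if h not in known_hosts]


# Output of: asinfo -v service -h nodeB -p 6000 (as pasted in this thread)
raw = "nodeB-ip1:6000;nodeB-ip2:6000;172.17.42.1:6000"
# Assumption: these are the only addresses that really belong to nodeB.
known = {"nodeB-ip1", "nodeB-ip2"}

print(unexpected_endpoints(raw, known))  # [('172.17.42.1', 6000)]
```

Anything it prints is an address the node advertises but that you did not list as its own.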
Is some other non-Aerospike process also listening on port 6000 on nodeB? On nodeB, can you run netstat and see which processes are using port 6000?
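netstat on the node itself is what tells you which process owns the port. As a rougher check from another machine, a sketch like the following only tells you whether anything accepts TCP connections on a given host and port; the function name is made up, and "nodeB" is the placeholder host name from this thread.

```python
import socket


def something_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Replace "nodeB" with the real address of the node you want to probe.
    print(something_is_listening("nodeB", 6000))
```

A True result does not say that the listener is Aerospike, only that some process is accepting connections there.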