We have a 14-node Aerospike cluster that has been running for almost a year with no issues. One node now has a very strange problem, and I can’t seem to find the solution. When you run the asadm tool, you get this:
Found 14 nodes
Online: 10.100.0.138:3000, **127.0.0.1:3000**, 10.100.0.149:3000, 10.100.0.154:3000, 10.100.0.143:3000, 10.100.0.153:3000, 10.100.0.150:3000, 10.100.0.152:3000, 10.100.0.151:3000, 10.100.0.155:3000, 10.100.0.140:3000, 10.100.0.137:3000, 10.100.0.139:3000, 10.100.0.142:3000
Cluster Visibility error (Please check services list): 10.100.0.138:3000, 10.100.0.149:3000, 10.100.0.154:3000, 10.100.0.143:3000, 10.100.0.153:3000, 10.100.0.150:3000, 10.100.0.152:3000, 10.100.0.151:3000, 10.100.0.155:3000, 10.100.0.140:3000, 10.100.0.137:3000, 10.100.0.139:3000, 10.100.0.142:3000
Notice the 127.0.0.1 in bold? I have no idea why it shows up there! If you run asadm again, it might show the correct 10.100.0.x address, or it might show 127.0.0.1 again.
The other odd thing is that the IP-to-node mapping also contains an incorrect 127.0.0.1 entry.
This is what I see on the server that is having the issue. There should be only 14 nodes; I'm not sure why the local IP (127.0.0.1) is mapped to a node ID.
~IP to NODE-ID Mapping~
IP NODE-ID
10.100.0.136:3000 BB9ACD16E7AC40C
10.100.0.137:3000 BB918D66E7AC40C
10.100.0.138:3000 BB908D76E7AC40C
10.100.0.139:3000 BB9DED56E7AC40C
10.100.0.140:3000 BB9E2D56E7AC40C
10.100.0.142:3000 BB9E2D26E7AC40C
10.100.0.143:3000 BB980D76E7AC40C
10.100.0.149:3000 BB912BFDE7AC40C
10.100.0.150:3000 BB906BFDE7AC40C
10.100.0.151:3000 BB9DA1BDF7AC40C
10.100.0.152:3000 BB9F01CDF7AC40C
10.100.0.153:3000 BB91C1BDF7AC40C
10.100.0.154:3000 BB96E95987AC40C
10.100.0.155:3000 BB9EE0E9C7AC40C
127.0.0.1:3000 BB9ACD16E7AC40C
Number of rows: 15
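
To make the extra row easier to spot, here is a minimal Python sketch that groups the mapping rows above by node ID and reports any ID listed under more than one address. The names `MAPPING` and `duplicate_node_ids` are hypothetical helpers for this post, not part of asadm, and the sketch assumes each row is simply `address node-id` separated by whitespace.

```python
from collections import defaultdict

# Rows copied from the "IP to NODE-ID Mapping" output above.
MAPPING = """\
10.100.0.136:3000 BB9ACD16E7AC40C
10.100.0.137:3000 BB918D66E7AC40C
10.100.0.138:3000 BB908D76E7AC40C
10.100.0.139:3000 BB9DED56E7AC40C
10.100.0.140:3000 BB9E2D56E7AC40C
10.100.0.142:3000 BB9E2D26E7AC40C
10.100.0.143:3000 BB980D76E7AC40C
10.100.0.149:3000 BB912BFDE7AC40C
10.100.0.150:3000 BB906BFDE7AC40C
10.100.0.151:3000 BB9DA1BDF7AC40C
10.100.0.152:3000 BB9F01CDF7AC40C
10.100.0.153:3000 BB91C1BDF7AC40C
10.100.0.154:3000 BB96E95987AC40C
10.100.0.155:3000 BB9EE0E9C7AC40C
127.0.0.1:3000 BB9ACD16E7AC40C
"""


def duplicate_node_ids(text):
    """Return {node_id: [addresses]} for node IDs listed under more than one address."""
    by_id = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        # Keep only "host:port node-id" rows; skip headers, footers, and blanks.
        if len(parts) != 2 or ":" not in parts[0]:
            continue
        address, node_id = parts
        by_id[node_id].append(address)
    return {nid: addrs for nid, addrs in by_id.items() if len(addrs) > 1}


if __name__ == "__main__":
    for node_id, addresses in duplicate_node_ids(MAPPING).items():
        print(node_id, "is listed under", ", ".join(addresses))
```

Run against the rows above, the only node ID it reports is BB9ACD16E7AC40C, which appears under both 10.100.0.136:3000 and 127.0.0.1:3000. So the 15th row is the same node showing up a second time under the loopback address, not an extra node.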
Any help with this? I have added access-address to the Aerospike configuration file, but nothing has worked.
All help is appreciated as this is causing some issues in our production environment.
Thanks!