Hi, we have a 10-node cluster running in production. One of the 10 nodes had performance problems and was shut down. When it was added back to the cluster, we noticed that once migrations started, the ASD process on other nodes was terminated with the following error:
Oct 07 2015 19:33:17 GMT: WARNING (as): (signal.c::161) SIGSEGV received, aborting Aerospike Community Edition build 3.6.1 os el6
Before the crash, we see the following message:
Oct 07 2015 22:25:32 GMT: INFO (paxos): (paxos.c::2410) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes:dun:nodes=bb9f9dd5f290c00,bb9c14a8b565000,bb996068b565000,bb9786f8b565000,bb9607a3f290c00,bb917e213290c00,bb91193a1290c00,bb90c18b1290c00,bb9061f8b565000
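In case it helps anyone compare this message across nodes, here is a minimal Python sketch that pulls the node IDs out of the log line above. The function name `extract_dun_nodes` is just illustrative, and the regex assumes the message always has the `dun:nodes=<comma-separated hex IDs>` shape seen here. It only parses text and does not issue any command to the cluster.

```python
import re

# Log line copied from the message above.
LOG_LINE = (
    "Oct 07 2015 22:25:32 GMT: INFO (paxos): (paxos.c::2410) CLUSTER INTEGRITY FAULT. "
    "[Phase 1 of 2] To fix, issue this command across all nodes:"
    "dun:nodes=bb9f9dd5f290c00,bb9c14a8b565000,bb996068b565000,bb9786f8b565000,"
    "bb9607a3f290c00,bb917e213290c00,bb91193a1290c00,bb90c18b1290c00,bb9061f8b565000"
)


def extract_dun_nodes(line):
    """Return the node IDs listed after 'dun:nodes=' in a log line, or [] if absent."""
    # Assumption: node IDs are hex strings separated by commas, as in the line above.
    match = re.search(r"dun:nodes=([0-9a-fA-F,]+)", line)
    if not match:
        return []
    return [node_id for node_id in match.group(1).split(",") if node_id]


nodes = extract_dun_nodes(LOG_LINE)
print(len(nodes), "node IDs listed")
for node_id in nodes:
    print(node_id)
```

Run against the line above, this lists nine node IDs, so the message names nine of our ten nodes.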
This is really concerning. We would love to understand what happened and how we can avoid this situation.
At the very least, the problematic node should not force the Aerospike process on other nodes to go down.