One of my cluster nodes has suddenly started reporting this error in large numbers. The count shot from 0 to 4477690.0 in around 15 minutes, and it is still growing.
I cannot work out the meaning or significance of this from the description on the metrics page: “Number of errors during cluster state exchange because of missing general node information”.
This node is reachable from the other nodes in the cluster (checked with ping), there are no migrations happening (all are 0,0 in the logs), and the foreign heartbeat count is the same as on the other nodes. There is also nothing suspicious in the logs that points to a problem.
Can someone please help me understand this metric and whether I should be worried about it? As of now, we have an alert on all err* metrics (through the Aerospike collectd plugin), and if this is not serious, we will probably remove this alert. Thanks.