Aerospike Node Entering and Exiting the Cluster Frequently

Hey Guys,

We are experiencing an issue where one of our Aerospike nodes exits, re-enters, and then exits the cluster again every few minutes. When this happens, we see a massive increase in timeout exceptions on get requests. We also notice that when the problematic node re-enters the cluster, it opens a massive number of client connections. We are observing this through the AMC Dashboard, which shows the node as shut down and then back up every few minutes. Why is this happening?

Can you share the Aerospike logs? Preferably from the node that is going up and down and from a stable server. We should be able to see heartbeat timeouts or fabric errors, or at the very least the cluster size going down, to get a good timestamp.
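If it helps to narrow down the timeframe before posting, here is a minimal sketch for finding the moments when the cluster size changes. It assumes the periodic ticker line has the form `... GMT: INFO (info): (thr_info.c::NNNN)  ClusterSize N ::: ...`, as in the log excerpt in this thread; the function names `cluster_size_changes` and `scan_file` are made up for illustration, and other server versions may log the cluster size differently.

```python
import re

# Assumed ticker format: "<Mon DD YYYY HH:MM:SS> GMT: ... ClusterSize <N> ..."
CLUSTER_SIZE = re.compile(
    r"^(\w{3} \d{2} \d{4} \d{2}:\d{2}:\d{2}) GMT: .*ClusterSize (\d+)"
)


def cluster_size_changes(lines):
    """Return (timestamp, old_size, new_size) for each change in ClusterSize."""
    changes = []
    previous = None
    for line in lines:
        match = CLUSTER_SIZE.match(line)
        if not match:
            continue  # not a ticker line that reports the cluster size
        timestamp, size = match.group(1), int(match.group(2))
        if previous is not None and size != previous:
            changes.append((timestamp, previous, size))
        previous = size
    return changes


def scan_file(path):
    """Print every cluster size change found in the log file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for timestamp, old, new in cluster_size_changes(handle):
            print(f"{timestamp}: ClusterSize {old} -> {new}")
```

Calling `scan_file` on the node's log (often `/var/log/aerospike/aerospike.log`, but that path is an assumption about your setup) should list the timestamps to focus on. Running it on both the flapping node and a stable node shows whether they disagree about when the cluster shrank.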

I encountered a similar issue caused by an IP address conflict. My cluster runs entirely in my own cloud, but another VM in that cloud had the same IP as one of the Aerospike nodes. Because of the conflict, that node would join, exit, and then join again. This may not be your root cause, though.

I’m seeing a lot of this message:

Jun 05 2017 18:13:59 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Could that be an issue?

Well yeah, but it’d be nice to see the whole log, along with a timestamp of your issue to correlate it with.
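One way to get such a timestamp is to tally the warnings per minute and see where they spike. The sketch below assumes warning lines look like the one quoted above (`<date> GMT: WARNING (<context>): (<file>::<line>) <message>`); the function name `warnings_per_minute` is just an illustrative choice.

```python
import re
from collections import Counter

# Assumed line shape: "<Mon DD YYYY HH:MM>:SS GMT: WARNING (<context>): (<file::line>) <message>"
WARNING = re.compile(
    r"^(\w{3} \d{2} \d{4} \d{2}:\d{2}):\d{2} GMT: WARNING \((\w+)\): \(([^)]*)\) (.*)$"
)


def warnings_per_minute(lines):
    """Count WARNING lines, grouped by (minute, context, message)."""
    counts = Counter()
    for line in lines:
        match = WARNING.match(line.strip())
        if match:
            minute, context, _location, message = match.groups()
            counts[(minute, context, message)] += 1
    return counts


# The warning quoted earlier in this thread, used as sample input.
sample = [
    "Jun 05 2017 18:13:59 GMT: WARNING (batch): (batch.c::755) "
    "Failed to find active batch queue that is not full",
]
for (minute, context, message), count in warnings_per_minute(sample).items():
    print(count, minute, context, message)
```

Feeding it the whole log file instead of `sample` gives a per-minute picture, which makes it easier to line the warnings up with the times the node drops out in AMC.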

Sorry for the late reply.

Is there a way we can share logs here?

It seems that we can only upload image files (JPEG, PNG, etc.).

You can post a good snippet of the logs for that timeframe inside a preformatted text block in a normal reply.

    like this

Jun 05 2017 17:13:51 GMT: INFO (drv_ssd): (drv_ssd.c::2088) device /dev/sdc: used 1050742784, contig-free 80899M (323596 wblocks), swb-free 2, w-q 0 w-tot 4035 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s) defrag-w-tot 0 (0.0/s)

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::4979)  system memory: free 14920272kb ( 96 percent free ) 

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::4985)  ClusterSize 11 ::: objects 2450034 ::: sub_objects 0

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::4994)  rec refs 2450098 ::: rec locks 3 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::4999)  replica errs :: null 0 non-null 0 ::: sync copy errs :: master 0 

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5009)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 3114 ::: iq 2 ::: dq 0 : fds - proto (2847, 435573, 432726) : hb (35, 568, 533) : fab (156, 156, 0)

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5011)    heartbeat_received: self 11786 : foreign 192816

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5012)    heartbeat_stats: bt 0 bf 183374 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 6 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 533 rc 533 

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5024)    tree_counts: nsup 0 scan 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 64

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5053) {company} disk bytes used 2101061632 : avail pct 98 : cache-read pct 10.92

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5055) {company} memory bytes used 156802176 (index 156802176 : sindex 0) : used pct 0.97

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5091) {company} migrations - complete

Jun 05 2017 17:13:53 GMT: INFO (drv_ssd): (drv_ssd.c::2088) device /dev/sdb: used 1050318080, contig-free 80899M (323599 wblocks), swb-free 3, w-q 0 w-tot 4032 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s) defrag-w-tot 0 (0.0/s)

Jun 05 2017 17:13:53 GMT: INFO (info): (thr_info.c::5098)    partitions: actual 378 sync 364 desync 0 zombie 0 absent 3354

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::137) histogram dump: reads (18024107 total) msec

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::154)  (00: 0017785212) (01: 0000056996) (02: 0000045706) (03: 0000028216)

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::154)  (04: 0000026670) (05: 0000045569) (06: 0000021265) (07: 0000008021)

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::154)  (08: 0000002831) (09: 0000001601) (10: 0000001127) (11: 0000000519)

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::163)  (12: 0000000120) (13: 0000000140) (14: 0000000114)

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (4962 total) msec

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::154)  (00: 0000004884) (01: 0000000005) (03: 0000000001) (04: 0000000002)

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::154)  (05: 0000000012) (06: 0000000014) (07: 0000000009) (08: 0000000008)

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::154)  (09: 0000000004) (10: 0000000010) (11: 0000000008) (12: 0000000005)

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec

Jun 05 2017 17:13:53 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count

Jun 05 2017 17:14:00 GMT: WARNING (demarshal): (thr_demarshal.c::393) dropping incoming client connection: hit limit 15001 connections

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::4979)  system memory: free 14908432kb ( 96 percent free ) 

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::4985)  ClusterSize 11 ::: objects 2450044 ::: sub_objects 0

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::4994)  rec refs 2450107 ::: rec locks 4 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::4999)  replica errs :: null 0 non-null 0 ::: sync copy errs :: master 0 

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5009)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 7755 ::: iq 0 ::: dq 0 : fds - proto (8469, 485724, 477255) : hb (35, 570, 535) : fab (156, 156, 0)

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5011)    heartbeat_received: self 11826 : foreign 193469

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5012)    heartbeat_stats: bt 0 bf 183977 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 6 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 535 rc 535 

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5024)    tree_counts: nsup 0 scan 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 63

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5053) {company} disk bytes used 2101065472 : avail pct 98 : cache-read pct 11.92

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5055) {company} memory bytes used 156802816 (index 156802816 : sindex 0) : used pct 0.97

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5091) {company} migrations - complete

Jun 05 2017 17:14:03 GMT: INFO (info): (thr_info.c::5098)    partitions: actual 378 sync 364 desync 0 zombie 0 absent 3354

Jun 05 2017 17:14:03 GMT: INFO (info): (hist.c::137) histogram dump: reads (18024257 total) msec

Jun 05 2017 17:14:03 GMT: INFO (info): (hist.c::154)  (00: 0017785215) (01: 0000056997) (02: 0000045708) (03: 0000028216)

Jun 05 2017 17:14:03 GMT: INFO (info): (hist.c::154)  (04: 0000026673) (05: 0000045629) (06: 0000021284) (07: 0000008028)

Jun 05 2017 17:14:03 GMT: INFO (info): (hist.c::154)  (08: 0000002840) (09: 0000001606) (10: 0000001136) (11: 0000000522)

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::163)  (12: 0000000120) (13: 0000000154) (14: 0000000129)

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (4964 total) msec

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::154)  (00: 0000004884) (01: 0000000005) (03: 0000000001) (04: 0000000002)

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::154)  (05: 0000000012) (06: 0000000014) (07: 0000000009) (08: 0000000008)

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::154)  (09: 0000000004) (10: 0000000010) (11: 0000000009) (12: 0000000006)

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec

Jun 05 2017 17:14:04 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

[The same batch.c::755 warning repeats 99 more times with the timestamp 17:14:05 in this snippet.]

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:05 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:11 GMT: INFO (drv_ssd): (drv_ssd.c::2088) device /dev/sdc: used 1050746240, contig-free 80899M (323596 wblocks), swb-free 2, w-q 0 w-tot 4035 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s) defrag-w-tot 0 (0.0/s)

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::4979)  system memory: free 14920392kb ( 96 percent free ) 

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::4985)  ClusterSize 11 ::: objects 2450055 ::: sub_objects 0

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::4994)  rec refs 2450119 ::: rec locks 2 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::4999)  replica errs :: null 0 non-null 0 ::: sync copy errs :: master 0 

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5009)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 6943 ::: iq 0 ::: dq 0 : fds - proto (7372, 532714, 525342) : hb (35, 572, 537) : fab (156, 156, 0)

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5011)    heartbeat_received: self 11866 : foreign 194129

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5012)    heartbeat_stats: bt 0 bf 184582 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 6 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 537 rc 537 

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5024)    tree_counts: nsup 0 scan 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 64

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5053) {company} disk bytes used 2101069696 : avail pct 98 : cache-read pct 12.66

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5055) {company} memory bytes used 156803520 (index 156803520 : sindex 0) : used pct 0.97

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5091) {company} migrations - complete

Jun 05 2017 17:14:14 GMT: INFO (info): (thr_info.c::5098)    partitions: actual 378 sync 364 desync 0 zombie 0 absent 3354

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::137) histogram dump: reads (18024804 total) msec

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::154)  (00: 0017785232) (01: 0000057004) (02: 0000045710) (03: 0000028226)

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::154)  (04: 0000026723) (05: 0000045853) (06: 0000021362) (07: 0000008041)

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::154)  (08: 0000002858) (09: 0000001622) (10: 0000001149) (11: 0000000522)

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::154)  (12: 0000000136) (13: 0000000182) (14: 0000000183) (15: 0000000001)

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (4969 total) msec

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::154)  (00: 0000004884) (01: 0000000005) (03: 0000000001) (04: 0000000002)

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::154)  (05: 0000000012) (06: 0000000014) (07: 0000000011) (08: 0000000009)

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::154)  (09: 0000000006) (10: 0000000010) (11: 0000000009) (12: 0000000006)

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec

Jun 05 2017 17:14:14 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count

Jun 05 2017 17:14:14 GMT: INFO (drv_ssd): (drv_ssd.c::2088) device /dev/sdb: used 1050323456, contig-free 80899M (323599 wblocks), swb-free 3, w-q 0 w-tot 4032 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s) defrag-w-tot 0 (0.0/s)

Jun 05 2017 17:14:23 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:23 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

Jun 05 2017 17:14:23 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full

And I also found this:

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2226) Skip node arrival bb954b83c5e1312 cluster principal bb9c293e4e9d612 pulse principal bb9e0c593144012

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2226) Skip node arrival bb9d4fc17fe5612 cluster principal bb9c293e4e9d612 pulse principal bb9e0c593144012

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2226) Skip node arrival bb92cf0d60b7412 cluster principal bb9c293e4e9d612 pulse principal bb9e0c593144012

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2226) Skip node arrival bb93ae853e0bb12 cluster principal bb9c293e4e9d612 pulse principal bb9e0c593144012

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2226) Skip node arrival bb9ccaf3e3cd912 cluster principal bb9c293e4e9d612 pulse principal bb9e0c593144012

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2226) Skip node arrival bb9d653b2134612 cluster principal bb9c293e4e9d612 pulse principal bb9e0c593144012

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2226) Skip node arrival bb9e0c593144012 cluster principal bb9c293e4e9d612 pulse principal bb9e0c593144012

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb952f58b6a3512 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb9506ceb7b5412 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb9228915884a12 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb954b83c5e1312 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb9d4fc17fe5612 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb92cf0d60b7412 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb93ae853e0bb12 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb9ccaf3e3cd912 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb9d653b2134612 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::2543) Cluster Integrity Check: Detected succession list discrepancy between node bb9e0c593144012 and self bb9c293e4e9d612

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Paxos List [bb9c293e4e9d612]

Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List [bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]

Jun 05 2017 17:22:07 GMT: INFO (hb): (hb.c::2994) Marking node add for paxos recovery: bb952f58b6a3512

Jun 05 2017 17:22:07 GMT: INFO (hb): (hb.c::2994) Marking node add for paxos recovery: bb9506ceb7b5412

Jun 05 2017 17:22:07 GMT: INFO (hb): (hb.c::2994) Marking node add for paxos recovery: bb9228915884a12
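One way to read the lines above is to compare the two lists the node prints at each integrity check. Here is a minimal Python sketch that parses one "Paxos List" line and one "Node List" line copied from this log and reports which node IDs appear only in the Node List. The helper name `parse_list` is made up for illustration. Reading "Paxos List" as this node's succession list and "Node List" as the nodes it currently sees is my assumption, not something the log states.

```python
import re

# Matches the two list lines printed by paxos.c in the excerpt above.
LIST_RE = re.compile(r"(Paxos|Node) List \[([^\]]*)\]")


def parse_list(line):
    """Return (kind, [node ids]) for a 'Paxos List'/'Node List' line, else None."""
    m = LIST_RE.search(line)
    if m is None:
        return None
    kind, body = m.groups()
    return kind, [node for node in body.split(",") if node]


# Both lines are copied verbatim from the log excerpt above.
paxos_line = (
    "Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) "
    "Paxos List [bb9c293e4e9d612]"
)
node_line = (
    "Jun 05 2017 17:22:07 GMT: INFO (paxos): (paxos.c::278) Node List "
    "[bb9e0c593144012,bb9d653b2134612,bb9d4fc17fe5612,bb9ccaf3e3cd912,"
    "bb9c293e4e9d612,bb954b83c5e1312,bb952f58b6a3512,bb9506ceb7b5412,"
    "bb93ae853e0bb12,bb92cf0d60b7412,bb9228915884a12]"
)

_, paxos_nodes = parse_list(paxos_line)
_, seen_nodes = parse_list(node_line)

# Node IDs present in the Node List but absent from the Paxos List.
only_in_node_list = sorted(set(seen_nodes) - set(paxos_nodes))

print("paxos list size:", len(paxos_nodes))
print("node list size:", len(seen_nodes))
print("only in node list:", only_in_node_list)
```

On these two lines the Paxos List holds a single ID (the node itself) while the Node List holds eleven, which matches the ClusterSize 11 reported earlier in the same log.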

And this:

Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::2546) removing node on heartbeat failure: bb92cf0d60b7412

Jun 05 2017 17:22:01 GMT: WARNING (paxos): (paxos.c::2573) SUCCESSION FAULT. Try paxos-recovery-policy 'auto-reset-master' if the problem persists.

Jun 05 2017 17:21:56 GMT: INFO (info): (thr_info.c::4979)  system memory: free 14857412kb ( 96 percent free ) 

Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::2546) removing node on heartbeat failure: bb93ae853e0bb12

Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::2546) removing node on heartbeat failure: bb9ccaf3e3cd912

Jun 05 2017 17:22:01 GMT: INFO (info): (thr_info.c::4985)  ClusterSize 11 ::: objects 2449995 ::: sub_objects 0

Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::2546) removing node on heartbeat failure: bb9d653b2134612

Jun 05 2017 17:22:01 GMT: INFO (info): (thr_info.c::4994)  rec refs 2452660 ::: rec locks 27 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0

Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::3035) Marking node removal for paxos recovery: bb9e0c593144012

Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::2546) removing node on heartbeat failure: bb9e0c593144012

Jun 05 2017 17:22:01 GMT: INFO (info): (thr_info.c::4999)  replica errs :: null 0 non-null 0 ::: sync copy errs :: master 0 

Jun 05 2017 17:22:01 GMT: WARNING (paxos): (paxos.c::2573) SUCCESSION FAULT. Try paxos-recovery-policy 'auto-reset-master' if the problem persists.

Jun 05 2017 17:22:01 GMT: INFO (fabric): (fabric.c::1761) fabric: node bb952f58b6a3512 departed

Jun 05 2017 17:22:01 GMT: INFO (fabric): (fabric.c::1677) fabric disconnecting node: bb952f58b6a3512

Maybe we need to set paxos-recovery-policy to 'auto-reset-master', as the SUCCESSION FAULT warning suggests?

I think it’d be more important to find out why you’re losing nodes. Any idea why you have heartbeat failures?
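To line up the heartbeat failures with the batch warnings, it may help to pull both out of the full log and sort them by time, since the snippets above are not in strict order. Below is a rough Python sketch, assuming every line follows the `Mon DD YYYY HH:MM:SS GMT: LEVEL (context): (file::line) message` layout seen in this thread. The function name `summarize` and the `SAMPLE` list are illustrative only; in practice you would feed it the lines of your own log file.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the log lines posted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{4} \d{2}:\d{2}:\d{2}) GMT: "
    r"(?P<level>\w+) \((?P<ctx>[\w-]+)\): \((?P<src>[^)]+)\)\s+(?P<msg>.*)$"
)
HB_FAIL_RE = re.compile(r"removing node on heartbeat failure: (?P<node>\w+)")
BATCH_FULL = "Failed to find active batch queue that is not full"


def summarize(lines):
    """Count batch-queue-full warnings per second and collect heartbeat removals."""
    batch_full = Counter()
    hb_failures = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # blank line or a line in some other format
        ts = datetime.strptime(m["ts"], "%b %d %Y %H:%M:%S")
        if BATCH_FULL in m["msg"]:
            batch_full[ts] += 1
        hb = HB_FAIL_RE.search(m["msg"])
        if hb:
            hb_failures.append((ts, hb["node"]))
    hb_failures.sort()  # pasted snippets may be out of order
    return batch_full, hb_failures


# A few lines copied from the posts above, used as sample input.
SAMPLE = [
    "Jun 05 2017 17:14:23 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full",
    "Jun 05 2017 17:14:23 GMT: WARNING (batch): (batch.c::755) Failed to find active batch queue that is not full",
    "Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::2546) removing node on heartbeat failure: bb92cf0d60b7412",
    "Jun 05 2017 17:22:01 GMT: INFO (hb): (hb.c::2546) removing node on heartbeat failure: bb93ae853e0bb12",
    "Jun 05 2017 17:21:56 GMT: INFO (info): (thr_info.c::4979)  system memory: free 14857412kb ( 96 percent free ) ",
]

batch_counts, removals = summarize(SAMPLE)
for ts, count in sorted(batch_counts.items()):
    print(ts, "batch queue full warnings:", count)
for ts, node in removals:
    print(ts, "heartbeat removal:", node)
```

To run it on a real log, something like `summarize(open(path))` should work, where `path` is wherever your aerospike.log lives. Seeing whether the batch warnings pile up just before each round of heartbeat removals would give a timestamp to correlate against the other nodes' logs.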