Heartbeat expiring caused a "Single Node Cluster"

migration

#1

Hi,

It appears a heartbeat expiry caused a split. But shouldn’t the heartbeat be retried 10 times before the cluster considers a split. Can somebody explain the following log lines please:

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2189) hb considers expiring: now 19873254795 last 19873176931

Apr 29 2016 08:45:24 GMT: INFO (drv_ssd): (drv_ssd.c::2436) device /dev/mapper/vgroot-lv_aerospike: used 27735319680, contig-free 113523M (113523 wblocks), swb-free 3, n-w 0, w-q 0 w-tot 314935 (0.0/s), defrag-q 0 defrag-tot 274909 (0.0/s)

Apr 29 2016 08:45:24 GMT: INFO (info): (thr_info.c::4776) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 1 ::: iq 0 ::: dq 0 : fds - proto (44, 45327680, 45327636) : hb (10, 136, 126) : fab (86, 363, 277)

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2202) hb expires but fabric says DEAD: node bb9f9d9210a0102

Apr 29 2016 08:45:24 GMT: INFO (info): (thr_info.c::4778) heartbeat_received: self 0 : foreign 1277383066

Apr 29 2016 08:45:24 GMT: INFO (info): (thr_info.c::4779) heartbeat_stats: bt 0 bf 1020308172 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 1 mrf 0 eh 31 efd 21 efa 10 um 0 mcf 84 rc 125

Apr 29 2016 08:45:24 GMT: INFO (info): (thr_info.c::4792) tree_counts: nsup 0 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0

Apr 29 2016 08:45:24 GMT: INFO (info): (thr_info.c::4808) namespace fatak: disk inuse: 27735319680 memory inuse: 12134202360 (bytes) sindex memory inuse: 0 (bytes) avail pct 73

Apr 29 2016 08:45:24 GMT: INFO (info): (thr_info.c::4853) partitions: actual 691 sync 1423 desync 0 zombie 0 wait 0 absent 1982

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::137) histogram dump: reads (73465712 total) msec

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2189) hb considers expiring: now 19873254839 last 19873177032

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (00: 0073463809) (01: 0000000730) (02: 0000000542) (03: 0000000410)

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2202) hb expires but fabric says DEAD: node bb9ffcd210a0102

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (04: 0000000122) (05: 0000000060) (06: 0000000033) (07: 0000000003)

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::163) (08: 0000000003)

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2189) hb considers expiring: now 19873254839 last 19873176981

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (154101465 total) msec

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2202) hb expires but fabric says DEAD: node bb9ee8d210a0102

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (00: 0152788441) (01: 0001152965) (02: 0000110941) (03: 0000035487)

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (04: 0000009371) (05: 0000001741) (06: 0000000592) (07: 0000000550)

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (08: 0000001065) (09: 0000000255) (10: 0000000020) (11: 0000000037)

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2189) hb considers expiring: now 19873254839 last 19873176931

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::137) histogram dump: proxy (30 total) msec

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2202) hb expires but fabric says DEAD: node bb9d4b9210a0102

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::1502) as_hb_tcp_send cf_socket_sendto() fd 61 failed

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (00: 0000000005) (01: 0000000008) (02: 0000000003) (03: 0000000006)

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2189) hb considers expiring: now 19873254839 last 19873176981

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::163) (04: 0000000006) (05: 0000000001) (07: 0000000001)

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2202) hb expires but fabric says DEAD: node bb916c9210a0102

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (154101194 total) msec

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (00: 0152791647) (01: 0001149832) (02: 0000110877) (03: 0000035484)

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::154) (04: 0000009365) (05: 0000001741) (06: 0000000593) (07: 0000000549)

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::163) (08: 0000001065) (09: 0000000041)

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec

Apr 29 2016 08:45:24 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::1502) as_hb_tcp_send cf_socket_sendto() fd 68 failed

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::1502) as_hb_tcp_send cf_socket_sendto() fd 69 failed

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2382) removing node on heartbeat failure: bb9f9d9210a0102

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::1502) as_hb_tcp_send cf_socket_sendto() fd 286 failed

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2382) removing node on heartbeat failure: bb9ffcd210a0102

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2382) removing node on heartbeat failure: bb9ee8d210a0102

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::1502) as_hb_tcp_send cf_socket_sendto() fd 296 failed

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2382) removing node on heartbeat failure: bb9d4b9210a0102

Apr 29 2016 08:45:24 GMT: INFO (hb): (hb.c::2382) removing node on heartbeat failure: bb916c9210a0102

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1786) fabric: node bb9f9d9210a0102 departed

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1696) fabric disconnecting node: bb9f9d9210a0102

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1786) fabric: node bb9ffcd210a0102 departed

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1696) fabric disconnecting node: bb9ffcd210a0102

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1786) fabric: node bb9ee8d210a0102 departed

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1696) fabric disconnecting node: bb9ee8d210a0102

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1786) fabric: node bb9d4b9210a0102 departed

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1696) fabric disconnecting node: bb9d4b9210a0102

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1786) fabric: node bb916c9210a0102 departed

Apr 29 2016 08:45:24 GMT: INFO (fabric): (fabric.c::1696) fabric disconnecting node: bb916c9210a0102

Apr 29 2016 08:45:24 GMT: INFO (paxos): (paxos.c::2106) departed node bb9ffcd210a0102 is no longer principal pro tempore

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::2116) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::2116) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::2116) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::1890) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::1890) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::1890) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::1890) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: WARNING (paxos): (paxos.c::1890) quorum visibility lost! Continuing anyway …

Apr 29 2016 08:45:24 GMT: INFO (paxos): (paxos.c::1933) as_paxos_spark establishing transaction [82.0]@bb94805210a0102 ClSz = 6 ; # change = 5 : DEL bb9f9d9210a0102; DEL bb9ffcd210a0102; DEL bb9ee8d210a0102; DEL bb9d4b9210a0102; DEL bb916c9210a0102;

Apr 29 2016 08:45:24 GMT: INFO (partition): (partition.c::392) DISALLOW MIGRATIONS

Apr 29 2016 08:45:24 GMT: INFO (paxos): (paxos.c::1448) SUCCESSION [82.0]@bb94805210a0102: bb94805210a0102

Apr 29 2016 08:45:24 GMT: INFO (paxos): (paxos.c::2775) SINGLE NODE CLUSTER!!!

Apr 29 2016 08:45:24 GMT: INFO (config): (cluster_config.c::406) Rack Aware is disabled.

Apr 29 2016 08:45:24 GMT: INFO (partition): (cluster_config.c::368) Rack Aware is disabled.

Apr 29 2016 08:45:24 GMT: INFO (partition): (partition.c::2921) CLUSTER SIZE = 1

Apr 29 2016 08:45:24 GMT: INFO (partition): (partition.c::2960) Global state is well formed

Apr 29 2016 08:45:24 GMT: INFO (paxos): (partition.c::2620) setting replication factors: cluster size 1, paxos single replica limit 1

Apr 29 2016 08:45:24 GMT: INFO (paxos): (partition.c::2627) {fatak} replication factor is 1

Apr 29 2016 08:45:24 GMT: INFO (partition): (partition.c::3852) {fatak} re-balanced, expect migrations: out 0, in 0, out-later 0

Apr 29 2016 08:45:24 GMT: INFO (paxos): (partition.c::3876) global partition state: total 4096 lost 1982 unique 2114 duplicate 0

Apr 29 2016 08:45:24 GMT: INFO (paxos): (partition.c::3877) partition state after fixing lost partitions (master): total 4096 lost 0 unique 4096 duplicate 0

Apr 29 2016 08:45:24 GMT: INFO (paxos): (partition.c::3878) 2114 new partition version tree paths generated

Apr 29 2016 08:45:24 GMT: INFO (partition): (partition.c::371) ALLOW MIGRATIONS

Apr 29 2016 08:45:25 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 10.33.253.242:3002: timed out

Apr 29 2016 08:45:26 GMT: INFO (hb): (hb.c::2388) new heartbeat received: bb9ffcd210a0102 principal node is bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (hb): (hb.c::2388) new heartbeat received: bb9ee8d210a0102 principal node is bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (hb): (hb.c::2388) new heartbeat received: bb916c9210a0102 principal node is bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (hb): (hb.c::2388) new heartbeat received: bb9f9d9210a0102 principal node is bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (hb): (hb.c::2388) new heartbeat received: bb9d4b9210a0102 principal node is bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (fabric): (fabric.c::1780) fabric: node bb9ffcd210a0102 arrived

Apr 29 2016 08:45:26 GMT: INFO (fabric): (fabric.c::1780) fabric: node bb9ee8d210a0102 arrived

Apr 29 2016 08:45:26 GMT: INFO (fabric): (fabric.c::1780) fabric: node bb916c9210a0102 arrived

Apr 29 2016 08:45:26 GMT: INFO (fabric): (fabric.c::1780) fabric: node bb9f9d9210a0102 arrived

Apr 29 2016 08:45:26 GMT: INFO (fabric): (fabric.c::1780) fabric: node bb9d4b9210a0102 arrived

Apr 29 2016 08:45:26 GMT: INFO (paxos): (paxos.c::2151) Skip node arrival bb9ffcd210a0102 cluster principal bb94805210a0102 pulse principal bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (paxos): (paxos.c::2151) Skip node arrival bb9ee8d210a0102 cluster principal bb94805210a0102 pulse principal bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (paxos): (paxos.c::2151) Skip node arrival bb916c9210a0102 cluster principal bb94805210a0102 pulse principal bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (paxos): (paxos.c::2151) Skip node arrival bb9f9d9210a0102 cluster principal bb94805210a0102 pulse principal bb9ffcd210a0102

Apr 29 2016 08:45:26 GMT: INFO (paxos): (paxos.c::2151) Skip node arrival bb9d4b9210a0102 cluster principal bb94805210a0102 pulse principal bb9ffcd210a0102

Apr 29 2016 08:45:27 GMT: INFO (partition): (partition.c::392) DISALLOW MIGRATIONS

Apr 29 2016 08:45:27 GMT: INFO (paxos): (paxos.c::2869) SUCCESSION [83.0]@bb9ffcd210a0102*: bb9ffcd210a0102 bb9f9d9210a0102 bb9ee8d210a0102 bb9d4b9210a0102 bb94805210a0102 bb916c9210a0102

Apr 29 2016 08:45:27 GMT: INFO (paxos): (paxos.c::2880) node bb9ffcd210a0102 is now principal pro tempore

Apr 29 2016 08:45:27 GMT: INFO (paxos): (paxos.c::2229) Sent partition sync request to node bb9ffcd210a0102

Apr 29 2016 08:45:27 GMT: INFO (paxos): (paxos.c::2623) Ignoring self(bb94805210a0102) add from Principal bb9ffcd210a0102

Apr 29 2016 08:45:27 GMT: INFO (paxos): (paxos.c::2964) received partition sync message from bb9ffcd210a0102

Apr 29 2016 08:45:27 GMT: INFO (config): (cluster_config.c::406) Rack Aware is disabled.

Apr 29 2016 08:45:27 GMT: INFO (partition): (cluster_config.c::368) Rack Aware is disabled.

Apr 29 2016 08:45:27 GMT: INFO (partition): (partition.c::2921) CLUSTER SIZE = 6

Apr 29 2016 08:45:27 GMT: INFO (partition): (partition.c::2960) Global state is well formed

Apr 29 2016 08:45:27 GMT: INFO (paxos): (partition.c::2620) setting replication factors: cluster size 6, paxos single replica limit 1

Also another box in the same cluster has the following logs

Apr 29 2016 08:42:06 GMT: INFO (hb): (hb.c::2189) hb considers expiring: now 19873372618 last 19873371109

Apr 29 2016 08:42:06 GMT: INFO (hb): (hb.c::2202) hb expires but fabric says DEAD: node bb94805210a0102

Apr 29 2016 08:42:06 GMT: INFO (hb): (hb.c::2382) removing node on heartbeat failure: bb94805210a0102

Apr 29 2016 08:42:06 GMT: INFO (fabric): (fabric.c::1786) fabric: node bb94805210a0102 departed

Apr 29 2016 08:42:06 GMT: INFO (fabric): (fabric.c::1696) fabric disconnecting node: bb94805210a0102

Apr 29 2016 08:42:06 GMT: INFO (partition): (partition.c::392) DISALLOW MIGRATIONS

Apr 29 2016 08:42:06 GMT: INFO (paxos): (paxos.c::2869) SUCCESSION [82.0]@bb9ffcd210a0102*: bb9ffcd210a0102 bb9f9d9210a0102 bb9ee8d210a0102 bb9d4b9210a0102 bb916c9210a0102

Apr 29 2016 08:42:06 GMT: INFO (paxos): (paxos.c::2880) node bb9ffcd210a0102 is now principal pro tempore

Apr 29 2016 08:42:06 GMT: INFO (paxos): (paxos.c::2229) Sent partition sync request to node bb9ffcd210a0102

Apr 29 2016 08:42:06 GMT: INFO (paxos): (paxos.c::2964) received partition sync message from bb9ffcd210a0102

Apr 29 2016 08:42:06 GMT: INFO (config): (cluster_config.c::406) Rack Aware is disabled.

Apr 29 2016 08:42:06 GMT: INFO (partition): (cluster_config.c::368) Rack Aware is disabled.

Apr 29 2016 08:42:06 GMT: INFO (partition): (partition.c::2921) CLUSTER SIZE = 5

Apr 29 2016 08:42:06 GMT: INFO (partition): (partition.c::2960) Global state is well formed

Apr 29 2016 08:42:06 GMT: INFO (paxos): (partition.c::2620) setting replication factors: cluster size 5, paxos single replica limit 1

In and around the same time we had the following line in “/var/log/messages” in the boxes respectively

kernel: [19873254.795887] Clocksource tsc unstable (delta = -77745451124 ns)

kernel: [19873905.156181] Clocksource tsc unstable (delta = -114158356442 ns)

From the code it looks like

if (now > p->last + (g_config.hb_interval * g_config.hb_timeout))

So lets say my KVM clocksource messed up and had delta > 1.5secs(150ms * 10 retries), so it would be considered out of the cluster, even though only ONE heartbeat failed. Is the above statement true?

Now it appears it happened the same way in both the boxes, so why is the behaviour different for the two boxes?

Also in all the boxes we see

“(migrate.c::1333) migration insert: cluster key mismatch can’t insert”

Aerospike Configuration

HeartBeat Mode - Mesh

HeartBeat Interval -150

Timeout - 10

Aerospike Version - 3.5.9

Total Nodes in Cluster - 6

OS : Debian_Version 7.8

Apologies for so many questions, any help would be appreciated.

Thanks