I tried to deploy the Community Edition of Aerospike on Kubernetes using the Helm chart, and I am seeing the following error. Can someone point me to a possible cause?
The error only happens on pod-0 of the StatefulSet. The other two pods are in the Running state, but they appear to be waiting to connect to pod-0, which is dead.
Sep 24 2020 21:52:39 GMT: INFO (fabric): (socket.c:815) Started fabric endpoint 0.0.0.0:3001
Sep 24 2020 21:52:39 GMT: INFO (hb): (hb.c:7077) mtu of the network is 1496
Sep 24 2020 21:52:39 GMT: INFO (hb): (socket.c:815) Started mesh heartbeat endpoint 0.0.0.0:3002
Sep 24 2020 21:52:39 GMT: INFO (nsup): (nsup.c:187) starting namespace supervisor threads
Sep 24 2020 21:52:39 GMT: INFO (service): (service.c:908) starting reaper thread
Sep 24 2020 21:52:39 GMT: INFO (service): (socket.c:815) Started client endpoint 0.0.0.0:3000
Sep 24 2020 21:52:39 GMT: INFO (service): (service.c:193) starting accept thread
Sep 24 2020 21:52:39 GMT: INFO (info-port): (thr_info_port.c:298) starting info port thread
Sep 24 2020 21:52:39 GMT: INFO (info-port): (socket.c:815) Started info endpoint 0.0.0.0:3003
Sep 24 2020 21:52:39 GMT: INFO (as): (as.c:408) service ready: soon there will be cake!
Sep 24 2020 21:52:39 GMT: INFO (hb): (hb.c:6364) removing self seed entry host:aerospike-0.aerospike.default.svc.solutions3.diamanti.com port:3002
Sep 24 2020 21:52:39 GMT: INFO (hb): (hb.c:5740) removed mesh seed host:aerospike-0.aerospike.default.svc.solutions3.diamanti.com port 3002
Sep 24 2020 21:52:39 GMT: INFO (hb): (hb.c:4357) found redundant connections to same node, fds 27 22 - choosing at random
Sep 24 2020 21:52:40 GMT: INFO (clustering): (clustering.c:6355) principal node - forming new cluster with succession list: bb907801c00a28e
Sep 24 2020 21:52:40 GMT: INFO (clustering): (clustering.c:5794) applied new cluster key 5381145975f5
Sep 24 2020 21:52:40 GMT: INFO (clustering): (clustering.c:5796) applied new succession list bb907801c00a28e
Sep 24 2020 21:52:40 GMT: INFO (clustering): (clustering.c:5798) applied cluster size 1
Sep 24 2020 21:52:40 GMT: INFO (exchange): (exchange.c:2319) data exchange started with cluster key 5381145975f5
Sep 24 2020 21:52:40 GMT: INFO (exchange): (exchange.c:2671) exchange-compatibility-id: self 7 cluster-min 0 -> 7 cluster-max 0 -> 7
Sep 24 2020 21:52:40 GMT: INFO (exchange): (exchange.c:3219) received commit command from principal node bb907801c00a28e
Sep 24 2020 21:52:40 GMT: INFO (exchange): (exchange.c:3182) data exchange completed with cluster key 5381145975f5
Sep 24 2020 21:52:40 GMT: INFO (partition): (partition_balance.c:993) {test} replication factor is 1
Sep 24 2020 21:52:40 GMT: INFO (partition): (partition_balance.c:965) {test} rebalanced: expected-migrations (0,0,0) fresh-partitions 0
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:168) NODE-ID bb907801c00a28e CLUSTER-SIZE 1
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:249) cluster-clock: skew-ms 0
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:272) system: total-cpu-pct 0 user-cpu-pct 0 kernel-cpu-pct 0 free-mem-kbytes 121374476 free-mem-pct 92
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:290) process: cpu-pct 1 heap-kbytes (1081944,1082472,1105920) heap-efficiency-pct 97.8
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:301) in-progress: info-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:322) fds: proto (0,0,0) heartbeat (0,2,2) fabric (0,0,0)
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:330) heartbeat-received: self 2 foreign 0
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:360) fabric-bytes-per-second: bulk (0,0) ctrl (0,0) meta (0,0) rw (0,0)
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:423) {test} objects: all 0 master 0 prole 0 non-replica 0
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:483) {test} migrations: complete
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:501) {test} memory-usage: total-bytes 0 index-bytes 0 sindex-bytes 0 data-bytes 0 used-pct 0.00
Sep 24 2020 21:52:49 GMT: INFO (info): (ticker.c:560) {test} device-usage: used-bytes 0 avail-pct 99
Sep 24 2020 21:52:57 GMT: WARNING (as): (signal.c:129) SIGFPE received, aborting Aerospike Community Edition build 5.0.0.4 os debian9
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:604) stacktrace: registers: rax 0000000000000000 rbx 00007fbb6f5f6e14 rcx 0000000000000000 rdx 0000000000000000 rsi 0000000000000000 rdi 00007fbbb380d008 rbp 00005601a972c300 rsp 00007fbb6f5f6c80 r8 0000000000000001 r9 0000000000000000 r10 0000000000000000 r11 000e6e26eb55e7c8 r12 00005601a9734300 r13 00005601a972b240 r14 00007fbbb380d050 r15 00007fbb6f5f6e10 rip 00005601a9294250
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:617) stacktrace: found 8 frames: 0x169c33 0xb9554 0x7fbbfc1a00e0 0xc8250 0xc85a3 0x157060 0x7fbbfc1964a4 0x7fbbfb015d0f offset 0x5601a91cc000
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 0: asd(cf_log_stack_trace+0xe8) [0x5601a9335c33]
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 1: asd(as_sig_handle_fpe+0x62) [0x5601a9285554]
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 2: /lib/x86_64-linux-gnu/libpthread.so.0(+0x110e0) [0x7fbbfc1a00e0]
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 3: asd(+0xc8250) [0x5601a9294250]
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 4: asd(+0xc85a3) [0x5601a92945a3]
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 5: asd(+0x157060) [0x5601a9323060]
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 6: /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4) [0x7fbbfc1964a4]
Sep 24 2020 21:52:57 GMT: WARNING (as): (log.c:627) stacktrace: frame 7: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fbbfb015d0f]
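A note for anyone reading the trace above: frames 3 to 5 are unsymbolized, and each bracketed absolute address appears to be the in-binary offset plus the load offset printed on the "found 8 frames" line. Below is a minimal Python sketch, assuming the log format shown in this post, that recovers those offsets. The function name `relative_frames` is made up for illustration. The offsets could then be given to a symbolizer such as `addr2line`, but only with an `asd` binary that matches build 5.0.0.4 and carries symbols.

```python
import re

# Patterns assume the stacktrace format shown in the log above.
FRAME_RE = re.compile(r"stacktrace: frame (\d+): (\S+?)\((.*?)\) \[0x([0-9a-f]+)\]")
OFFSET_RE = re.compile(r"offset 0x([0-9a-f]+)")


def relative_frames(log_lines):
    """Return (frame index, offset inside asd) for every frame located in asd."""
    base = None
    frames = []
    for line in log_lines:
        offset_match = OFFSET_RE.search(line)
        if offset_match:
            # Load offset of the asd binary, from the "found N frames" line.
            base = int(offset_match.group(1), 16)
        frame_match = FRAME_RE.search(line)
        if frame_match:
            index = int(frame_match.group(1))
            module = frame_match.group(2)
            address = int(frame_match.group(4), 16)
            frames.append((index, module, address))
    if base is None:
        raise ValueError("no 'offset 0x...' line found in the log")
    # Shared-library frames (libpthread, libc) are skipped: their addresses
    # are not relative to the asd load offset.
    return [(index, hex(address - base))
            for index, module, address in frames
            if module == "asd"]


if __name__ == "__main__":
    import sys
    for index, offset in relative_frames(sys.stdin):
        print(f"frame {index}: {offset}")
```

Piping the pod's log into the script prints one offset per `asd` frame. For frame 3 the result is 0xc8250, which matches the value the server printed itself, so the subtraction is mostly a sanity check on the trace.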
Gmail filtered these messages, so I couldn't see them until now, and they were not showing up on the discussion thread. I am not familiar with the "hardware" logging context. Do you want me to put that JSON block in the conf file as it is, or should I add some hardware details there?
Sorry for the confusion, I removed those comments because they were a dead end. If the server had found zero CPUs, it would have crashed well before the point where this crash occurred.
Hi @kporter, for some reason, when I deployed it again last night, the Aerospike cluster started working, and I don't see any errors. That is good, but at the same time very odd, as I have not changed or fixed anything to make it work. Not sure if it's helpful now, but below is the output of the command you asked for: