Aerospike aborts and stops


#1

We are having a situation where Aerospike just aborts and stops. I have attached the logs below. I am not sure what is causing this problem.

Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:169) NODE-ID bb9b093b80e1b90 CLUSTER-SIZE 1
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:249)    system-memory: free-kbytes 65579092 free-pct 99 heap-kbytes (1263118,1263736,1291264) heap-efficiency-pct 97.8
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:263)    in-progress: tsvc-q 0 info-q 0 nsup-delete-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:285)    fds: proto (1,5,4) heartbeat (0,0,0) fabric (16,16,0)
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:294)    heartbeat-received: self 0 foreign 0
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:318)    early-fail: demarshal 0 tsvc-client 2 tsvc-batch-sub 0 tsvc-udf-sub 0
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:348) {st-data} objects: all 13462 master 13462 prole 0
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:409) {st-data} migrations: complete
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:428) {st-data} memory-usage: total-bytes 2598166 index-bytes 861568 sindex-bytes 0 data-bytes 1736598 used-pct 0.06
Dec 07 2016 10:33:42 GMT: INFO (info): (ticker.c:458) {st-data} device-usage: used-bytes 5169408 avail-pct 99
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:169) NODE-ID bb9b093b80e1b90 CLUSTER-SIZE 1
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:249)    system-memory: free-kbytes 65579096 free-pct 99 heap-kbytes (1263117,1263736,1291264) heap-efficiency-pct 97.8
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:263)    in-progress: tsvc-q 0 info-q 0 nsup-delete-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:285)    fds: proto (1,5,4) heartbeat (0,0,0) fabric (16,16,0)
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:294)    heartbeat-received: self 0 foreign 0
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:318)    early-fail: demarshal 0 tsvc-client 2 tsvc-batch-sub 0 tsvc-udf-sub 0
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:348) {st-data} objects: all 13462 master 13462 prole 0
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:409) {st-data} migrations: complete
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:428) {st-data} memory-usage: total-bytes 2598166 index-bytes 861568 sindex-bytes 0 data-bytes 1736598 used-pct 0.06
Dec 07 2016 10:33:52 GMT: INFO (info): (ticker.c:458) {st-data} device-usage: used-bytes 5169408 avail-pct 99
Dec 07 2016 10:33:56 GMT: INFO (drv_ssd): (drv_ssd.c:2092) {st-data} /opt/aerospike/data/st-data.dat: used-bytes 5169408 free-wblocks 16378 write-q 0 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:169) NODE-ID bb9b093b80e1b90 CLUSTER-SIZE 1
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:249)    system-memory: free-kbytes 65579096 free-pct 99 heap-kbytes (1263117,1263736,1291264) heap-efficiency-pct 97.8
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:263)    in-progress: tsvc-q 0 info-q 0 nsup-delete-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:285)    fds: proto (1,5,4) heartbeat (0,0,0) fabric (16,16,0)
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:294)    heartbeat-received: self 0 foreign 0
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:318)    early-fail: demarshal 0 tsvc-client 2 tsvc-batch-sub 0 tsvc-udf-sub 0
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:348) {st-data} objects: all 13462 master 13462 prole 0
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:409) {st-data} migrations: complete
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:428) {st-data} memory-usage: total-bytes 2598166 index-bytes 861568 sindex-bytes 0 data-bytes 1736598 used-pct 0.06
Dec 07 2016 10:34:02 GMT: INFO (info): (ticker.c:458) {st-data} device-usage: used-bytes 5169408 avail-pct 99
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:169) NODE-ID bb9b093b80e1b90 CLUSTER-SIZE 1
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:249)    system-memory: free-kbytes 65565448 free-pct 99 heap-kbytes (1266328,1267756,1326080) heap-efficiency-pct 95.5
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:263)    in-progress: tsvc-q 0 info-q 0 nsup-delete-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:285)    fds: proto (8,12,4) heartbeat (0,0,0) fabric (16,16,0)
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:294)    heartbeat-received: self 0 foreign 0
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:318)    early-fail: demarshal 0 tsvc-client 2 tsvc-batch-sub 0 tsvc-udf-sub 0
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:348) {st-data} objects: all 15163 master 15163 prole 0
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:409) {st-data} migrations: complete
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:428) {st-data} memory-usage: total-bytes 2926459 index-bytes 970432 sindex-bytes 0 data-bytes 1956027 used-pct 0.07
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:458) {st-data} device-usage: used-bytes 5822592 avail-pct 99
Dec 07 2016 10:34:12 GMT: INFO (info): (ticker.c:551) {st-data} client: tsvc (0,0) proxy (0,0,0) read (0,0,0,0) write (1739,0,0) delete (0,0,0,0) udf (0,0,0) lang (0,0,0,0)
Dec 07 2016 10:34:12 GMT: INFO (info): (hist.c:145) histogram dump: {st-data}-write (1739 total) msec
Dec 07 2016 10:34:12 GMT: INFO (info): (hist.c:171)  (00: 0000001724) (01: 0000000007) (02: 0000000008)
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:181) SIGSEGV received, aborting Aerospike Community Edition build 3.10.1 os ubuntu16.04
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: found 7 frames
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: frame 0: /usr/bin/asd(as_sig_handle_segv+0x48) [0x4accb4]
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: frame 1: /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7f9d6665d4b0]
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: frame 2: /usr/bin/asd(cf_queue_push+0xc) [0x588cd2]
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: frame 3: /usr/bin/asd(ssd_post_write+0x45c) [0x5278ee]
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: frame 4: /usr/bin/asd(ssd_write_worker+0x157) [0x527b16]
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: frame 5: /lib/x86_64-linux-gnu/libpthread.so.0(+0x770a) [0x7f9d678bf70a]
Dec 07 2016 10:34:12 GMT: WARNING (as): (signal.c:185) stacktrace: frame 6: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f9d6672e82d]

#2

Could you share the configuration file (/etc/aerospike/aerospike.conf) from this server? Did this configuration work with an older version of the Aerospike server?


#3
# Aerospike database configuration file for use with systemd.

service {
        paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
        service-threads 4
        transaction-queues 4
        transaction-threads-per-queue 4
        proto-fd-max 15000
}

logging {
        file /var/log/aerospike/aerospike.log {
        context any info
    }
}

network {
        service {
                address 0.0.0.0
                port 3000
                access-address 127.0.0.1 virtual

        }

        heartbeat {
                mode mesh
                address 127.0.0.1
                port 9918

                # To use unicast-mesh heartbeats, remove the 3 lines above, and see
                # aerospike_mesh.conf for alternative.

                interval 150
                timeout 10
        }

        fabric {
           address 127.0.0.1
           port 3001
        }

        info {
           address 127.0.0.1
           port 3003
        }
}

namespace st-data {
        replication-factor 2
        memory-size 32G
        default-ttl 0
        storage-engine device {
                file /opt/aerospike/data/st-data.dat
                #filesize 16G
                data-in-memory true # Store data in memory in addition to file.
        }

Thank you for the support


#4

Thanks. It looks like you have a missing ‘}’ at the end of the configuration (the closing brace for the namespace stanza). Can you confirm whether this is a typo in your post or an actual error in the configuration file?

The stack trace that you posted looks exactly the same as the one in the earlier post "Segmentation Fault (SIGSEGV) when enabling data-in-memory", which had a configuration error similar to yours.
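
If it helps, the missing-brace check can be automated. Below is a minimal sketch in Python, not an official Aerospike tool: it only counts `{` and `}` after stripping `#` comments and reports stanzas that are never closed. The function name `check_braces` and the sample text are made up for illustration, and the sketch assumes every `#` starts a comment, as in the configuration posted above.

```python
def check_braces(conf_text):
    """Return a list of (line_number, message) for unbalanced braces.

    Hypothetical helper: it does not validate Aerospike settings,
    it only checks that every '{' has a matching '}'.
    """
    open_stack = []   # (line_number, stanza header) for each unclosed '{'
    problems = []
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        # Assumption: '#' always starts a comment in the config file.
        line = raw.split("#", 1)[0].strip()
        for ch in line:
            if ch == "{":
                header = line.rstrip("{").strip()
                open_stack.append((lineno, header))
            elif ch == "}":
                if open_stack:
                    open_stack.pop()
                else:
                    problems.append((lineno, "unexpected '}'"))
    # Anything still on the stack was opened but never closed.
    problems.extend(
        (lineno, f"unclosed stanza '{header}'") for lineno, header in open_stack
    )
    return problems


# Sample modeled on the namespace stanza posted in this thread.
sample = """namespace st-data {
        replication-factor 2
        storage-engine device {
                file /opt/aerospike/data/st-data.dat
                #filesize 16G
                data-in-memory true # Store data in memory in addition to file.
        }
"""

for lineno, message in check_braces(sample):
    print(f"line {lineno}: {message}")
```

To run it against the real file, read the text of /etc/aerospike/aerospike.conf and pass it to `check_braces`. An empty result only means the braces balance; it says nothing about whether the individual settings are valid.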


#5

Suggestion: If you are running on AWS, try the following for the heartbeat configuration:

mode mesh
address xx.xx.xx.xx   <-- try using the private IP address of your instance instead of 127.0.0.1
port 3002             <-- 3002 is typical for heartbeat