Unexpected partition migration state at source


#1

From time to time my Aerospike service stops with the log below, and I can't work out what the problem is. I am running version 3.7.5.1 on EC2 (SSD with EBS).

The conf looks like this:

# Aerospike database configuration file for deployments using mesh heartbeats.

service {
        user root
        group root
        paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
        pidfile /var/run/aerospike/asd.pid
        service-threads 8
        transaction-queues 8
        transaction-threads-per-queue 4
        proto-fd-max 64000
        paxos-protocol v4
}

cluster {
    mode static
    self-node-id 10
    self-group-id 100
}

logging {
        # Log file must be an absolute path.
        file /var/log/aerospike/aerospike.log {
                context any info
        }
}

network {
        service {
                address any
                port 3000
        }

        heartbeat {
                mode mesh
                port 3002 # Heartbeat port for this node.

                # List one or more other nodes, one ip-address & port per line:

                mesh-seed-address-port XXXX 3002
                mesh-seed-address-port XXXX 3002
                mesh-seed-address-port XXXX 3002
                mesh-seed-address-port XXXX 3002
                mesh-seed-address-port XXXX 3002

                interval 250
                timeout 10
        }

        fabric {
                port 3001
        }

        info {
                port 3003
        }
}

namespace test {
        replication-factor 2
        memory-size 10G
        default-ttl 1d # 1 day, use 0 to never expire/evict.
        ldt-enabled true
        high-water-disk-pct 75

        # To use file storage backing, comment out the line above and use the
        # following lines instead.
        #storage-engine device {
        #        file /opt/aerospike/data/test.dat
        #        filesize 120G
        #        data-in-memory true # Store data in memory in addition to file.
        #}

        # Shadow Device for SSD - need to test that also
        storage-engine device {
                device /dev/sdb /dev/sdf
                device /dev/sdc /dev/sdg

                write-block-size 1024K
                max-write-cache 128M
        }
}

And the log:

    May 03 2016 12:20:32 GMT: WARNING (drv_ssd): (drv_ssd.c:as_storage_record_read_ssd:1143) {pipeline} read_ssd: invalid rblock_id <Digest>:0x88b1609a68380031f8f7181b86ed6bb34e0201bb
    May 03 2016 12:20:32 GMT: WARNING (drv_ssd): (drv_ssd.c:as_storage_record_read_ssd:1143) {pipeline} read_ssd: invalid rblock_id <Digest>:0x88b1609a68380031f8f7181b86ed6bb34e0201bb
    May 03 2016 12:20:32 GMT: INFO (drv_ssd): (drv_ssd.c::1250) read_all: failed as_storage_record_read_ssd()
    May 03 2016 12:20:32 GMT: CRITICAL (partition): (migrate.c:as_ldt_fill_precord:1912) unexpected partition migration state at source 1:392
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::94) SIGABRT received, aborting Aerospike Community Edition build 3.7.5.1 os el6
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: found 13 frames
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 0: /usr/bin/asd(as_sig_handle_abort+0x34) [0x48cb99]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 1: /lib64/libc.so.6(+0x35670) [0x7f8a0c952670]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 2: /lib64/libc.so.6(gsignal+0x37) [0x7f8a0c9525f7]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 3: /lib64/libc.so.6(abort+0x148) [0x7f8a0c953ce8]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 4: /usr/bin/asd(cf_fault_sink_activate_all_held+0) [0x51c595]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 5: /usr/bin/asd(as_ldt_fill_precord+0xbd) [0x4f313c]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 6: /usr/bin/asd(emigrate_tree_reduce_fn+0x1c6) [0x4f34bf]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 7: /usr/bin/asd() [0x4721ef]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 8: /usr/bin/asd(emigrate_tree+0x40) [0x4f177c]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 9: /usr/bin/asd(emigrate+0x199) [0x4f19e7]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 10: /usr/bin/asd(run_emigration+0xc1) [0x4f26ae]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 11: /lib64/libpthread.so.0(+0x7dc5) [0x7f8a0db25dc5]
    May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 12: /lib64/libc.so.6(clone+0x6d) [0x7f8a0ca13c9d]
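
To make the trace above easier to read, here is a minimal Python sketch that lists the named frames inside the `asd` binary, in stack order. It is not an Aerospike tool: the function name `asd_frames`, the regular expression, and the shortened `SAMPLE` text are assumptions made for illustration, based only on the line format shown in this log.

```python
import re

# Assumed frame format, taken from the log in this post, e.g.
#   "stacktrace: frame 5: /usr/bin/asd(as_ldt_fill_precord+0xbd) [0x4f313c]"
FRAME_RE = re.compile(
    r"stacktrace: frame (\d+): ([^\s(]+)\(([^+)]*)(?:\+[0-9a-fx]+)?\) \[(0x[0-9a-f]+)\]"
)

def asd_frames(log_text):
    """Return the symbol names of frames that belong to the asd binary."""
    names = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue
        binary, symbol = match.group(2), match.group(3)
        # Skip libc/libpthread frames and frames with no symbol name.
        if binary.endswith("/asd") and symbol:
            names.append(symbol)
    return names

# A few lines copied from the trace above (shortened for the example).
SAMPLE = """\
May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 4: /usr/bin/asd(cf_fault_sink_activate_all_held+0) [0x51c595]
May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 5: /usr/bin/asd(as_ldt_fill_precord+0xbd) [0x4f313c]
May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 7: /usr/bin/asd() [0x4721ef]
May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 10: /usr/bin/asd(run_emigration+0xc1) [0x4f26ae]
May 03 2016 12:20:32 GMT: WARNING (as): (signal.c::96) stacktrace: frame 11: /lib64/libpthread.so.0(+0x7dc5) [0x7f8a0db25dc5]
"""

if __name__ == "__main__":
    for name in asd_frames(SAMPLE):
        print(name)
```

Run against the full trace, this lists `as_ldt_fill_precord` directly below the abort path, with `emigrate_tree_reduce_fn`, `emigrate_tree`, `emigrate` and `run_emigration` beneath it, so the abort is raised while a partition is being migrated out of this node.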

#2

Are you actually using LDTs here?