Consistency issue with Aerospike

I am using Aerospike (v3.5.12) as an in-memory key-value store on a single node.

I am also using the Java client (v3.1.7) to read and write data.

I noticed that at a certain QPS (3K) of reads (batch) and writes, some of the data retrievals fail.

[INFO] [24/11/2015 14:43:38.782] Write Data
[ERROR] [24/11/2015 14:43:38.937] Read Data - Not Found

Has anyone encountered a similar issue?

Alex,

There is no known issue of this kind. Could you check with the new 3.7.0.2 release, which introduced a lot of improvements to the sindex code?

HTH – Raj

Is it possible that the cluster was performing migrations when the reads came back not found? Prior to 3.6.x, the batch system would fail with notfound during migrations if the record no longer resided on the target node.

I am working on a single node, and single gets are failing too.

Did the server ack the write with a successful return code before the read was initiated?

I am using the async operations in the following manner:

    public void store() {
        WritePolicy expirationWritePolicy = new WritePolicy();
        expirationWritePolicy.sendKey = true;
        expirationWritePolicy.priority = Priority.HIGH;
        expirationWritePolicy.expiration = 10;

        Key key = new Key(namespace, SET_NAME, requestId);
        Bin bin = new Bin(BIN_NAME, serializer.toBinary(budgetCommit));
        Bin extra = new Bin("extra", "data");

        client.put(expirationWritePolicy, new WriteListener() {
            @Override
            public void onSuccess(Key key) {
                logger.info("Succeed to store {}", requestId);
            }

            @Override
            public void onFailure(AerospikeException exception) {
                logger.error(exception, "Fail to store {}", key);
            }
        }, key, extra, bin);
    }

    public void retrieve() {
        WritePolicy defaultWritePolicy = new WritePolicy();
        defaultWritePolicy.priority = Priority.LOW;
        defaultWritePolicy.sendKey = true;

        Key key = new Key(namespace, SET_NAME, requestId);
        Bin closeExtra = new Bin("extra", "_closed");

        client.operate(defaultWritePolicy, new RecordListener() {
            @Override
            public void onSuccess(Key key, Record record) {
                if (record.getValue(BIN_NAME) == null) {
                    logger.error("Fail to retrieve {}", requestId);
                }
            }

            @Override
            public void onFailure(AerospikeException exception) {
                logger.error("Fail to retrieve {} : {}", requestId, exception.getMessage());
            }
        }, key, Operation.append(closeExtra), Operation.get());
    }

[INFO] [12/01/2016 08:37:16.732] Succeed to store 379e67dc-945d-4717-97a7-721cc8093c05
[ERROR] [12/01/2016 08:37:16.736] Fail to retrieve 379e67dc-945d-4717-97a7-721cc8093c05

The onSuccess callback is called once the Aerospike server has acked the write.

It starts to fail at around 8k QPS of master writes.

Ah, my suspicion is that you have breached an eviction high-water mark, either memory or disk. We can confirm this by running asadm -e "info namespace" and checking whether the memory or disk Used% value is above its respective HWM Mem% or HWM Disk% value.
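If you would rather do that comparison from code, here is a rough Java sketch. It only parses a semicolon-separated name=value string of the kind the namespace info call returns, and it does not talk to a server. The stat names free-pct-memory and high-water-memory-pct are my assumption for the 3.x line, so check them against your own output, and the sample string is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class HwmCheck {
    /** Parses a "name=value;name=value" info response into a map. */
    static Map<String, String> parseInfo(String response) {
        Map<String, String> stats = new HashMap<>();
        for (String pair : response.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                stats.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return stats;
    }

    /**
     * Returns true when used memory (100 - free%) is above the high-water mark.
     * Both stat names are assumptions; adjust them to what your server reports.
     */
    static boolean memoryHwmBreached(Map<String, String> stats) {
        int freePct = Integer.parseInt(stats.get("free-pct-memory"));
        int hwmPct = Integer.parseInt(stats.get("high-water-memory-pct"));
        return (100 - freePct) > hwmPct;
    }

    public static void main(String[] args) {
        // Made-up sample: 35% free means 65% used, which is above a 60% mark.
        String sample = "free-pct-memory=35;high-water-memory-pct=60";
        System.out.println(memoryHwmBreached(parseInfo(sample)));
    }
}
```

With the sample string above, used memory is 65% against a 60% mark, so the check reports a breach.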

I suspect you may have at least two TTL values that differ significantly. This usage pattern causes the records with the lower TTL to be purged when eviction kicks in. For this use case we have set-disable-eviction, which excludes a particular set from eviction. The configuration reference page shows how to set this option dynamically for a set, and a static configuration example can be found in the set-data-retention configuration documentation.
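To make the effect concrete, here is a toy Java model of it. This is not Aerospike's actual eviction algorithm, only a simplified sketch that assumes eviction removes the records closest to expiry first and skips any set marked as protected. The class, the set names "short" and "long", and the capacity of 4 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class EvictionSketch {
    /** A record reduced to the two properties that matter here. */
    static final class Rec {
        final String set;
        final int ttlSeconds;

        Rec(String set, int ttlSeconds) {
            this.set = set;
            this.ttlSeconds = ttlSeconds;
        }
    }

    /**
     * Keeps at most `capacity` records, dropping the lowest-TTL ones first.
     * Records in `protectedSet` (may be null) are never dropped.
     */
    static List<Rec> evict(List<Rec> records, int capacity, String protectedSet) {
        List<Rec> kept = new ArrayList<>();
        List<Rec> candidates = new ArrayList<>();
        for (Rec r : records) {
            if (r.set.equals(protectedSet)) {
                kept.add(r);
            } else {
                candidates.add(r);
            }
        }
        // Longest TTL first, so the shortest-lived candidates fall off the end.
        candidates.sort(Comparator.comparingInt((Rec r) -> r.ttlSeconds).reversed());
        int room = Math.max(0, capacity - kept.size());
        kept.addAll(candidates.subList(0, Math.min(room, candidates.size())));
        return kept;
    }

    static long countInSet(List<Rec> records, String set) {
        return records.stream().filter(r -> r.set.equals(set)).count();
    }

    public static void main(String[] args) {
        List<Rec> records = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            records.add(new Rec("short", 10));     // 10-second TTL, as in the store() code
            records.add(new Rec("long", 86400));   // hypothetical one-day TTL
        }
        // Surviving "short" records without protection, then with the set protected.
        System.out.println(countInSet(evict(records, 4, null), "short"));
        System.out.println(countInSet(evict(records, 4, "short"), "short"));
    }
}
```

In this model only one of the three short-TTL records survives when nothing is protected, while all three survive once their set is excluded, which is the behaviour set-disable-eviction is meant to give you.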