Expired records take old values and update TTL

Hello, I’m using Aerospike Community Edition version 4.4.0.4. I have records that are used as counters with a TTL. When I update a record I set the TTL to not be updated. When the record is expired I expected it to act like a new record, but instead I see the old values taken and incremented.

My questions are:

  1. Does an increment operation on an expired record take the old value?
  2. Is it a bug?
  3. I do not see in the server code (the write_master, as_record_get_create, and as_index_get_insert_vlock functions) any consideration of the TTL before the record is updated.

I do not see this behavior in version 3.14.0. I used UDF functions to increment the record’s bins.

How did you validate the record expired? Are you sure nsup cleaned out the record before you wrote a new one?

  1. When I say the record expired I mean its TTL has passed.
  2. If a record only expires when nsup cleans it out, why is the TTL updated even though I’m sending the write with “don’t update TTL”?
  3. From my perspective, if a record expires when its TTL passes, why are the old values incremented? It should act like a new record.

Even if NSUP hadn’t deleted the record yet, expired records are supposed to be treated as non-existent.

So we agree that old values should not be incremented, yet that is not the behavior here.

Correct. Could you post your full Aerospike configuration?

service {
    user aerospike
    group aerospike

    paxos-single-replica-limit 1

    nsup-period 10
    scan-threads 4
    migrate-threads 2
    proto-fd-max 21000
    transaction-threads-per-queue 4
}

logging {
    file /var/log/aerospike/aerospike.log {
            context any info
    }
}

network {
    service {
            address any
            port 3000
    }
    heartbeat {
            mode mesh
            port 3002
            address 10.240.0.32

            mesh-seed-address-port 10.240.0.32 3002
            mesh-seed-address-port 10.240.0.99 3002
            mesh-seed-address-port 10.240.0.7 3002

            timeout  60
            interval 500
    }
    fabric {
            port 3001
    }
    info {
            port 3003
    }
}

namespace reputation {
    default-ttl 3d
    memory-size 22G
    migrate-sleep 1500
    stop-writes-pct 85
    replication-factor 2
    high-water-memory-pct 65
    conflict-resolution-policy last-update-time
    disable-write-dup-res true
    partition-tree-sprigs 4096

    storage-engine device {
           device /dev/sdb

           scheduler-mode noop
           write-block-size 128K

           # Post-write cache.
           # devices x write-block-size x post-write-queue: 1 x 128K x 4096
           # will consume 512MB of RAM.
           post-write-queue 4096
    }
}

What is the size of /dev/sdb?

Could you also share a code snippet that demonstrates how you are updating the record? Specifically, I’m looking to see whether this is a write through a UDF or a regular write, and the exact policies being used with the transaction.

Your config is a bit complex; it would be a huge assist if you could try to find the simplest configuration with which you are able to reproduce this issue.

I quickly scanned through the write path in master. These expired/truncated records are referred to as ‘doomed’ in the source; the main write path checks for this early and treats the write as creating a new record when the existing one is ‘doomed’.

I see you’ve updated the original post to say that you are using a UDF; possibly the UDF path fails to perform these checks.

  1. The size of /dev/sdb is 1T, with 137 GB in use.
  2. I can also reproduce this issue on our staging instance, which is basically empty.
  3. Go client code snippet:
    func Test_TTLIssue(t *testing.T) {
        //t.Skip("local tests only")

        operations := aerospike.AddOp(aerospike.NewBin("js_total", int64(1)))

        newascl, err := as.Client("aes-terra-staging:3000?logger=DEBUG")
        if err != nil {
            t.Fatal("Failed to connect to aerospike")
        }

        // Write policy: generation 0, TTL of 10 seconds.
        writePolicy := aerospike.NewWritePolicy(0, uint32(10))
        writePolicy.Timeout = 0
        writePolicy.SocketTimeout = 0

        rowKey, _ := aerospike.NewKey(
            "reputation",
            "counters",
            "key-10",
        )

        rs, er := newascl.Operate(writePolicy, rowKey, operations)
        fmt.Println(rs)
        fmt.Println(er)

        // Sleep for longer than the TTL.
        time.Sleep(10*time.Second + 200*time.Millisecond)

        operations = aerospike.AddOp(aerospike.NewBin("js_total", int64(1)))
        rs, er = newascl.Operate(writePolicy, rowKey, operations)
        fmt.Println(rs)
        fmt.Println(er)
    }


I’ve been unable to reproduce this locally.

If you are still able to reproduce this issue could you also provide the output of:

asadm -e 'info'

Also, could you increase the amount you are sleeping between the operations? TTLs have a resolution of one second, so a sub-second sleep margin would give non-deterministic results. See if you can reproduce this with a 15-second delay.

Thanks for your time. For some reason I also failed to reproduce it again locally. I will try to understand what change caused it, but the issue looks like something I did wrong; I will let you know if I find it.


As I mentioned, expiration has a one-second resolution, which means the record could take up to 16 seconds (from the time it arrived at the server) to expire. You were only sleeping 15.2 seconds, so the result wouldn’t be deterministic. You would need to sleep for at least 16 seconds plus however long the transaction took, rounded up to a second.
