TTL Histogram bucket size


#1

Is there any way to set the bucket size for the TTL histogram? Almost all our records are showing up in the first histogram bucket, which means this tool is not very useful to us. Here is an example:

asinfo -v "hist-dump:ns=ns1;hist=ttl"  
ns1:ttl=100,20412176,724451269,0,0,0,0,0,0,0,27,0,0,0,0,0,0,0,21,0,0,0,0,0,0,0,2,5,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;

As you can see, a few dozen outlier records are skewing the histogram, and the other 700 million records all end up in the first bucket. This gives us two problems: first, we have no visibility into when records will expire; second, if the eviction process needs to run, it will treat almost all of the records as equivalent and will basically evict them at random.
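For anyone reading the output above, here is a minimal Python sketch of how the dump could be broken into TTL ranges. It assumes the first value is the bucket count, the second is the bucket width in seconds, and the remaining values are per-bucket record counts. The function names and the shortened sample line are made up for illustration and are not real output.

```python
def parse_hist_dump(line):
    """Parse '<ns>:<hist>=<n_buckets>,<width>,<count0>,...;' into a dict.

    Assumed layout: bucket count, bucket width in seconds, then one
    record count per bucket.
    """
    name, _, values = line.strip().rstrip(";").partition("=")
    namespace, _, hist = name.partition(":")
    fields = [int(v) for v in values.split(",")]
    return {
        "namespace": namespace,
        "hist": hist,
        "n_buckets": fields[0],
        "width": fields[1],
        "counts": fields[2:],
    }


def non_empty_buckets(parsed):
    """Yield (start_seconds, end_seconds, count) for each bucket holding records."""
    width = parsed["width"]
    for i, count in enumerate(parsed["counts"]):
        if count:
            yield (i * width, (i + 1) * width, count)


# Toy sample (4 buckets, 100 seconds wide), not taken from a real server.
sample = "ns1:ttl=4,100,7,0,2,1;"
parsed = parse_hist_dump(sample)
for start, end, count in non_empty_buckets(parsed):
    print(f"{start:>6}s to {end:>6}s: {count} records")
```

Feeding the real asinfo line into parse_hist_dump should make it obvious how much of the TTL range is taken up by the handful of outliers.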

Is there a way to pass the bucket size into this command? Is there a way to identify those outlier records so that they can be removed in order to get a better distribution? Would changing the max-ttl config setting have any impact?


#2

The max-ttl parameter rejects new writes whose TTL exceeds the limit, and I believe the evict histogram will use that value as its maximum rather than the largest TTL found, which would give you better resolution.

nsup also generates another histogram that attempts to get a better resolution; use hist=evict to see it. nsup chooses the histogram with the best resolution.

No, there isn’t a way to pass the bucket size into this command.

To identify the outliers, you would need to run a scan and dump the records with a TTL > 20412176 seconds.
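A rough sketch of that scan using the Aerospike Python client, in case it helps. It assumes each scanned record arrives as a (key, meta, bins) tuple and that meta carries the remaining TTL in seconds under "ttl"; check that against your client version, since the scan API has changed over time. The helper names are made up, and the filtering is kept in plain functions so it can be tried without a cluster.

```python
def is_outlier(record, threshold_seconds):
    """True if the record's remaining TTL exceeds the threshold.

    Assumes record is a (key, meta, bins) tuple and meta["ttl"] is in seconds.
    """
    _key, meta, _bins = record
    return meta is not None and meta.get("ttl", 0) > threshold_seconds


def collect_outliers(records, threshold_seconds):
    """Return the keys of all records whose TTL exceeds the threshold."""
    return [rec[0] for rec in records if is_outlier(rec, threshold_seconds)]


def scan_for_outliers(hosts, namespace, threshold_seconds):
    """Scan a whole namespace and return the keys of outlier records.

    Needs the aerospike package and a reachable cluster; imported lazily
    so the helpers above work without it.
    """
    import aerospike

    client = aerospike.client({"hosts": hosts}).connect()
    outliers = []

    def on_record(record):
        if is_outlier(record, threshold_seconds):
            outliers.append(record[0])

    try:
        client.scan(namespace).foreach(on_record)
    finally:
        client.close()
    return outliers


# Hypothetical usage against a local node:
# keys = scan_for_outliers([("127.0.0.1", 3000)], "ns1", 20412176)
```

Keep in mind that a full scan over 700 million records puts real load on the cluster, so it is probably best run off-peak.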


#3

Thanks for the follow-up, Kevin.

I tried looking at the ‘evict’ histogram, and it appears to be blank right now. Does something need to happen before that histogram gets populated (e.g. do objects need to get evicted)?


#4

Yes, objects need to be evicted first for the evict histogram to be populated.


#5

As of Aerospike 3.8.1, we have introduced much more granularity in the TTL histogram. You can now specify the number of buckets, which effectively means the size of the buckets can be controlled. In addition, there is no more random eviction from within a bucket: either all records in a given bucket are evicted or the entire bucket remains. It’s now a lot easier to predict and manage evictions.
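To illustrate why the bucket count matters, here is a small arithmetic sketch. It assumes the histogram spans a fixed TTL range split into equal-width buckets, which is a simplification and not necessarily how the server computes it. The function names are made up, and the range is just 100 buckets times the 20412176-second width from the dump in post #1.

```python
def bucket_width(ttl_range_seconds, n_buckets):
    """Width of each bucket, rounded up so the buckets cover the whole range."""
    return -(-ttl_range_seconds // n_buckets)  # ceiling division


def bucket_index(ttl_seconds, width):
    """Index of the bucket that a record with this TTL falls into."""
    return ttl_seconds // width


# Range implied by the dump in post #1: 100 buckets of 20412176 seconds each.
ttl_range = 100 * 20412176

for n_buckets in (100, 10_000):
    width = bucket_width(ttl_range, n_buckets)
    thirty_days = 30 * 86400
    print(n_buckets, width, bucket_index(thirty_days, width))
```

With 100 buckets, a record with a 30-day TTL sits in bucket 0 along with everything else under roughly 236 days; with 10,000 buckets the same record lands in bucket 12, so short TTLs spread across several buckets instead of piling into the first one.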