Eviction on a memory namespace

Hello,

On a vanilla install of 3.12.1, evict-hist-buckets seems to be set to 100 on an in-memory namespace. This is not what the documentation says. Could it be specific to in-memory namespaces?

I tried to change it to 1000, both in the config file and from the command line. The configured value changed (asinfo -v "namespace/test" -l | grep evict shows the new value), but asinfo -v 'hist-dump:ns=test;hist=ttl' still dumps a histogram of 100 values.

Am I missing something ?

Thx

As far as I know, you can only have it output 100 buckets from the histogram… http://www.aerospike.com/docs/reference/info/#hist-dump

Seems to have been improved since 3.8.

While the histogram shows 100 buckets, eviction is computed using the finer granularity of evict-hist-buckets (10,000 by default, or whatever you set it to).

My cluster does not evict, and logs a warning like this one:

Jan 30 2017 02:36:21 GMT: WARNING (nsup): (thr_nsup.c:1043) {test} no records below eviction void-time 222541923 - threshold bucket 361, width 259 sec, count 686375 > target 530312 (0.5 pct)

How can I diagnose the issue? Is there any way to see the full histogram if asinfo does not display all of it?

I tried to increase the number of buckets, but

  • asinfo does not display more buckets
  • the problem was not solved

Have you gone over this? Eviction mechanisms in Aerospike

Yes.

I have no way to check whether the bucket size option works (asinfo does not output more than 100 buckets, whatever the value).

Increasing evict-tenths-pct does not seem to start eviction.

You have an in-memory namespace, correct? A few questions:

  • How many nodes are in the cluster, and what is the replication factor for this namespace?
  • What is the memory high-water mark set to for this namespace on this node?
  • What is the RAM memory size of this namespace on this node?
  • What is the total number of records (master + replicas) in this namespace on this node?
  • What is evict-tenths-pct set to for this namespace on this node?

That error is from Jan 30, 2017, but your version is 3.12.1, which was released May 11, 2017. Perhaps I am missing something. Can you confirm the version and post log output from this version? Can you also post the output of the asinfo hist-dump for ttl (100 buckets) for this namespace on this node?

My cluster is 3 nodes, each with 16 CPUs and 32 GB of RAM.

Namespace config:

$ asinfo -v "get-config:context=namespace;id=redis"
repl-factor=2;memory-size=18439208960;default-ttl=3600;enable-xdr=false;sets-enable-xdr=true;ns-forward-xdr-writes=false;allow-nonxdr-writes=true;allow-xdr-writes=true;{redis}-read-hist-track-back=300;{redis}-read-hist-track-slice=10;{redis}-read-hist-track-thresholds=1,8,64;{redis}-query-hist-track-back=300;{redis}-query-hist-track-slice=10;{redis}-query-hist-track-thresholds=1,8,64;{redis}-udf-hist-track-back=300;{redis}-udf-hist-track-slice=10;{redis}-udf-hist-track-thresholds=1,8,64;{redis}-write-hist-track-back=300;{redis}-write-hist-track-slice=10;{redis}-write-hist-track-thresholds=1,8,64;cold-start-evict-ttl=4294967295;conflict-resolution-policy=generation;data-in-index=false;disallow-null-setname=false;enable-benchmarks-batch-sub=false;enable-benchmarks-read=false;enable-benchmarks-udf=false;enable-benchmarks-udf-sub=false;enable-benchmarks-write=false;enable-hist-proxy=false;evict-hist-buckets=10000;evict-tenths-pct=5;high-water-disk-pct=50;high-water-memory-pct=85;ldt-enabled=false;ldt-gc-rate=0;ldt-page-size=8192;max-ttl=2678400;migrate-order=5;migrate-retransmit-ms=5000;migrate-sleep=1;obj-size-hist-max=100;partition-tree-locks=8;partition-tree-sprigs=64;read-consistency-level-override=one;single-bin=false;stop-writes-pct=98;tomb-raider-eligible-age=86400;tomb-raider-period=86400;write-commit-level-override=master;storage-engine=memory;sindex.num-partitions=32;geo2dsphere-within.strict=true;geo2dsphere-within.min-level=1;geo2dsphere-within.max-level=30;geo2dsphere-within.max-cells=12;geo2dsphere-within.level-mod=1;geo2dsphere-within.earth-radius-meters=6371000

Number of records on the node (master / replicas): 29,047,684 / 28,714,358

When I change the hwm to start eviction: asinfo -v "set-config:context=namespace;id=redis;high-water-memory-pct=70"

I get the following warning:

$ tail -f /var/log/aerospike/aerospike.log | grep thr_nsup
May 31 2017 01:37:56 GMT: INFO (nsup): (thr_nsup.c:1109) {redis} Records: 29136407, 0 0-vt, 611129(542999255) expired, 0(0) evicted, 0(0) set deletes. Evict ttl: 0. Waits: 0,0,85089. Total time: 119675 ms
May 31 2017 01:37:57 GMT: INFO (nsup): (thr_nsup.c:1400) nsup clear waits: 1617
May 31 2017 01:37:58 GMT: INFO (nsup): (thr_nsup.c:1181) {redis} nsup start
May 31 2017 01:38:13 GMT: WARNING (nsup): (thr_nsup.c:1057) {redis} no records below eviction void-time 233890678 - threshold bucket 0, width 9 sec, count 717063 > target 145709 (0.5 pct)
May 31 2017 01:40:00 GMT: INFO (nsup): (thr_nsup.c:1109) {redis} Records: 29141821, 0 0-vt, 620427(543619682) expired, 0(0) evicted, 0(0) set deletes. Evict ttl: 0. Waits: 0,0,86766. Total time: 121241 ms
May 31 2017 01:40:01 GMT: INFO (nsup): (thr_nsup.c:1400) nsup clear waits: 1635
May 31 2017 01:40:02 GMT: INFO (nsup): (thr_nsup.c:1181) {redis} nsup start
May 31 2017 01:40:17 GMT: WARNING (nsup): (thr_nsup.c:1057) {redis} no records below eviction void-time 233890802 - threshold bucket 0, width 9 sec, count 720330 > target 145685 (0.5 pct)
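
A quick arithmetic check on the warning above, as a sketch only. It assumes the target is the record count multiplied by evict-tenths-pct / 1000; that formula is inferred from the logged numbers, not taken from the server source, and the function name is hypothetical. Under that assumption, the target in the 01:38:13 warning matches the record count logged at the end of the same nsup cycle, and the count reported for threshold bucket 0 is already larger than the target.

```python
def eviction_target(total_records: int, evict_tenths_pct: int) -> int:
    """Records to evict, ASSUMING target = records * evict-tenths-pct / 1000."""
    return total_records * evict_tenths_pct // 1000


records = 29_141_821           # "Records:" value from the 01:40:00 INFO line
first_bucket_count = 717_063   # "count" from the 01:38:13 WARNING line

target = eviction_target(records, 5)   # evict-tenths-pct=5 means 0.5%
print(target)
print(first_bucket_count > target)     # True would explain "no records below eviction void-time"
```

If the assumption holds, bucket 0 alone holds several times the target, so there is no bucket below the threshold to evict.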

Eviction histogram

$ asinfo -v "hist-dump:ns=redis;hist=ttl"
redis:ttl=100,864,2600428,1265011,1353508,1621274,710872,356439,365494,376524,311931,231012,235641,226260,229141,227373,228915,231406,245566,241468,243034,246931,254193,262570,275493,303319,301944,290748,289144,299810,302100,300799,310149,301445,280763,248216,250259,155079,173719,233837,250389,259380,261085,262715,243071,256577,245353,234971,244574,249289,256382,253645,247494,247154,248927,247022,243189,251188,244630,241392,236438,233808,232191,254283,274978,281376,293655,318017,368619,397570,417689,452757,492711,549978,603988,701842,970276,64459,31436,28307,25963,25380,24727,25432,25734,25479,24617,24121,24712,24799,22767,25176,24396,24414,23210,26266,28708,29699,29449,28103,26132,27917;
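
For anyone reading a dump like the one above, here is a minimal Python sketch for splitting it into its parts. It assumes the layout is namespace:histogram=bucket count, bucket width in seconds, then the per-bucket record counts; that reading is inferred from this thread (100 buckets of 864 seconds), and parse_hist_dump is a hypothetical helper, not part of any Aerospike tool. The sample string is shortened, made-up data in the same layout.

```python
def parse_hist_dump(raw: str):
    """Split 'ns:hist=buckets,width,c0,c1,...;' into name, bucket count, width, counts."""
    name, _, values = raw.strip().rstrip(";").partition("=")
    fields = [int(v) for v in values.split(",")]
    num_buckets, width, counts = fields[0], fields[1], fields[2:]
    return name, num_buckets, width, counts


# Made-up sample with 4 buckets; paste real asinfo output here instead.
sample = "redis:ttl=4,864,10,20,30,40;"
name, num_buckets, width, counts = parse_hist_dump(sample)

print(name)                   # histogram identifier
print(num_buckets * width)    # ttl range covered by the histogram, in seconds
print(sum(counts))            # records counted in the histogram
```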

After changing the parameter with asinfo -v "set-config:context=namespace;id=redis;evict-tenths-pct=100", I am able to get some eviction.

The change from 0.5 to 100 seems huge. What do you think?

Great. Two questions. First, what data are you storing in the record, and what is its average size? Second, if your default ttl is 3600 seconds (1 hr), how is your ttl histogram showing records with ttl up to 100 x 864 = 86,400 seconds? Are you inserting records with ttl > default from the client?

We are using multiple sets in the namespace, so getting a clear view of the data is not easy. Is there any way to get that from monitoring? Yes, we are storing data with ttl > default ttl. Is that an issue?

Storing with ttl > default is fine; I was just trying to understand your ttl histogram vs. your namespace config. Regarding the size of the objects you are storing, I don't think you can get an objsz histogram for data in memory. You need to have some idea, from the client side, of the average size of the objects you are writing into your cluster: 100 bytes? 1 KB? 100 KB? Some granularity, so that a rough estimate can be made of what is happening.

First, evict-hist-buckets is 10,000 by default. The ttl histogram always shows 100 buckets, but computations for evictions are based on evict-hist-buckets. By setting it to 1000 you made things 10 times worse for your situation. (Post 3.8.3, think of the ttl histogram as a "bird's-eye view"!)

Second, when you say 0.5 to 100, you are comparing two different units. 0.5 is a percentage, which corresponds to evict-tenths-pct=5. evict-tenths-pct=100 means 100 x 0.1 (a tenth of a) percent = 10%. In this case, the server starts counting records in the buckets closest to expiration until the count reaches 10% of the total records. The bucket where that happens is the threshold bucket; the server then backs off the threshold bucket and evicts all buckets below it.
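
The threshold search described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the server's code: threshold_bucket is a hypothetical name, the total is taken as the sum of the histogram counts, and the strict "greater than" comparison is a guess based on the "count ... > target" wording of the warning in the log. The bucket counts are toy values.

```python
def threshold_bucket(counts, evict_tenths_pct):
    """Index of the first bucket whose running count exceeds the eviction target.

    Buckets below the returned index would be evicted; 0 means nothing is evicted.
    """
    total = sum(counts)
    target = total * evict_tenths_pct // 1000   # tenths of a percent of all records
    running = 0
    for index, count in enumerate(counts):
        running += count
        if running > target:
            return index
    return len(counts)


# Toy histogram, buckets ordered from closest to expiration to farthest.
toy_counts = [50, 50, 100, 800]

print(threshold_bucket(toy_counts, 5))     # 0.5%: the first bucket already exceeds the target
print(threshold_bucket(toy_counts, 100))   # 10%: buckets below the returned index get evicted
```

With the toy data, a 0.5% target is exceeded by the first bucket alone, so no bucket lies below the threshold, which is the same shape as the "threshold bucket 0" warning in this thread.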

Without going through the full calculation (not knowing the record size), try increasing evict-hist-buckets beyond 10,000 (the max is 10 million). When you set it to 1000, you make the buckets bigger; your threshold bucket may then be the first bucket itself, and nothing gets evicted. Hope that helps.
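
To make the effect of the bucket count concrete, here is a rough Python sketch of the bucket width for a few settings. It assumes the width is the ttl range divided by evict-hist-buckets, rounded up; the rounding is an assumption, and bucket_width is a hypothetical helper. The 86,400 second range comes from the 100 x 864 histogram posted in this thread, and with 10,000 buckets the result agrees with the "width 9 sec" in the logged warning.

```python
import math


def bucket_width(ttl_range_sec: int, evict_hist_buckets: int) -> int:
    """Seconds covered by one eviction bucket, ASSUMING the range is split evenly and rounded up."""
    return math.ceil(ttl_range_sec / evict_hist_buckets)


ttl_range = 100 * 864   # 86,400 s, the range covered by the posted ttl histogram

for buckets in (1_000, 10_000, 100_000):
    print(buckets, bucket_width(ttl_range, buckets))
```

Narrower buckets hold fewer records each, so the first bucket is less likely to exceed the eviction target on its own.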