Aerospike frees memory only after restart


#1

Hello everyone!

We have an Aerospike cluster with 3 nodes. Each node is configured this way:

service {
  user root
  group root

  paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1
  pidfile /run/aerospike/asd.pid

  batch-threads 20
  service-threads 20
  transaction-threads-per-queue 20
  transaction-queues 20
  proto-fd-max 64000
  nsup-period 30
  migrate-threads 10
}

logging {
  file /dev/stdout {
    context any info
  }
}

network {
  service {
    address any
    port 3000
    access-address 10.0.0.10
  }

  fabric {
    port 3001
  }

  info {
    port 3003
  }

  heartbeat {
    mode mesh
    port 3002
    interval 150
    timeout 10

    mesh-seed-address-port 10.0.0.10 3002
    mesh-seed-address-port 10.0.0.11 3002
    mesh-seed-address-port 10.0.0.12 3002
  }
}

namespace cache {
  memory-size 232067M

  high-water-memory-pct 65
  ldt-enabled true
  replication-factor 2
  max-ttl 2d
  high-water-disk-pct 65
  cold-start-evict-ttl 172800
  evict-hist-buckets 1000
  evict-tenths-pct 25
  default-ttl 1d

  storage-engine device {
    write-block-size 128K
    data-in-memory false
    scheduler-mode noop

    device /dev/sdc
    device /dev/sdd
    device /dev/sde
    device /dev/sdf
    device /dev/sdg
    device /dev/sdh
  }
}

The problem is that under load, memory usage sits near 50%. The delete queue size fluctuates between 10k and 80k, but memory is not being freed.

However, if I restart an Aerospike node, the memory is freed and usage drops to ~35%. After a few days it climbs back to 50-55%.

Why is the memory freed immediately when we restart a node, while Aerospike doesn't free it by itself?

Aerospike version: 3.14.0

hist-dump:
`cache:ttl=100,576,4367562,4492320,4599695,4431081,4329517,4167087,3897078,4172321,4123110,4113468,4014776,4260717,4273687,4196162,4144367,4301801,4199962,4233925,4190342,4189192,4402587,4455011,4330315,4355613,4593002,4325494,4721685,4794343,4520061,4533337,4685724,4615551,4877885,4729868,4686324,4549452,4634005,4636841,4383236,4449071,4570381,4867610,4596512,4317712,4083665,4279251,4127363,3683730,3547204,3347219,3071370,2674533,2764237,2750570,2329588,2444115,2570768,2397636,2443183,2042160,2032937,1997589,1964052,1912186,1961059,1917431,1906478,1872094,1852461,1797064,1919110,2005704,2088025,1972597,2187630,2245422,2383929,2412038,2516508,2543887,2416244,2598831,2488646,2549478,2377756,2623149,2535747,2901621,2850657,2834021,3049891,3084548,3255680,3342966,3309039,3384844,3920249,3981725,4493944,47755292;`
    Admin> show statistics like expir
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~cache Namespace Statistics~~~~~~~~~~~~~~~~~~~~~~~~~~
    NODE                 :   10.0.0.10:3000   10.0.0.11:3000   10.0.0.12:3000
    expired_objects      :   287917970        2640910366       2059579606
    non_expirable_objects:   0                0                0

Thanks!


#2

You can speed up the delete queue processing by decreasing nsup-delete-sleep. Keep in mind that being too aggressive can plug the write paths with nsup deletes.
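
If it helps, here is a minimal sketch of how that change might be applied at runtime. It assumes `nsup-delete-sleep` is a dynamic, service-context setting (in microseconds) on your 3.x build and that `asinfo` is on the PATH. The helper name `nsup_delete_sleep_cmd` and the value 50 are only illustrative, so check the configuration reference for your version before running anything.

```python
def nsup_delete_sleep_cmd(sleep_us, host="127.0.0.1", port=3000):
    """Build (but do not run) an asinfo command that sets nsup-delete-sleep on one node.

    Assumption: the parameter is dynamic in the service context on this server version.
    """
    if sleep_us < 0:
        raise ValueError("sleep_us must be non-negative")
    return [
        "asinfo", "-h", host, "-p", str(port),
        "-v", f"set-config:context=service;nsup-delete-sleep={sleep_us}",
    ]


if __name__ == "__main__":
    # Node addresses are taken from the posted config; 50 is an illustrative value.
    for node in ("10.0.0.10", "10.0.0.11", "10.0.0.12"):
        cmd = nsup_delete_sleep_cmd(50, host=node)
        # Only prints the command; pass `cmd` to subprocess.run(cmd, check=True) to apply it.
        print(" ".join(cmd))
```

Lower the value in small steps and watch write latency, since a dynamic change like this is not written back to the config file and is easy to revert.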

As for the memory, primary index arena memory isn’t freed; it will be reused if the primary index grows to need it.
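
To get a rough idea of how much of that memory the live records actually need, you could sum the TTL histogram and multiply by the per-record index cost. The sketch below assumes the `hist-dump` line has the form `namespace:hist=bucket_count,bucket_width,count_0,count_1,...;` and that each record costs 64 bytes in the primary index. The function names and the short sample line are made up for illustration; paste your real `hist-dump` output in its place.

```python
INDEX_ENTRY_BYTES = 64  # assumed primary index cost per record


def parse_hist_dump(line):
    """Split 'ns:hist=n_buckets,width,c0,c1,...;' into its parts (format is an assumption)."""
    name, _, values = line.strip().rstrip(";").partition("=")
    namespace, _, hist = name.partition(":")
    numbers = [int(v) for v in values.split(",")]
    n_buckets, width, counts = numbers[0], numbers[1], numbers[2:]
    return namespace, hist, n_buckets, width, counts


def index_bytes(record_count):
    """Estimate the primary index memory needed for record_count live records."""
    return record_count * INDEX_ENTRY_BYTES


if __name__ == "__main__":
    sample = "cache:ttl=4,576,10,20,30,40;"  # shortened, synthetic example line
    ns, hist, n_buckets, width, counts = parse_hist_dump(sample)
    total = sum(counts)
    print(ns, total, index_bytes(total))  # prints: cache 100 6400
```

If the estimate for the live records is well below what the node reports as used, the difference is likely arena space that was allocated at a past peak and is being held for reuse.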