Cluster RAM still in use after all records deleted

deletion
shared-memory

#1

After deleting all records from a namespace using a Lua script, some of the RAM remains in use. The amount has kept growing over multiple cycles of inserting and deleting the data, and now stands at 28.77% of the cluster RAM with no records in the namespace. (Note that this is a RAM-only cluster.)

Any idea what could be using this RAM and how to free it?

In case it’s helpful:

Admin> info namespace
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           Node   Namespace   Evictions    Master   Replica     Repl     Stop     HWM         Mem     Mem    HWM      Stop   
              .           .           .   Objects   Objects   Factor   Writes   Disk%        Used   Used%   Mem%   Writes%   
xx.xx.xx.133      test        164793379   0.000     0.000          2    false   50      26.502 GB      27     60        90   
ip-xx-xx-xx-134   test        171885213   0.000     0.000          2    false   50      26.616 GB      27     60        90   
ip-xx-xx-xx-135   test        197301796   0.000     0.000          2    false   50      28.467 GB      29     60        90   
ip-xx-xx-xx-141   test        205555948   0.000     0.000          2    false   50      32.387 GB      33     60        90   
ip-xx-xx-xx-142   test        174359610   0.000     0.000          2    false   50      27.715 GB      28     60        90   
ip-xx-xx-xx-143   test        184959949   0.000     0.000          2    false   50      29.988 GB      30     60        90   
ip-xx-xx-xx-144   test        172567894   0.000     0.000          2    false   50      25.761 GB      26     60        90   
ip-xx-xx-xx-145   test        217848027   0.000     0.000          2    false   50      33.157 GB      34     60        90   
ip-xx-xx-xx-148   test        184797334   0.000     0.000          2    false   50      29.500 GB      30     60        90   
ip-xx-xx-xx-149   test        197841518   0.000     0.000          2    false   50      30.480 GB      31     60        90   
ip-xx-xx-xx-152   test        160868327   0.000     0.000          2    false   50      28.735 GB      29     60        90   
ip-xx-xx-xx-153   test        217524873   0.000     0.000          2    false   50      33.055 GB      34     60        90   
ip-xx-xx-xx-154   test        172186108   0.000     0.000          2    false   50      27.754 GB      28     60        90   
ip-xx-xx-xx-156   test        187585417   0.000     0.000          2    false   50      28.164 GB      29     60        90   
ip-xx-xx-xx-157   test        214262368   0.000     0.000          2    false   50      32.220 GB      33     60        90

#2

It does seem the usage is slowly shrinking. The last node in that list, for example, is down to 31.943 GB from 32.220 GB. But it’s very slow. Too slow.


#3

If you are running an Enterprise release, the primary index is stored in shared memory and grows in 1 GiB arenas. This memory isn’t released back to the OS unless you perform a cold start.

If you run:

ipcs -m

The shared memory segments whose keys begin with 0xae are Aerospike primary index arena allocations.
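To make that check less manual, here is a minimal Python sketch that picks the 0xae segments out of `ipcs -m` output and totals their size. It assumes the usual Linux column layout (key, shmid, owner, perms, bytes, ...); the function name is made up for illustration, so adjust the column index if your `ipcs` prints something different.

```python
def aerospike_arena_segments(ipcs_output):
    """Return (key, size_in_bytes) for each segment whose key starts with 0xae.

    Assumes the Linux `ipcs -m` layout: key shmid owner perms bytes nattch status.
    """
    segments = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Header and separator lines do not start with a 0xae key, so they are skipped.
        if len(fields) >= 5 and fields[0].lower().startswith("0xae"):
            segments.append((fields[0], int(fields[4])))
    return segments


def total_arena_bytes(ipcs_output):
    """Sum the sizes of all 0xae segments found in the given ipcs output."""
    return sum(size for _key, size in aerospike_arena_segments(ipcs_output))
```

On a server node, the input could come from `subprocess.run(["ipcs", "-m"], capture_output=True, text=True).stdout`.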


#4

Thanks. I inserted a few million values and the RAM use didn’t increase, so this seems confirmed.

Still, it’s not great from a reporting perspective. Is there any way to see the amount of memory actually in use, as opposed to memory that is simply being held by the process?


#5

Sorry for the confusion. After inspecting the code, I can confirm that metric is “memory actually in use”. So we are back to your original question of why it didn’t go to 0.

Are you using LDTs or Secondary Indexes?

Can you run and provide the output for:

asadm -e "show stat like memory"

#6

After deleting everything in the test namespace, AMC shows Cluster RAM Usage at 468.34 GB (30.61%). It is decreasing over time though.

Here is the output of the command:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                    :   xx.xx.xx.133   ip-xx-xx-xx-134   ip-xx-xx-xx-135   ip-xx-xx-xx-141   ip-xx-xx-xx-142   ip-xx-xx-xx-143   ip-xx-xx-xx-144   ip-xx-xx-xx-145   ip-xx-xx-xx-148   ip-xx-xx-xx-149   ip-xx-xx-xx-152   ip-xx-xx-xx-153   ip-xx-xx-xx-154   ip-xx-xx-xx-156   ip-xx-xx-xx-157   
data-used-bytes-memory  :   0              0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 
free-pct-memory         :   72             72                69                65                71                69                72                65                69                68                69                65                71                70                66                
index-used-bytes-memory :   0              0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 
sindex-used-bytes-memory:   30072068030    30110931031       33146980241       37510756617       31327128358       33826243023       30022351873       37714551577       33295054033       34690038764       32989110178       37964214603       31411908698       32105142755       36664371189       
total-bytes-memory      :   109521666048   109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      109521666048      
used-bytes-memory       :   30072068030    30110931031       33146980241       37510756617       31327128358       33826243023       30022351873       37714551577       33295054033       34690038764       32989110178       37964214603       31411908698       32105142755       36664371189       

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Statistics~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                    :   xx.xx.xx.133   ip-xx-xx-xx-134   ip-xx-xx-xx-135   ip-xx-xx-xx-141   ip-xx-xx-xx-142   ip-xx-xx-xx-143   ip-xx-xx-xx-144   ip-xx-xx-xx-145   ip-xx-xx-xx-148   ip-xx-xx-xx-149   ip-xx-xx-xx-152   ip-xx-xx-xx-153   ip-xx-xx-xx-154   ip-xx-xx-xx-156   ip-xx-xx-xx-157   
data-used-bytes-memory  :   0              0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 
free-pct-memory         :   71             71                69                65                70                68                72                64                68                67                69                64                70                70                65                
high-water-memory-pct   :   60             60                60                60                60                60                60                60                60                60                60                60                60                60                60                
index-used-bytes-memory :   0              0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 0                 
memory-size             :   107374182400   107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      
sindex-used-bytes-memory:   30072068030    30110931031       33146980241       37510756617       31327128358       33826243023       30022351873       37714551577       33295054033       34690038764       32989110178       37964214603       31411908698       32105142755       36664371189       
total-bytes-memory      :   107374182400   107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      107374182400      
used-bytes-memory       :   30072068030    30110931031       33146980241       37510756617       31327128358       33826243023       30022351873       37714551577       33295054033       34690038764       32989110178       37964214603       31411908698       32105142755       36664371189       

So it does seem the memory is just being used by the index. How can I quickly free up the memory from the index? Do I need to drop and recreate the index? I suppose I could write an aql script to do that, if necessary.
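As a sanity check on that conclusion, the per-node sindex-used-bytes-memory values above can be summed and compared with the AMC figure. This is a small Python sketch using only the numbers from the output; it assumes AMC reports binary gigabytes and computes the percentage against the service-level total-bytes-memory.

```python
# sindex-used-bytes-memory per node, copied from the statistics above.
SINDEX_USED_BYTES = [
    30072068030, 30110931031, 33146980241, 37510756617, 31327128358,
    33826243023, 30022351873, 37714551577, 33295054033, 34690038764,
    32989110178, 37964214603, 31411908698, 32105142755, 36664371189,
]

# total-bytes-memory per node from the service statistics (same on every node).
TOTAL_BYTES_PER_NODE = 109521666048


def cluster_usage(used_per_node, total_per_node):
    """Return (used GiB, used percent) for the whole cluster."""
    used = sum(used_per_node)
    total = total_per_node * len(used_per_node)
    return used / 2**30, 100.0 * used / total


used_gib, used_pct = cluster_usage(SINDEX_USED_BYTES, TOTAL_BYTES_PER_NODE)
print(f"{used_gib:.2f} GiB used, {used_pct:.2f}% of cluster RAM")
```

That comes to roughly 468.3 GiB and 30.61%, in line with what AMC shows, so the secondary indexes account for essentially all of the reported usage.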

UPDATE: I tried dropping the index. Unfortunately it made the cluster unresponsive - half the nodes were showing as offline (intermittently) and asadm returned nothing. After 5-10 minutes it did come back, with much lower usage - but it’s still using 168.06 GB (10.98%). So maybe even dropping the indices isn’t a solution.

UPDATE2: One of the two indices still exists. I guess the cluster just failed to delete it. So I dropped it again, and this time it stuck. Next time I’ll have to make sure to drop indices one at a time. At least the memory use dropped quickly and within a minute had hit zero.


#7

It sounds like you are using secondary indexes. If this happens again, you can speed up garbage collection with the following commands:

  • Increase gc-max-units to increase the amount of memory cleaned up by garbage collection: asinfo -v "set-config:context=namespace;id=;gc-max-units=10000"

  • Decrease the time between garbage collection sessions: asinfo -v "set-config:context=namespace;id=;gc-period=500"
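If you want to script the two settings, a minimal Python sketch could build the asinfo invocations like this. The helper name is hypothetical, and it assumes the id field takes the namespace name (`test` in this thread); the values are the ones suggested above.

```python
import shlex


def gc_tuning_commands(namespace, gc_max_units=10000, gc_period=500):
    """Build asinfo argument lists for the two GC settings above.

    Assumption: `id=` is filled with the namespace name.
    """
    prefix = f"set-config:context=namespace;id={namespace}"
    return [
        ["asinfo", "-v", f"{prefix};gc-max-units={gc_max_units}"],
        ["asinfo", "-v", f"{prefix};gc-period={gc_period}"],
    ]


# Print shell-safe command lines; nothing is executed here.
for argv in gc_tuning_commands("test"):
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run` on a node where asinfo is installed; printing first makes it easy to review the commands before running them.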

What version of Aerospike are you using?

Thank you for your time,

-DM


#8

Thanks for the info. For now I think my best bet is to just drop and recreate the indices, since I want the namespace emptied as quickly as possible between test runs.

Ideally Aerospike would provide support for truncating namespaces.

The version I’m currently testing against is Community Edition 3.6.0.


#9

@Daniel_Siegmann_AOL,

Will you please check that CE version again and provide it?

Our Aerospike Management Console (AMC) has a 3.6.0 version (the latest is 3.6.1), but the Aerospike Server Community Edition’s latest version is 3.5.14.


#10

Sorry, that was the AMC version. We are using build 3.5.14.