FAQ: Does migration cause memory fragmentation when using a Data In Memory namespace?
Do migrations triggered by a cluster change cause memory to become fragmented when using Data In Memory for storage?
No, migrations generally don’t cause long term memory fragmentation. One of the advantages of jemalloc (which Aerospike uses for its memory allocation) is that it tries to avoid memory fragmentation. A large number of cluster changes may cause some short term fragmentation, but this should recover reasonably quickly. As is often the case, though, there are edge cases where memory fragmentation may persist.
To illustrate, consider the following allocation of memory, consisting of 5 objects (called 1 to 5), each taking up 2 record blocks in memory:
At this stage we have a heap efficiency of 100% (10 blocks of allocated memory, all in use).
If we now migrate out objects 2 and 4 to another node, we are left with the following (where X denotes space created by the removed objects):
At this stage we have heap efficiency of 60% (10 blocks of allocated memory, 6 in use).
If we now receive a 2 block object called 6 as a migration due to another node being rebooted, we get the following:
At this stage our heap efficiency has increased to 80% (10 blocks of allocated memory, 8 in use).
Now if the next object to be inserted is a larger 3 block object (called 7), it cannot fit into the 2 block gap between objects 3 and 5, so additional memory must be allocated to store it:
At this stage our heap efficiency is 84.6% (13 blocks of allocated memory, 11 in use).
If the node where object 6 lives now comes back online and we migrate out that data, we get the following:
At this stage our heap efficiency is 69.2% (13 blocks of allocated memory, 9 in use).
Next we receive another write of a 3 block object (called 8), which again does not fit into either 2 block gap, giving us the following:
At this stage our heap efficiency is 75% (16 blocks of allocated memory, 12 in use).
If we now migrate out object 3, we’ll have the following:
At this stage our heap efficiency is 62.5% (16 blocks of allocated memory, 10 in use).
And finally, if we receive an insert of an even bigger 4 block object (called 9), the gaps left by objects 2, 3 and 4 now form one contiguous space large enough to hold it, so it is stored in the memory that is already allocated:
Giving us a final heap efficiency of 87.5% (16 blocks of allocated memory, 14 in use).
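The walkthrough above can be replayed with a small Python sketch. This is a toy first-fit model written only to reproduce the block counts in this article, not a description of how jemalloc actually places allocations; the helper names (insert, remove, efficiency, walkthrough) are made up for illustration, and the model assumes that allocated memory never shrinks and that object 6 lands in the first free gap.

```python
FREE = "X"  # marker for a block that is allocated to the heap but not in use


def first_fit(heap, size):
    """Return the start index of the first run of `size` free blocks, or None."""
    run = 0
    for i, block in enumerate(heap):
        run = run + 1 if block == FREE else 0
        if run == size:
            return i - size + 1
    return None


def insert(heap, name, size):
    """Place an object in the first gap that fits, otherwise grow the heap."""
    start = first_fit(heap, size)
    if start is None:
        heap.extend([name] * size)  # assumption: new memory is added at the end
    else:
        heap[start:start + size] = [name] * size


def remove(heap, name):
    """Free an object's blocks; the heap keeps its size in this toy model."""
    for i, block in enumerate(heap):
        if block == name:
            heap[i] = FREE


def efficiency(heap):
    """Heap efficiency as a percentage: blocks in use / blocks allocated."""
    used = sum(1 for block in heap if block != FREE)
    return round(100.0 * used / len(heap), 1)


def walkthrough():
    """Replay the steps from the article; return (layout, efficiency) pairs."""
    heap, states = [], []

    def snapshot():
        states.append(("".join(heap), efficiency(heap)))

    for name in "12345":
        insert(heap, name, 2)
    snapshot()                               # five 2 block objects
    remove(heap, "2"); remove(heap, "4"); snapshot()  # migrate out 2 and 4
    insert(heap, "6", 2); snapshot()         # 2 block object arrives
    insert(heap, "7", 3); snapshot()         # 3 block object, no gap fits
    remove(heap, "6"); snapshot()            # object 6 migrates back out
    insert(heap, "8", 3); snapshot()         # another 3 block write
    remove(heap, "3"); snapshot()            # migrate out object 3
    insert(heap, "9", 4); snapshot()         # 4 block object reuses the gap
    return states


if __name__ == "__main__":
    for layout, pct in walkthrough():
        print(f"{layout:<16} {pct}%")
```

Each printed line shows the block layout (X marks a free block) followed by the heap efficiency at that step, which should match the percentages quoted in the text.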
The situation explained here isn’t limited to migrations. It can also be seen where records are repeatedly updated to make them larger, or where smaller records are often deleted and larger ones inserted.
The heap efficiency of a node can be seen by looking at the heap_efficiency_pct metric. This is also recorded in the Aerospike log file at 10 second intervals.
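As a rough sketch of how the metric could be checked from a script, the following Python assumes the node statistics have already been fetched as a single semicolon-separated key=value string (the shape an info "statistics" request typically returns). The function names and the sample values are made up for illustration; only the metric name heap_efficiency_pct comes from this article.

```python
def parse_info_stats(raw):
    """Split a 'key=value;key=value' statistics string into a dict of strings."""
    stats = {}
    for pair in raw.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            stats[key] = value
    return stats


def heap_efficiency(raw):
    """Return heap_efficiency_pct as an int, or None if the metric is absent."""
    value = parse_info_stats(raw).get("heap_efficiency_pct")
    return int(value) if value is not None else None


if __name__ == "__main__":
    # Hypothetical sample; real output contains many more metrics.
    sample = "cluster_size=3;heap_efficiency_pct=62;objects=1000"
    print(heap_efficiency(sample))
```

Returning None when the metric is missing lets the caller distinguish "not reported" from a genuinely low value.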
Use of Secondary Indexes may also result in additional memory fragmentation during migrations, causing heap_efficiency_pct to drop even further.
MEMORY FRAGMENTATION DATA-IN-MEMORY MIGRATION