Efficient Defrag (3.4.0)


#1

In our use case, data in the DB (in-memory plus disk persistence) lives only for a limited time (minutes to hours, about 1 hour on average) and is then completely deleted.

Aerospike defaults to "defrag-lwm-pct 50", which means a disk Block (1MB by default) becomes eligible for Defrag as soon as its valid data drops below 50%. The remaining data is then rewritten into a new Block (mixed with the new incoming data stream), which frees the previous Block for reuse. In our lab observations, the frequent Reads of the previous Blocks during Defrag cause heavy disk I/O.
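The eligibility rule described above can be sketched in a few lines of Python. This only illustrates the threshold check as this post describes it and is not Aerospike's actual code; `is_defrag_eligible` and its parameters are hypothetical names.

```python
BLOCK_SIZE = 1024 * 1024  # default Block size of 1MB, as stated in this post


def is_defrag_eligible(valid_bytes, defrag_lwm_pct=50, block_size=BLOCK_SIZE):
    """Return True when a Block's valid data is below defrag-lwm-pct.

    Hypothetical helper: models only the threshold rule described in this post.
    """
    valid_pct = 100.0 * valid_bytes / block_size
    return valid_pct < defrag_lwm_pct
```

With the default threshold, a Block holding 400KB of valid data would be eligible and one holding 600KB would not. With a threshold of 0 no Block ever qualifies, which matches the "disable Defrag" setting.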

This is inefficient for our use case, because our App will eventually delete that data anyway, which amounts to a natural "Defrag". The data in a Block may be deleted by our App just after Aerospike has defragged it, so that Defrag was wasted. We could set "defrag-lwm-pct 0" to disable Defrag, but that is not safe, because some data may stay in the DB for a long time, e.g. 6 hours. Blocks holding such data must still be defragged when "Avail Pct%" is low, to avoid a "Disk FULL" error.

We suggest that Aerospike track the timestamp of the last change (Delete) to each Block. When "Avail Pct%" is low, Blocks that are below "defrag-lwm-pct" and have the oldest last-change timestamps would be eligible for Defrag first. In other words, Defrag only Blocks that have been idle for a long time.

Also, accumulate the remaining data in a separate buffer before rewriting it into a new Block, instead of mixing it with the new incoming data stream. This avoids repeatedly defragging long-idle data.
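To make the suggestion concrete, here is a rough Python sketch of the two ideas: picking long-idle Blocks first when space is low, and collecting surviving records in their own buffer. It is a proposal sketch only, not how Aerospike works. The `Block` class, the function names, and the `min_avail_pct` and `min_idle_secs` thresholds are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    block_id: int
    valid_pct: float        # share of the Block still holding valid data
    last_delete_ts: float   # time of the most recent Delete in this Block
    records: list = field(default_factory=list)  # surviving records


def pick_blocks_to_defrag(blocks, avail_pct, now, defrag_lwm_pct=50,
                          min_avail_pct=20, min_idle_secs=3600):
    """Choose Blocks for Defrag, longest-idle first, only when space is low."""
    if avail_pct >= min_avail_pct:
        return []  # enough free space: let the App's own Deletes do the work
    candidates = [
        b for b in blocks
        if b.valid_pct < defrag_lwm_pct
        and now - b.last_delete_ts >= min_idle_secs
    ]
    # Oldest last-Delete timestamp first, i.e. the longest-idle Blocks.
    return sorted(candidates, key=lambda b: b.last_delete_ts)


def defrag(selected, defrag_buffer):
    """Move surviving records into a dedicated buffer, apart from new writes."""
    for block in selected:
        defrag_buffer.extend(block.records)
        block.records = []
        block.valid_pct = 0.0  # the Block is now empty and reusable
    return defrag_buffer
```

Keeping `defrag_buffer` separate from the incoming write stream means long-lived records end up together in the same new Blocks, so they are less likely to be rewritten again when short-lived neighbours are deleted.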


#2

Hi Hanson, thank you for your feedback. We have discussed this with our engineers. We understand that the current defrag algorithm may not be ideal for all edge cases, and we will consider your particular case in future iterations.


#3

Since 3.4.0, defrag-queue-min has been available, which allows the operator to force some delay before a block is defragmented.
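As a rough illustration of how a minimum queue depth introduces that delay, here is a small Python model. It is a sketch under assumptions, not the server's code: `drain_defrag_queue` is a hypothetical name, and the exact comparison (processing only while more than `defrag_queue_min` blocks are waiting) is assumed for the example.

```python
from collections import deque


def drain_defrag_queue(queue, defrag_queue_min):
    """Defragment the oldest queued blocks, keeping a backlog in the queue.

    Assumption: blocks are processed only while more than `defrag_queue_min`
    of them are waiting; 0 means no delay.
    """
    processed = []
    while queue and len(queue) > defrag_queue_min:
        processed.append(queue.popleft())  # oldest eligible block first
    return processed
```

In this model, blocks held back in the queue get extra time for their remaining records to be deleted by the application before any defrag read is spent on them.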

Since 3.6.1, the defrag algorithm no longer mixes new records from clients or replicas with records from defragmented blocks.