Hi all, I use Aerospike Community Edition 3.8.3. I set a TTL on some records (persisted to disk), and they will expire in the near future. This temporary data is large and takes up a lot of disk space, so I expect the records to be deleted from disk promptly once they expire, freeing the space they occupied.
I want to know: when does Aerospike evict these expired records from my disk? At the next cold start, or when disk usage reaches the high-water mark? Is it possible for these expired records to reappear after code start? Do I need to worry about these expired records eating up all my disk? I can make sure they expire in time and never exceed my disk capacity, but I don’t know whether they will be evicted in time.
When does Aerospike evict these expired records from my disk?
Eviction and expiration are very different things. Expiration is when a record reaches its specified TTL and gets deleted. Eviction happens when you do not have enough capacity to handle your current load: when you breach the HWM, Aerospike “evicts” the records closest to expiration, which is basically a premature expiration to protect your capacity.
Records are expired and deleted in the nsup cycle, which runs frequently. See the FAQ “What are Expiration, Eviction and Stop-Writes?”
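To make the distinction concrete, here is a toy Python model. It is not Aerospike's implementation: it assumes each record carries an absolute expiry timestamp (called `void_time` here), that an nsup-like pass drops anything past that time, and that eviction then removes the records closest to expiry until the record count is back under a high-water mark. The names `Record`, `nsup_pass`, and `hwm_records` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    void_time: int  # absolute expiry time in seconds (hypothetical field name)


def nsup_pass(records, now, hwm_records):
    """Toy model of one cleanup pass; returns (kept, expired, evicted)."""
    # Expiration: the record's own TTL has run out.
    expired = [r for r in records if r.void_time <= now]
    live = [r for r in records if r.void_time > now]

    # Eviction: still over the high-water mark, so remove the records
    # closest to expiry even though their TTL has not run out yet.
    live.sort(key=lambda r: r.void_time)
    excess = max(0, len(live) - hwm_records)
    evicted, kept = live[:excess], live[excess:]
    return kept, expired, evicted


records = [Record("a", 100), Record("b", 200), Record("c", 300), Record("d", 400)]
kept, expired, evicted = nsup_pass(records, now=150, hwm_records=2)
print([r.key for r in expired])  # ['a']
print([r.key for r in evicted])  # ['b']
print([r.key for r in kept])     # ['c', 'd']
```

In this model, record `a` is removed because its TTL ran out, while `b` is removed early only because the namespace is over capacity.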
At the next cold start, or when disk usage reaches the high-water mark?
Evictions happen at the HWM. At cold start, it depends on how your namespace is configured. If it’s a RAM-based namespace, a cold start means the server starts with no records at all, regardless of TTL.
Is it possible for these expired records to reappear after code start?
I assume you mean cold start here. If you are persisting data to disk, check whether you have “cold-start-empty true” in your namespace configuration. If you don’t, the data will be scanned off the disk and brought back into the cluster. How to handle maintenance depends on your use case: can you tolerate zombie records, are you very sensitive to data integrity, do you have a multi-node cluster, do you have the capacity for a (1/(n-1) + 1) * max% increase in data on the other nodes, etc.
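The headroom arithmetic can be sketched as follows, assuming data is spread evenly across nodes and reading the factor as 1/(n-1) + 1, that is, the lost node's share divided among the n-1 survivors. The function name and the sample numbers are illustrative only.

```python
def usage_after_node_loss(n_nodes: int, usage_pct: float) -> float:
    """Per-node usage after one of n evenly loaded nodes drops out."""
    if n_nodes < 2:
        raise ValueError("need at least two nodes to absorb a failure")
    # The lost node's share is split across the remaining n-1 nodes,
    # so each survivor grows by a factor of 1/(n-1) + 1 = n/(n-1).
    return usage_pct * (1 / (n_nodes - 1) + 1)


# Starting from 40% usage per node:
for n in (2, 3, 5):
    print(n, round(usage_after_node_loss(n, 40.0), 1))
# 2 80.0
# 3 60.0
# 5 50.0
```

The smaller the cluster, the more spare capacity each node needs in order to take a node out for maintenance.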
Do I need to worry about these expired records eating up all my disk?
If you do capacity planning correctly, no. Expired records and other deleted or rewritten records are queued for cleanup later as part of a “defrag” cycle. Aerospike recommends keeping disk usage below 50% to leave room for defrag to occur. If you are using a memory-based namespace, you do not need to worry about defrag. See the Defragmentation documentation.
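As a rough illustration of why dead space is reclaimed later rather than immediately, here is a toy sketch of threshold-based defrag selection. It is a simplification, not Aerospike's code: it assumes storage is divided into write blocks and that a block becomes a defrag candidate once its live data falls below a low-water percentage. Aerospike exposes a `defrag-lwm-pct` setting for this kind of threshold; the value 50 and the function name below are assumptions for the example.

```python
def blocks_to_defrag(block_used_pct, lwm_pct=50):
    """Return indexes of write blocks whose live data is below lwm_pct."""
    # A block fills with dead space as its records expire, are deleted,
    # or are rewritten elsewhere. Once little live data remains, the
    # survivors can be copied out and the whole block reclaimed.
    return [i for i, used in enumerate(block_used_pct) if used < lwm_pct]


# Live-data percentage of five hypothetical write blocks.
blocks = [95, 40, 10, 70, 49]
print(blocks_to_defrag(blocks))  # [1, 2, 4]
```

Space held by an expired record is therefore only returned once the block it sits in is defragmented, which is why some free headroom on the device is needed.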
I can make sure they expire in time and never exceed my disk capacity, but I don’t know whether they will be evicted in time.
Can you elaborate on what the question is here? Are you asking whether eviction happens fast enough? In a perfect world, with sufficient capacity planning, you shouldn’t have evictions at all.
This is true for deleted records, but not for evicted or expired records. When evictions occur, the eviction depth is stored in the drive header. During a cold start, evicted and expired records are not indexed.
I’m not sure that’s entirely true, @kporter. If that were the case, expiration would be preferable to deletion. When my company experienced the zombie record issue, one item for discussion was touching the records with a low TTL instead of deleting them. I think the final consensus was that both can come back as zombie records.
@Albot the expired/evicted record isn’t the version that comes back in your scenario. As I said, an expired or evicted record will not be indexed on cold start; however, if a prior version of that record is found further along in the disk scan and it isn’t expired or evicted, then that version will be indexed.
Using expiration and eviction typically does yield a better solution, but to be rigorous it requires that TTLs never decrease. These solutions are better because they avoid tombstone management overhead and do not require you to write your own eviction-like routine.
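The cold-start behavior discussed in this thread can be illustrated with a toy model. It is not Aerospike's actual scan logic: it assumes the disk still holds several copies of one key, each with a generation and an absolute expiry time, and that a cold start indexes the highest-generation copy that has not expired. All names here are hypothetical.

```python
from typing import NamedTuple, Optional


class DiskCopy(NamedTuple):
    generation: int  # higher means written later
    void_time: int   # absolute expiry time in seconds


def cold_start_pick(copies, now) -> Optional[DiskCopy]:
    """Toy model: index the newest copy that is not yet expired."""
    # Expired copies are skipped entirely, so they cannot shadow
    # an older copy of the same key that is still unexpired.
    live = [c for c in copies if c.void_time > now]
    return max(live, key=lambda c: c.generation) if live else None


now = 1_000

# TTL was lowered on the last write: the newest copy has expired, but an
# older copy with a longer TTL is still on disk and comes back.
shrinking_ttl = [DiskCopy(1, 5_000), DiskCopy(2, 500)]
print(cold_start_pick(shrinking_ttl, now))  # DiskCopy(generation=1, void_time=5000)

# TTLs never decrease: once the newest copy has expired, so has every older one.
growing_ttl = [DiskCopy(1, 300), DiskCopy(2, 500)]
print(cold_start_pick(growing_ttl, now))  # None
```

The first case is the zombie scenario: touching a record with a low TTL does not help if an earlier copy with a later expiry is still on disk.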