Hello. After a server restart, 'zombie records' can reappear — for example, because we deleted a whole set. How can we verify that all records scheduled for deletion have actually been erased from the SSD? Aerospike 3.3.21
Aerospike never deletes data from the disks.
The simplified model:
A write block (wblock) is written and contains a number of records.
Over time, records are updated, expired, evicted, or deleted.
- The data is never actually removed from the wblock; the index entry is either updated to point to the new location (for an update) or removed from the index (for a delete).
Eventually the used size of the wblock drops below defrag-lwm-pct (default 50%), which places a reference to the wblock onto the defrag queue.
When processed by the defrag thread, the surviving records in the wblock are staged to be written to a new wblock.
When the used size of a wblock reaches 0, it is placed onto the free queue, where it can be overwritten by a new write.
- Until it is overwritten, all the data that existed there still exists; it is just no longer indexed.
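The lifecycle above can be sketched in a few lines of Python. This is purely illustrative — the classes and queue handling are hypothetical, not Aerospike's implementation; only the constant `DEFRAG_LWM_PCT` mirrors the real defrag-lwm-pct option.

```python
from collections import deque

DEFRAG_LWM_PCT = 50  # mirrors the defrag-lwm-pct default of 50%

class WBlock:
    """Toy write block: a fixed-capacity container of records."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.records = {}  # key -> value

    def used_pct(self):
        return 100 * len(self.records) / self.capacity

defrag_queue = deque()  # wblocks eligible for defragmentation
free_queue = deque()    # empty wblocks waiting to be overwritten

def delete_record(wblock, key):
    # A delete only drops the index entry; the bytes stay in the wblock.
    wblock.records.pop(key)
    if wblock.used_pct() < DEFRAG_LWM_PCT and wblock not in defrag_queue:
        defrag_queue.append(wblock)

def defrag(wblock, new_wblock):
    # Surviving records are rewritten to a new wblock; once the old
    # block's used size hits 0 it goes on the free queue. Its old data
    # remains on "disk" until something overwrites it.
    new_wblock.records.update(wblock.records)
    wblock.records.clear()
    free_queue.append(wblock)
```

Note that a deleted record's data is never touched by `delete_record`; in the real storage layer this is exactly why a cold restart can resurrect it.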
At any time there may be multiple copies of records scattered across the storage layer. These copies may be in wblocks not yet eligible for defragmentation, in wblocks eligible but not yet defragmented, or on the free queue waiting to be overwritten. If the Aerospike daemon restarts and is either configured with data-in-memory or running the community edition (which doesn't persist the index to shared memory), the index must be rebuilt from disk. Since the data isn't deleted, these records will return.
One solution to this problem is to rely on replication within the cluster and always start nodes empty. To do this, use the cold-start-empty configuration option. In the event of a full cluster failure (or a failure exceeding the replication factor), the data (and the zombie data) can still be recovered by disabling this parameter in the configuration and starting the servers again.
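As a minimal sketch, cold-start-empty sits in the namespace's storage-engine stanza; the namespace name and device path below are placeholders:

```
namespace test {
    replication-factor 2
    storage-engine device {
        device /dev/sdb          # placeholder device path
        cold-start-empty true    # start empty; repopulate via migration from replicas
    }
}
```

With this set, a restarted node ignores the records on its device and receives current data from the rest of the cluster, so pre-delete copies on disk are never re-indexed.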
I missed it. Thanks a lot