Solution - Durable delete tombstones never removed in system where load is extremely low

Problem Description

When a record is deleted durably, a tombstone is written as a marker for that record. To prevent resource issues, tombstones are periodically removed by a process called the tomb raider. In a system with extremely low write load, the tomb raider can be seen running, yet tombstones are not removed from the disk or the index. The following log entries show the tomb raider running:

Jun 24 2019 12:54:09 GMT+0700: INFO (info): (ticker.c:420) {tst} tombstones: all 2513747 master 1273883 prole 1239864 non-replica 0
Jun 24 2019 12:54:10 GMT+0700: INFO (drv_ssd): (drv_ssd_ee.c:1158) {tst} tomb raider start - marking cenotaphs ...
Jun 24 2019 12:54:10 GMT+0700: INFO (drv_ssd): (drv_ssd_ee.c:1178) {tst} tomb raider detecting cenotaphs ...
Jun 24 2019 12:54:10 GMT+0700: INFO (drv_ssd): (drv_ssd_ee.c:958) {tst} tomb raider reading /opt/tpa/datastore/tst/tst.data ...
Jun 24 2019 12:54:19 GMT+0700: INFO (info): (ticker.c:420) {tst} tombstones: all 2513747 master 1273883 prole 1239864 non-replica 0

However, logs from a few minutes later show that the number of tombstones on the system has not changed:

Jun 24 2019 12:57:09 GMT+0700: INFO (info): (ticker.c:420) {tst} tombstones: all 2513747 master 1273883 prole 1239864 non-replica 0
Jun 24 2019 12:57:19 GMT+0700: INFO (info): (ticker.c:420) {tst} tombstones: all 2513747 master 1273883 prole 1239864 non-replica 0
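
A quick way to confirm that the count is flat is to extract the tombstone totals from the ticker lines and compare them. The following is a minimal sketch in Python, assuming the log line layout shown in the excerpts above; the function names are illustrative and not part of any Aerospike tooling.

```python
import re

# Matches the ticker line layout seen in the excerpts above (an assumption:
# other server versions may format this line differently).
TICKER = re.compile(
    r"^(?P<ts>.+?): INFO \(info\): \(ticker\.c:\d+\) \{(?P<ns>[^}]+)\} "
    r"tombstones: all (?P<all>\d+) master (?P<master>\d+) "
    r"prole (?P<prole>\d+) non-replica (?P<nonrep>\d+)"
)


def tombstone_counts(lines, namespace):
    """Return [(timestamp, total_tombstones), ...] for one namespace."""
    samples = []
    for line in lines:
        match = TICKER.match(line)
        if match and match.group("ns") == namespace:
            samples.append((match.group("ts"), int(match.group("all"))))
    return samples


def count_changed(samples):
    """True if the total tombstone count differs between any two samples."""
    return len({count for _, count in samples}) > 1


log_lines = [
    "Jun 24 2019 12:54:10 GMT+0700: INFO (drv_ssd): (drv_ssd_ee.c:1158) {tst} tomb raider start - marking cenotaphs ...",
    "Jun 24 2019 12:57:09 GMT+0700: INFO (info): (ticker.c:420) {tst} tombstones: all 2513747 master 1273883 prole 1239864 non-replica 0",
    "Jun 24 2019 12:57:19 GMT+0700: INFO (info): (ticker.c:420) {tst} tombstones: all 2513747 master 1273883 prole 1239864 non-replica 0",
]

samples = tombstone_counts(log_lines, "tst")
print(samples)
print("count changed:", count_changed(samples))
```

Run against a longer window of the log, a result of `False` over many tomb raider cycles matches the symptom described here.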

Why are tombstones not being cleaned up?

Explanation

The answer lies not in the tomb raider settings but in the overall lack of write load on the system. With no load (writes, deletes and updates), no blocks fall below defrag-lwm-pct, so nothing is sent for defragmentation. Deleted records therefore remain in blocks alongside active records. Under more typical load these blocks would eventually fall below defrag-lwm-pct and be sent for defragmentation, which would allow the older records to be overwritten. When load is low this does not happen.
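
The threshold behaviour can be illustrated with a simplified model. The sketch below is not the server's implementation: it assumes that a block's occupancy is the share of the block still holding live records, and it uses a hypothetical threshold of 50 purely for illustration. The block and record sizes are also made up.

```python
def used_pct(live_bytes, block_size):
    """Percentage of the block occupied by live (non-deleted) records."""
    return 100.0 * live_bytes / block_size


def eligible_for_defrag(live_bytes, block_size, defrag_lwm_pct):
    """A block qualifies only once its occupancy falls below the threshold."""
    return used_pct(live_bytes, block_size) < defrag_lwm_pct


BLOCK_SIZE = 1024 * 1024   # hypothetical 1 MiB block
RECORD_SIZE = 128 * 1024   # hypothetical record size: 8 records fill a block
LWM_PCT = 50               # illustrative value for defrag-lwm-pct

# Three of eight records deleted: the block is 62.5% live, so it is not
# defragmented and the deleted copies stay on disk.
after_three_deletes = eligible_for_defrag(5 * RECORD_SIZE, BLOCK_SIZE, LWM_PCT)

# Two more deletes (which need further load): 37.5% live, now eligible.
after_five_deletes = eligible_for_defrag(3 * RECORD_SIZE, BLOCK_SIZE, LWM_PCT)

print(after_three_deletes, after_five_deletes)
```

In this model the first block never becomes eligible unless further writes or deletes reduce its occupancy, which is the situation on a near-idle system.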

The tomb raider removes redundant tombstones by first scanning the index and marking tombstones as cenotaphs (candidates for removal). It then scans the disk: if a block contains a record that corresponds to a tombstone, that tombstone's cenotaph mark is cleared and it will not be removed. At the end of the pass, any tombstones still marked as cenotaphs, meaning no corresponding record was found, are removed (provided the eligible age has passed and there are no ongoing migrations). It follows that when deleted records remain on disk, because low load brings a correspondingly low rate of defragmentation, the number of tombstones will not drop.
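
The mark, scan and remove cycle can be sketched as a toy model. This is only an illustration of the logic described above, not the server code: the index is a plain dictionary, each disk block is a list of the keys whose older record copies it still holds, and all names (`tomb_raid`, `deleted_at`, `eligible_age`) are hypothetical.

```python
def tomb_raid(index, disk_blocks, now, eligible_age, migrating=False):
    """Run one simplified tomb raider pass and return the removed keys."""
    # Step 1: mark every tombstone in the index as a cenotaph.
    cenotaphs = {key for key, entry in index.items() if entry["tombstone"]}

    # Step 2: scan the disk; a surviving record copy clears the mark.
    for block in disk_blocks:
        for key in block:
            cenotaphs.discard(key)

    # Step 3: remove what is still marked, if old enough and not migrating.
    if migrating:
        return []
    removed = sorted(
        key for key in cenotaphs
        if now - index[key]["deleted_at"] >= eligible_age
    )
    for key in removed:
        del index[key]
    return removed


index = {
    "A": {"tombstone": True, "deleted_at": 0},     # deleted, old copy on disk
    "B": {"tombstone": True, "deleted_at": 0},     # deleted, no copy on disk
    "C": {"tombstone": False, "deleted_at": None}, # live record
}

# Low load: the block holding A's old copy is never defragmented.
first_pass = tomb_raid(index, [["A", "C"]], now=100, eligible_age=10)

# After defragmentation rewrites the block, A's old copy is gone.
second_pass = tomb_raid(index, [["C"]], now=100, eligible_age=10)

print(first_pass, second_pass)
```

In the first pass only the tombstone with no record left on disk is removed; the other survives every pass until defragmentation clears the old copy.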

Solution

There is nothing to fix; this behaviour is expected. When the system load increases, blocks will start falling below the occupancy level specified by defrag-lwm-pct and will be sent for defragmentation. As this happens, the tomb raider will remove tombstones that are no longer required and the count will drop. Tombstones cannot be removed while the records they mark remain on disk because, if they were, those records could return to the database on a cold start.

Notes

Keywords

TOMBSTONE TOMB RAIDER DURABLE DELETE

Timestamp

25 June 2019

© 2015 Copyright Aerospike, Inc. | All rights reserved. Creators of the Aerospike Database.