FAQ: What are Expiration, Eviction and Stop-Writes in Aerospike?
Aerospike has several mechanisms that protect a cluster, both in day-to-day usage and under high load. It is important to understand what they are and when they are triggered. The three discussed here are expiration, eviction and stop-writes.
Expiration is a normal process within Aerospike that applies when a record's TTL is set to a positive integer (records with no TTL, or a TTL of 0 or -1, are not subject to expiration or eviction). Unless the client sends an override, the TTL used for a write is the default TTL for the namespace. When a record is written, the server adds the TTL (either the default or the value specified by the client) to the current time to get the expiration time. When a record reaches its expiration time, the namespace supervisor (NSUP) thread expires it, meaning it is removed from the cluster. An update or a touch from a client resets the expiration time as if it were a new write. Reads do not affect the expiration time. NSUP manages both evictions and expirations.
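The expiration arithmetic described above can be sketched in a few lines of Python. This is a simplified model and not server code: the function names and the use of None for "never expires" are assumptions made for illustration, and times are plain seconds.

```python
from typing import Optional

DAY = 86_400  # seconds in a day


def expiration_time(now: int, default_ttl: int,
                    client_ttl: Optional[int] = None) -> Optional[int]:
    """Return the absolute expiration time in seconds, or None if the
    record is not subject to expiration.

    client_ttl=None models a write where the client sends no override,
    so the namespace default TTL is used.
    """
    ttl = default_ttl if client_ttl is None else client_ttl
    if ttl <= 0:
        # Per the text, only a positive TTL makes a record eligible
        # for expiration (and eviction).
        return None
    return now + ttl


def is_expired(now: int, expires_at: Optional[int]) -> bool:
    """A record is due for removal once its expiration time is reached."""
    return expires_at is not None and now >= expires_at


# A write at t=1000 with no client override and a 30-day namespace default.
exp = expiration_time(now=1000, default_ttl=30 * DAY)
print(exp - 1000 == 30 * DAY)             # True
print(is_expired(1000 + 29 * DAY, exp))   # False
print(is_expired(1000 + 30 * DAY, exp))   # True
```

Note that reads never call expiration_time in this model, which mirrors the statement that reads do not affect the expiration time.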
NSUP does not run all the time; instead it runs periodically. The interval at which NSUP wakes up is set by nsup-period.
It is possible to throttle NSUP by altering the time it sleeps between generating delete transactions. The parameter used to do this is nsup-delete-sleep. In configurations with multiple namespaces and a large number of records, it is not uncommon for NSUP to take longer than nsup-period to complete a cycle, especially if expirations, evictions or set-deletes are active.
It may be useful to throttle NSUP so that transaction queues are not overloaded with deletes generated by NSUP.
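A rough back-of-the-envelope estimate shows why a throttled NSUP cycle can outlast nsup-period. The sketch below assumes that nsup-delete-sleep is expressed in microseconds and is applied once per generated delete, and it ignores the time spent scanning the index. Both the helper names and the sample numbers are hypothetical.

```python
def min_cycle_seconds(deletes: int, delete_sleep_us: int) -> float:
    """Lower bound on one NSUP cycle: time spent sleeping between deletes.

    Assumes one sleep of delete_sleep_us microseconds per delete.
    """
    return deletes * delete_sleep_us / 1_000_000


def overruns_period(deletes: int, delete_sleep_us: int,
                    nsup_period_s: int) -> bool:
    """True if sleeping alone already takes longer than nsup-period."""
    return min_cycle_seconds(deletes, delete_sleep_us) > nsup_period_s


# Hypothetical numbers: 5 million records to delete in one cycle.
print(min_cycle_seconds(5_000_000, 100))      # 500.0
print(overruns_period(5_000_000, 100, 120))   # True
print(overruns_period(5_000_000, 10, 120))    # False
```

The trade-off is visible in the last two lines: a shorter sleep lets the cycle finish within the period, but it generates deletes faster and puts more pressure on the transaction queues.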
Special case with touch operation.
A touch operation will typically update / reset the TTL of the record. As of version 3.10.1, passing -2 as TTL for a touch or update transaction will leave the TTL unchanged. Here is one example to illustrate a touch operation where the client does not specify a TTL:
Let’s assume the default-ttl in the server config is set to 30D for the namespace.
Client writes a record with a 10D TTL.
5 days later, the client does a touch operation without setting a TTL.
The record now has a TTL of 30D, which is the default-ttl on the server side, and will expire 30 days from this point.
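The touch scenario above can be replayed with a small model of how the TTL is resolved. This is an illustrative sketch, not the server's implementation: new_expiration and TTL_DONT_UPDATE are hypothetical names, and time is counted in seconds from the original write.

```python
from typing import Optional

DAY = 86_400          # seconds in a day
TTL_DONT_UPDATE = -2  # per the text: leaves the TTL unchanged (3.10.1+)


def new_expiration(now: int, current_expiration: Optional[int],
                   default_ttl: int,
                   client_ttl: Optional[int] = None) -> Optional[int]:
    """Expiration time after a write, update or touch at time `now`.

    client_ttl=None models a transaction that does not specify a TTL.
    """
    if client_ttl == TTL_DONT_UPDATE:
        return current_expiration  # keep whatever the record already had
    ttl = default_ttl if client_ttl is None else client_ttl
    return now + ttl if ttl > 0 else None


default_ttl = 30 * DAY

# Day 0: client writes the record with a 10D TTL.
expires_at = new_expiration(0, None, default_ttl, client_ttl=10 * DAY)

# Day 5: touch without a TTL, so the namespace default applies.
after_touch = new_expiration(5 * DAY, expires_at, default_ttl)

# Day 5 alternative: touch with -2, so the original expiration is kept.
after_keep = new_expiration(5 * DAY, expires_at, default_ttl,
                            client_ttl=TTL_DONT_UPDATE)

print(expires_at // DAY, after_touch // DAY, after_keep // DAY)  # 10 35 10
```

In this model the plain touch moves the expiration from day 10 to day 35, while the touch with -2 leaves it at day 10.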
Eviction is when records are removed before their expiration time. Only records where TTL is set to a positive integer are affected by evictions. Eviction starts when a High Water Mark (either disk or memory) is breached. It is particularly important to understand the role of bucket width in eviction, as this explains which data will be evicted first.
Eviction and expiration are discussed in detail here.
A database hits stop-writes=true when either of the following conditions is met:
- stop-writes-pct is exceeded. This refers to RAM and stops the system from running out of memory. This parameter is active even when SSDs are used, as Aerospike always uses RAM for indexes.
- Available space falls below min-avail-pct. This refers to the minimum percentage of free blocks available for writing across all SSDs associated with a given namespace. It should never be lowered to less than 5%.
As the name suggests, stop-writes=true indicates that the database is in a read-only mode until both of the conditions listed above are false. Although the database will not accept client writes, deletes will still be processed. Replica writes and migration-related writes will also be processed.
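The behaviour under stop-writes can be summarised as a small decision function. The operation names below are hypothetical labels chosen for this sketch and do not correspond to any Aerospike API.

```python
from enum import Enum


class Op(Enum):
    READ = "read"
    CLIENT_WRITE = "client write"
    DELETE = "delete"
    REPLICA_WRITE = "replica write"
    MIGRATION_WRITE = "migration write"


def accepted(op: Op, stop_writes: bool) -> bool:
    """Model of which operations are processed.

    Under stop-writes only client writes are refused; reads, deletes,
    replica writes and migration-related writes still go through.
    """
    if not stop_writes:
        return True
    return op is not Op.CLIENT_WRITE


for op in Op:
    print(f"{op.value}: {accepted(op, stop_writes=True)}")
```

Allowing deletes to proceed matters in practice, since removing records is one way to bring usage back under the thresholds.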
It follows from the above that if the disk HWM is set sufficiently high, a database could hit stop-writes without evictions being triggered. This may be appropriate for a given use case, but the interaction between the points at which eviction and stop-writes are triggered should be considered carefully. Both of the parameters listed above can be changed dynamically.
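The interaction between the High Water Marks and the stop-writes conditions can be explored with a simple model. The field names, the threshold values and the exact comparison operators below are assumptions made for illustration; they are not taken from the server source.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    memory_used_pct: float   # RAM used by the namespace
    disk_used_pct: float     # disk space used by the namespace
    device_avail_pct: float  # free blocks available for writing


@dataclass
class Thresholds:
    hwm_memory_pct: float
    hwm_disk_pct: float
    stop_writes_pct: float
    min_avail_pct: float


def evicting(u: Usage, t: Thresholds) -> bool:
    """Eviction starts when either High Water Mark is breached."""
    return (u.memory_used_pct > t.hwm_memory_pct
            or u.disk_used_pct > t.hwm_disk_pct)


def stop_writes(u: Usage, t: Thresholds) -> bool:
    """Stop-writes is hit when either condition from the text is met."""
    return (u.memory_used_pct > t.stop_writes_pct
            or u.device_avail_pct < t.min_avail_pct)


# Hypothetical settings with a very high disk HWM.
t = Thresholds(hwm_memory_pct=60, hwm_disk_pct=95,
               stop_writes_pct=90, min_avail_pct=5)

# Memory is fine and disk usage is under the HWM, but few free blocks remain.
u = Usage(memory_used_pct=40, disk_used_pct=80, device_avail_pct=4)

print(evicting(u, t), stop_writes(u, t))  # False True
```

With these sample values the namespace refuses writes while eviction has not started, which is the situation the paragraph above warns about.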
Stop-writes are discussed in detail at the following link.
- Expirations are a normal part of Aerospike operation.
- Evictions are a way of managing database behaviour when nearing maximum capacity.
- Stop-writes are another tool for managing behaviour near maximum capacity.
- Only records where TTL is set to a positive integer can be evicted.
- It is possible for a database to be at stop-writes=true without evictions being triggered.
- The points at which a database starts evicting and enters stop-writes can be configured and should be derived from the use case.