I was researching the impact of list operations—such as updating, appending, and removing elements—but couldn’t reach a definitive conclusion.
1. Does appending to an existing list trigger a full list rewrite? From my understanding, Aerospike reads the list from SSD, appends the new element, and writes the updated list to a memory block, which eventually gets flushed to disk. The older data block is reclaimed when defragmentation happens. Is this correct?
2. Does removing an element by index follow the same process?
3. If a list element is a JSON object, does updating a field within that JSON trigger a full rewrite of the entire list?
Below is a sample code snippet I am using. All operations leverage ListOperation:
aerospikeClient.operate(
    null,
    new Key(aerospikeNamespace, SET_NAME, id),
    ListOperation.append(BIN_NAME, com.aerospike.client.Value.get(value)));
Aerospike is always a “copy-on-write” system, so even a simple change causes the whole record to be read from storage, updated in memory, then written back to storage in a new location. So yes, your point (1) is correct, and the same process occurs on ANY record modification, including your scenarios (2) and (3).
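To make the cost concrete, here is a minimal sketch of the read, modify, write-to-a-new-location cycle described above. It is a toy in-memory model, not Aerospike's storage engine: the class CopyOnWriteSketch, its "slots", and the counters are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a copy-on-write record store; not Aerospike's real storage engine. */
final class CopyOnWriteSketch {
    // Append-only "device": each slot holds one full record image.
    private final List<List<String>> device = new ArrayList<>();
    // Index: record key -> slot holding the current image.
    private final Map<String, Integer> index = new HashMap<>();
    // Slots whose contents are stale and waiting to be reclaimed.
    private final List<Integer> staleSlots = new ArrayList<>();
    // Total list elements written to the device across all updates.
    long elementsWritten = 0;

    /** Appends one element, but rewrites the whole record to a new slot. */
    void append(String key, String element) {
        Integer oldSlot = index.get(key);
        // 1. Read the full current record (empty if it does not exist yet).
        List<String> copy = oldSlot == null
                ? new ArrayList<>()
                : new ArrayList<>(device.get(oldSlot));
        // 2. Apply the change in memory.
        copy.add(element);
        // 3. Write the full new image to a fresh slot and repoint the index.
        device.add(copy);
        index.put(key, device.size() - 1);
        elementsWritten += copy.size();
        // 4. The old image is garbage until something like defrag reclaims it.
        if (oldSlot != null) {
            staleSlots.add(oldSlot);
        }
    }

    List<String> read(String key) {
        return device.get(index.get(key));
    }

    int staleSlotCount() {
        return staleSlots.size();
    }

    public static void main(String[] args) {
        CopyOnWriteSketch store = new CopyOnWriteSketch();
        for (int i = 0; i < 4; i++) {
            store.append("user1", "item" + i);
        }
        System.out.println(store.read("user1"));
        System.out.println(store.elementsWritten);  // 1 + 2 + 3 + 4 = 10
        System.out.println(store.staleSlotCount()); // 3 old images left behind
    }
}
```

Four one-element appends end up writing ten elements in this model, which is why the cost grows with the size of the list rather than the size of the change.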
CDT operations like the one you showed minimize network traffic between the client and server, since only the operation and its result cross the network rather than the whole list, but they do not reduce the load on the drive at all. This is typically only an issue for large lists or maps, and patterns like the Adaptive Map can help break up these large maps.
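As a rough illustration of why breaking up a large map helps, the sketch below compares one big record against the same entries spread over several records. It assumes a fixed number of buckets chosen by hashing the map key, which is a deliberately simpler scheme than the Adaptive Map library uses; ShardedMapSketch and shardFor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy comparison: one big map record versus the same entries spread over several records. */
final class ShardedMapSketch {
    // Each inner map stands in for one database record.
    private final List<Map<String, String>> shards = new ArrayList<>();
    // Total entries rewritten, assuming every update rewrites its whole record.
    long entriesRewritten = 0;

    ShardedMapSketch(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new HashMap<>());
        }
    }

    // Hypothetical routing rule: the hash of the map key picks the record.
    private int shardFor(String mapKey) {
        return Math.floorMod(mapKey.hashCode(), shards.size());
    }

    /** Puts one entry; only the owning record is counted as rewritten. */
    void put(String mapKey, String value) {
        Map<String, String> shard = shards.get(shardFor(mapKey));
        shard.put(mapKey, value);
        entriesRewritten += shard.size();
    }

    String get(String mapKey) {
        return shards.get(shardFor(mapKey)).get(mapKey);
    }

    public static void main(String[] args) {
        ShardedMapSketch single = new ShardedMapSketch(1);
        ShardedMapSketch sharded = new ShardedMapSketch(16);
        for (int i = 0; i < 1000; i++) {
            single.put("key" + i, "v");
            sharded.put("key" + i, "v");
        }
        System.out.println("single record: " + single.entriesRewritten);
        System.out.println("16 records:    " + sharded.entriesRewritten);
    }
}
```

The trade-off in a fixed-bucket scheme like this is that the bucket count has to be chosen up front, which is the problem an adaptive design tries to avoid.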
It seems that Adaptive Map is a better fit for our use case. However, I noticed that it hasn’t been commercialized yet. I wanted to check if we have a list of known bugs for it. Also, do we have any detailed documentation available? The README file doesn’t seem to cover everything, as I noticed while reviewing the code that some methods perform multiple reads based on certain conditions. If there’s any comprehensive documentation or a video explaining its internal workings, that would be helpful.
Yes, it has not been commercialized; support is on a community basis as time permits. However, people are using it in production (see this video for example). If you find any bugs or want enhancements, please raise them as GitHub issues on the repository. The developer of the library is still with Aerospike and active on our forums (obviously… )
The codebase is pretty old now; it still uses PredExps instead of Expressions, for example. With the release of version 8 of Aerospike there is now support for ACID transactions, and the code would benefit greatly from using these in some places, as it effectively implements its own locking system to split records.
One drawback of this is that it would require a commercial license for Aerospike and preclude using the CE (free) version. I’m curious which version of Aerospike you’re using?
You are correct – in some cases there are multiple reads. This happens only when the root block has split, in which case there is one read to get the split bitmap and then another to read the record which contains the map value you care about. The only documentation so far is in the README.md file and comments in the code. However, if you have questions I can probably answer them fairly quickly.
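The one-read versus two-read path can be sketched roughly as follows. This is only an illustration of the idea, not the library's code: the block numbering (children of block n are 2n+1 and 2n+2), the hash-bit routing, and every name in SplitLookupSketch are assumptions made for the example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the "one read normally, two reads after a split" lookup path. */
final class SplitLookupSketch {
    // Stand-in for the database: block number -> the map stored in that record.
    final Map<Integer, Map<String, String>> blocks = new HashMap<>();
    // Kept with the root record: bit n set means block n has split.
    final BitSet splitBitmap = new BitSet();
    // Counts simulated record reads.
    int reads = 0;

    // Assumed layout: follow split blocks downward, one hash bit per level.
    int blockFor(String mapKey) {
        int hash = mapKey.hashCode();
        int block = 0;
        for (int depth = 0; splitBitmap.get(block) && depth < 30; depth++) {
            block = 2 * block + 1 + ((hash >>> depth) & 1);
        }
        return block;
    }

    void put(String mapKey, String value) {
        blocks.computeIfAbsent(blockFor(mapKey), b -> new HashMap<>()).put(mapKey, value);
    }

    String get(String mapKey) {
        reads++; // read 1: the root record (data plus split bitmap)
        if (!splitBitmap.get(0)) {
            Map<String, String> root = blocks.get(0);
            return root == null ? null : root.get(mapKey);
        }
        int block = blockFor(mapKey);
        reads++; // read 2: the record that actually holds this key
        Map<String, String> data = blocks.get(block);
        return data == null ? null : data.get(mapKey);
    }

    /** Splits the root into blocks 1 and 2 (deeper splits are omitted here). */
    void splitRoot() {
        Map<String, String> root = blocks.remove(0);
        splitBitmap.set(0);
        if (root == null) {
            return;
        }
        for (Map.Entry<String, String> e : root.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        SplitLookupSketch map = new SplitLookupSketch();
        map.put("a", "1");
        map.put("b", "2");
        System.out.println(map.get("a") + " after " + map.reads + " read(s)"); // 1 after 1 read(s)
        map.splitRoot();
        map.reads = 0;
        System.out.println(map.get("b") + " after " + map.reads + " read(s)"); // 2 after 2 read(s)
    }
}
```

In this model the extra read only appears once the root has split, which matches the behaviour described above; how the real library encodes the bitmap and names the sub-records is best checked in its source.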