Aerospike has produced a tool, ASMT (Aerospike Shared Memory Tool), that allows the Aerospike primary index to be backed up to disk. This document describes its functionality and usage.

Why did Aerospike develop ASMT?

Aerospike developed ASMT to speed up planned restarts that would otherwise be cold starts because the underlying Linux instance itself must be restarted. Each Aerospike namespace has a number of shared memory segments. When the Aerospike node shuts down, these segments remain in shared memory, so if the instance itself keeps running and the asd process shuts down cleanly, the Aerospike node performs a fast restart (also referred to as a warm restart). The advantage of a warm restart is primarily speed. The alternative is a cold start, in which the index is rebuilt by reading data back from disk. The key disadvantage of a cold start is the time it takes. Other issues can also arise, such as cold start refragmentation, resurrection of deleted records in a non-durable delete environment or, in the worst case, cold start eviction. For these reasons it is always better to warm start when possible. In some circumstances a warm start is not possible, for example when the server itself is restarted and shared memory is lost, as happens during OS or kernel patching.

How does ASMT work?

ASMT works in either backup or restore mode. In backup mode, once Aerospike has been shut down completely, ASMT writes a copy of the shared memory segments to disk. On a clean shutdown, Aerospike tags the shared memory to let subsequent startups know that the memory is safe to use. Without that tag, ASMT will not back up the shared memory segments, as they may still be in use. In restore mode, the data files previously written by ASMT are read and written back into shared memory, which allows the Aerospike node to warm start.

Is it possible for ASMT to create a partial backup of a namespace?

No. ASMT is extremely conservative. Before copying any files to the filesystem, it calculates the space required by the shared memory (base, tree and arenas) and pre-allocates that space. If there is a failure midway through the copying process, ASMT removes all of the in-progress files so that an inadvertent partial restore is not possible. ASMT also fails if it finds pre-existing files with the same names as those it plans to use. If stopped during operation, ASMT does not remove any directories it has created.
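The all-or-nothing behaviour described above can be illustrated with a minimal Python sketch. This is not ASMT's code: the function name, the file names and the use of in-memory bytes in place of shared memory segments are all assumptions made for illustration.

```python
import os


def write_all_or_nothing(segments, out_dir):
    """Write every segment to out_dir, or leave no files behind.

    `segments` maps a (hypothetical) file name to its bytes.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = {name: os.path.join(out_dir, name) for name in segments}

    # Refuse to run if any target file already exists.
    for path in paths.values():
        if os.path.exists(path):
            raise FileExistsError(path)

    created = []
    try:
        # Size every file up front, before copying any data. truncate() only
        # sets the length; on Linux os.posix_fallocate() would reserve blocks.
        for name, data in segments.items():
            with open(paths[name], "xb") as f:
                created.append(paths[name])
                f.truncate(len(data))
        # Copy the data into the pre-sized files.
        for name, data in segments.items():
            with open(paths[name], "r+b") as f:
                f.write(data)
    except Exception:
        # On any failure remove the in-progress files; the directory is kept.
        for path in created:
            if os.path.exists(path):
                os.remove(path)
        raise
    return list(paths.values())
```

Sizing every file before writing any data means a shortage of space or a name clash is detected before a single byte is copied, and the cleanup step ensures a failed run never leaves files that look like a usable backup.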

Is there any verification to confirm that ASMT has written files successfully?

Yes. With the -c option, ASMT performs a CRC check as a final step when writing files. This adds some overhead to the overall process unless the compression option is also used, in which case the check adds no extra overhead.
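As a rough illustration of what a CRC check involves, the following Python sketch computes a CRC32 over a file in chunks and compares it with the CRC32 of the source bytes. It uses the standard zlib module; the function names are hypothetical, and nothing here is meant to describe ASMT's internal implementation or file format.

```python
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of a file, read in chunks to bound memory use."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            # Feed the running CRC back in so the result covers the whole file.
            crc = zlib.crc32(chunk, crc)
    return crc


def verify_copy(source_bytes, path):
    """True if the file at `path` has the same CRC32 as the source bytes."""
    return crc32_of_file(path) == zlib.crc32(source_bytes)
```

A check of this kind has to read back everything that was written, which is where the extra overhead comes from.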

Does ASMT back up all or some of the namespaces?

This is configurable. By default all namespaces are backed up. The -n option is used to select specific namespaces.

Can ASMT back up multiple Aerospike instances on the same server?

Yes. ASMT's -i option backs up or restores the shared memory of a specific Aerospike instance. The option takes the instance number as an integer.

Can ASMT compress files?

Yes, the -z option selects compression. The compression level is not configurable. We have observed up to 30% compression in some of our tests. When compression is used, adding the -c option for the CRC check adds no extra overhead because the check is included as part of the compression.

Does ASMT have a ‘dry run’ mode?

Yes, the -a flag switches on analyse mode which runs like a backup or restore but does not read or write any files.
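For scripted use, the options covered so far (-n, -i, -z, -c, -a and -v) could be assembled into an argument list along the lines of the Python sketch below. Several things here are assumptions rather than facts from this document: the executable name asmt, the comma-separated form of the -n value, and the helper function itself. The options that select backup or restore mode and the file location are not covered here and are deliberately left out, so the result is not a complete command line.

```python
def build_asmt_args(namespaces=None, instance=None, compress=False,
                    crc=False, analyse=False, verbose=False):
    """Build a partial ASMT argument list from the documented options.

    Hypothetical helper. Mode and path options must be added by the caller.
    """
    args = ["asmt"]  # assumed executable name
    if analyse:
        args.append("-a")  # dry run: no files are read or written
    if verbose:
        args.append("-v")
    if compress:
        args.append("-z")
    if crc:
        args.append("-c")
    if namespaces:
        # Assumption: namespaces are passed as one comma-separated value.
        args += ["-n", ",".join(namespaces)]
    if instance is not None:
        args += ["-i", str(int(instance))]
    return args
```

Running first with analyse=True gives a way to exercise a script end to end before it touches any files.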

If records were deleted while Aerospike was down, will ASMT restore them?

Yes. ASMT restores an image of the primary index as it was before the server restart. If records were deleted in the interim, they will return.

What return codes will ASMT return?

ASMT returns 0 for success and 1 for error.
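A wrapper script can act on this return code directly. The Python sketch below runs a command and treats exit code 0 as success and anything else as failure. The helper name is hypothetical, and the usage lines employ a stand-in command rather than a real ASMT invocation.

```python
import subprocess
import sys


def run_and_check(argv):
    """Run a command and return True only if it exits with code 0."""
    result = subprocess.run(argv)
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in command that exits with 1, as ASMT does on error.
    ok = run_and_check([sys.executable, "-c", "raise SystemExit(1)"])
    if not ok:
        print("command failed; do not proceed to the next step")
```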

Can ASMT provide diagnostic output information?

Yes. The -v (verbose) option provides more information about execution, including details of the shared memory segments ASMT finds:

  • Size
  • Owner
  • Segment type – Tree, Arena
  • CRC32

What happens if ASMT is running and the Aerospike node is started?

The Aerospike startup will fail. Likewise, ASMT will not start if the Aerospike node is running.

Does ASMT need to be run as root?

Yes, but this can be done via sudo. Aerospike can run as a non-root user, but in that case ownership of the shared memory segments must be changed manually after the restore has completed.

Can ASMT be scripted?

Yes, but checks must be implemented to ensure that ASMT and the Aerospike node never try to start while the other is running. As described above, one or the other will fail to start in this scenario. Potentially more important is to have controls in place to ensure that, on startup, only a current, valid ASMT backup is restored. If the backup of the previous index failed after shutdown, a scripted ASMT-assisted Aerospike warm start may restore a stale backup file, leading to inconsistent data.
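The two checks described above might be sketched in Python as follows. Everything here is illustrative and assumes a Linux host: the function names are hypothetical, the process check reads /proc/<pid>/comm looking for the asd process name, and the freshness test is a simple comparison of file modification times against a shutdown timestamp that the surrounding script would have to record itself. A real deployment may need stronger validation than this.

```python
import os


def is_process_running(name, proc_root="/proc"):
    """True if any process under proc_root reports `name` as its command."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # process exited or is not readable
    return False


def backup_is_current(paths, not_before):
    """True if every backup file exists and was modified at or after not_before."""
    if not paths:
        return False
    try:
        return all(os.path.getmtime(p) >= not_before for p in paths)
    except OSError:
        return False  # a missing file means there is no valid backup


def safe_to_restore(paths, shutdown_time, proc_root="/proc"):
    """Allow a restore only if asd is down and the backup is not stale."""
    return (not is_process_running("asd", proc_root)
            and backup_is_current(paths, shutdown_time))
```

If safe_to_restore returns False, the safer course is to skip the restore and let the node cold start rather than risk loading a stale index.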

Can ASMT be used to move an Aerospike node to a new server?

Not really. The configuration and devices would have to be an exact match for the system to work correctly. The purpose of ASMT is not to clone or move servers; it is to speed up cold starts when an instance or host restart is unavoidable.

Can ASMT be used in cloud environments?

Yes, as long as the storage attached to the cloud instance persists across a restart. With AWS, for example, if EBS is the primary storage then ASMT can be used as normal. If ephemeral drives are used for primary storage, ASMT will work after an instance restart (where the same ephemeral drives are present when the server comes back up) but not after the instance has been stopped and started again (where the instance and ephemeral drives may be different).





March 2021

© 2021 Copyright Aerospike, Inc. | All rights reserved. Creators of the Aerospike Database.