The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.
How To Store Data In Memory Only and Warm Start
Context
When using storage-engine memory, data is held only in process memory. Stopping Aerospike normally in this configuration loses the data on that node, because the data is not stored in shared memory. When the node is started back up, it starts empty and must receive its data through replication.
This article explains how to store data only in memory while retaining the ability to perform warm restarts, so that data is not lost when Aerospike restarts. Note that on a server or OS restart (for example, after power loss), data will still be lost on the node.
This option may be used when there is enough RAM available to store the records with additional overheads, where the following requirements exist:
- storage of data only in memory
- fast/warm restarts are required (e.g. when upgrading Aerospike binaries)
- reduced migration time (and network load) on fast restart
Advantages when compared to storage-engine memory:
- can perform warm restart - data is stored in shared memory
- reduces memory fragmentation - this is handled by the virtual tmpfs filesystem and Aerospike’s defrag
- on warm restart, reduces the network load caused by migrations when the replication factor is 2 or more
Disadvantages when compared to storage-engine memory:
- memory storage has space optimisations, as set and bin names are not stored with each record. These are stored once and referenced in the records in memory. When using tmpfs, Aerospike treats this memory as a device, causing more utilisation per record (as much as it would occupy when using disk-backed storage)
- as defrag also runs, this can potentially cause more CPU load
- frequent record updates could also be less performant, as Aerospike rewrites the whole record (as it does on a storage device) rather than updating it directly
Method
Create a directory to store data in:
mkdir /mnt/aerospikeramdisk
Mount a tmpfs on the directory, giving Aerospike a RAM disk of about 4.2GB:
mount -t tmpfs -o size=4300m tmpfs /mnt/aerospikeramdisk
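Before pointing Aerospike at the directory, it can be worth confirming that the path really is backed by tmpfs; if the mount is missing, the data file would silently be created on the underlying disk. A minimal sketch, assuming GNU coreutils stat is available. The helper names fs_type_of and is_tmpfs are hypothetical and not part of any Aerospike tooling.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the type of the filesystem that backs the given path
# (relies on GNU coreutils: stat -f -c %T).
fs_type_of() {
    stat -f -c %T "$1"
}

# Succeed only if the given path is on a tmpfs filesystem.
is_tmpfs() {
    [ "$(fs_type_of "$1" 2>/dev/null)" = "tmpfs" ]
}

if is_tmpfs /mnt/aerospikeramdisk; then
    echo "RAM disk is mounted"
else
    echo "not on tmpfs: the data file would land on the underlying disk"
fi
```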
Use storage-engine device with file /mnt/aerospikeramdisk/bar.dat in the Aerospike configuration. This makes Aerospike write to the /mnt/aerospikeramdisk/bar.dat file, which is stored in RAM only.
...
storage-engine device {
    file /mnt/aerospikeramdisk/bar.dat
    filesize 4G
    data-in-memory false    # no need, the file is already in RAM
    read-page-cache false   # no need, data is already in RAM
}
...
Now start Aerospike.
Add a mount entry to /etc/fstab to ensure the RAM disk is recreated on reboot:
cat <<'EOF' >> /etc/fstab
tmpfs /mnt/aerospikeramdisk tmpfs nodev,nosuid,noexec,nodiratime,size=4300M 0 0
EOF
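Running the append more than once leaves duplicate entries in /etc/fstab. The following is a minimal sketch of an idempotent append that only adds the line when an identical one is not already present. The function name append_once is hypothetical, and the example deliberately runs against a scratch file rather than the live /etc/fstab.

```shell
#!/bin/sh
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
# (Hypothetical helper, for illustration only.)
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string, not a regex
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

entry='tmpfs /mnt/aerospikeramdisk tmpfs nodev,nosuid,noexec,nodiratime,size=4300M 0 0'

# Try it on a scratch file first; the second call is a no-op.
scratch=$(mktemp)
append_once "$scratch" "$entry"
append_once "$scratch" "$entry"
cat "$scratch"
```

Once the behaviour looks right, the same call can be pointed at /etc/fstab as root.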
Notes
Do not use read-page-cache with this setup. It will not help read speeds at all, because the data is already in RAM.
When creating the RAM disk, you must account for space for tmpfs filesystem overhead in your required space calculation. This is why, in the example above, the tmpfs mount reserves 4.2G, while Aerospike is only configured to use 4G.
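The sizing arithmetic can be sketched as a small helper that takes the configured filesize and adds a percentage of headroom. The function name tmpfs_size_mb is hypothetical, and the 5% headroom is an assumption chosen only because it reproduces the 4300m figure used in this article; it is not an official Aerospike recommendation.

```shell
#!/bin/sh
# tmpfs_size_mb FILESIZE_GIB HEADROOM_PCT
# Print a tmpfs size in MiB: the Aerospike data file size plus a
# percentage of headroom for filesystem overhead (integer arithmetic).
# (Hypothetical helper; the headroom percentage is an assumption.)
tmpfs_size_mb() {
    filesize_mib=$(( $1 * 1024 ))
    echo $(( filesize_mib * (100 + $2) / 100 ))
}

# A 4G data file with 5% headroom
tmpfs_size_mb 4 5   # prints 4300
```

The result can be passed to the mount option as size=4300m.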
When using secondary indexes, Aerospike must load those on restart, which may take a while. With data in memory, the time it takes should be greatly reduced.
Applies To
Server 4.3.1 and later.
Keywords
DATA-IN-MEMORY STORAGE-ENGINE MEMORY WARM START
Timestamp
September 2021