How To Store Data In Memory Only and Warm Start

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.


Context

When using storage-engine memory, data is held only in memory. Because that data is not stored in shared memory, even a normal stop of Aerospike results in loss of the data on that node. When the node is started back up, it starts empty and must receive its data through replication.

This article explains how to store data only in memory while retaining the ability to perform warm restarts, so that data is not lost when Aerospike is restarted. Note that on a server or OS restart (for example, after power loss), data will still be lost on the node.

This option may be used when there is enough RAM available to store the records plus the additional overhead, and the following requirements exist:

  • storage of data only in memory
  • fast/warm restarts are required (e.g. when upgrading Aerospike binaries)
  • reduced migration time (and network load) on fast restart

Advantages when compared to storage-engine memory:

  • can perform warm restart - data is stored in shared memory
  • reduces memory fragmentation - this is handled by the virtual tmpfs filesystem and Aerospike’s defrag
  • on warm restart, reduces the network load caused by migrations when the replication factor is 2 or more

Disadvantages when compared to storage-engine memory:

  • memory storage has space optimisations, as set and bin names are not stored with each record. These are stored once and referenced in the records in memory. When using tmpfs, Aerospike treats this memory as a device, causing more utilisation per record (as much as it would occupy when using disk-backed storage)
  • as defrag also runs, this can potentially cause more CPU load
  • frequent record updates could also be less performant, as Aerospike rewrites the whole record (as it would on a storage device) instead of updating it in place

Method

Create a directory to store data in:

mkdir /mnt/aerospikeramdisk

Mount a tmpfs RAM disk on the directory, giving the Aerospike data disk 4300MB (about 4.2GB) of RAM:

mount -t tmpfs -o size=4300m tmpfs /mnt/aerospikeramdisk
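
Before pointing Aerospike at the directory, it can help to confirm that the mount point is really backed by tmpfs, since a file written to an unmounted directory would land on the underlying disk. The following is a minimal sketch, not part of the original procedure: the function name is_tmpfs_mount is hypothetical, and it assumes a Linux host where /proc/mounts lists the active mounts.

```shell
# Return 0 if the given mount point is listed as tmpfs in a mounts table.
# $1: mount point to check
# $2: mounts table to read (assumption: defaults to /proc/mounts on Linux)
is_tmpfs_mount() {
    awk -v mp="$1" '$2 == mp && $3 == "tmpfs" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

if is_tmpfs_mount /mnt/aerospikeramdisk; then
    echo "RAM disk is mounted"
else
    echo "RAM disk is NOT mounted"
fi
```

The second argument exists only so the check can be tried against a copy of a mounts table; in normal use it can be omitted.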

Use storage-engine device with file /mnt/aerospikeramdisk/bar.dat in the Aerospike configuration. This makes Aerospike write to the file /mnt/aerospikeramdisk/bar.dat, which is stored in RAM only.

...
storage-engine device {
        file /mnt/aerospikeramdisk/bar.dat
        filesize 4G
        data-in-memory false # no need, the file is already in RAM
        read-page-cache false # no need, data is already in RAM
}
...

Now start Aerospike.

Add an auto-mount entry to /etc/fstab to ensure the RAM disk is recreated on reboot:

cat <<'EOF' >> /etc/fstab
tmpfs       /mnt/aerospikeramdisk tmpfs   nodev,nosuid,noexec,nodiratime,size=4300M   0 0
EOF
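
Because /etc/fstab holds the entries for every filesystem on the host, the new line should be appended and never replace the existing contents, and running the step twice should not create a duplicate. A possible sketch of an idempotent append is shown here; the function name add_ramdisk_fstab_entry and its path argument are hypothetical, added so the function can be tried on a copy of the file first.

```shell
# Append the tmpfs entry to an fstab file unless the mount point is already listed.
# $1: path of the fstab file to edit (hypothetical parameter; /etc/fstab in real use)
add_ramdisk_fstab_entry() {
    fstab="$1"
    entry='tmpfs       /mnt/aerospikeramdisk tmpfs   nodev,nosuid,noexec,nodiratime,size=4300M   0 0'
    # Look for a non-comment line whose second field is the mount point.
    if awk '$1 !~ /^#/ && $2 == "/mnt/aerospikeramdisk" { found = 1 } END { exit !found }' "$fstab"; then
        echo "entry already present in $fstab"
    else
        printf '%s\n' "$entry" >> "$fstab"
    fi
}
```

For example, add_ramdisk_fstab_entry /etc/fstab (run as root) adds the line once and leaves all other entries untouched.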

Notes

Do not use read-page-cache with this setup. It will not help read speeds at all, because the data is already in RAM.

When creating the RAM disk, you must include the tmpfs filesystem overhead in your required space calculation. This is why, in the example above, the tmpfs mount reserves 4.2G, while Aerospike is only configured to use 4G.
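
The arithmetic behind the example sizes can be sketched as follows. The helper tmpfs_size_mib is hypothetical, and the 5% margin is only an assumption that happens to reproduce the numbers used above; it is not an official sizing rule, so adjust it to your own measurements.

```shell
# Hypothetical helper: tmpfs size in MiB for a given Aerospike filesize (in MiB)
# plus a percentage margin for filesystem overhead (integer arithmetic).
# $1: filesize in MiB, $2: margin in percent (assumed value, not an official rule)
tmpfs_size_mib() {
    echo $(( $1 * (100 + $2) / 100 ))
}

filesize_mib=$((4 * 1024))                      # "filesize 4G" in the Aerospike config
tmpfs_mib=$(tmpfs_size_mib "$filesize_mib" 5)   # 4096 * 105 / 100 = 4300
echo "tmpfs size: ${tmpfs_mib}m"                           # prints "tmpfs size: 4300m"
echo "headroom:   $((tmpfs_mib - filesize_mib)) MiB"       # prints "headroom:   204 MiB"
```

With a 4G data file this leaves 204 MiB of headroom inside the 4300m mount.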

When using secondary indexes, Aerospike must load those on restart, which may take a while. With data in memory, the time it takes should be greatly reduced.

Applies To

Server 4.3.1 and later.

Keywords

DATA-IN-MEMORY STORAGE-ENGINE MEMORY WARM START

Timestamp

September 2021