How do I speed up cold start?

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Problem Description

It takes a long time for a cold-restart of Aerospike service on my deployment. Do I have any options to speed this up?

Explanation

An Aerospike cold restart scans all records on the persistent storage layer and rebuilds the primary index in memory (plus the data itself if data-in-memory is set to true), followed by any secondary indexes. This process can take a long time, depending on the number of records, disk I/O capability, and CPU capacity.

Read more about Cold start on this documentation page:

Solution

The asmt tool can be used to avoid rebuilding the primary index altogether. Refer to the FAQ - ASMT Tool article for details.

The rest of this article assumes the primary index will be rebuilt as part of the cold restart. In that case, Aerospike has no specific configuration setting that speeds up a cold start, so the first step is to identify where the bottleneck is. A slow cold restart does not necessarily mean that disk or CPU utilization is at 100%.
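One way to check whether the node is CPU-bound or waiting on disk during a cold start is to compare two snapshots of the aggregate cpu line in /proc/stat on Linux. The sketch below is only an illustration: the function name cpu_breakdown and the sample counter values are hypothetical, and tools such as iostat or top give the same information with more detail.

```python
def cpu_breakdown(before: str, after: str) -> dict:
    """Compare two aggregate 'cpu' lines from /proc/stat (Linux).

    Returns the share of time spent busy and the share spent in iowait
    between the two snapshots, as percentages.
    """
    def parse(line: str) -> list:
        fields = line.split()
        if not fields or fields[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        # Order: user nice system idle iowait irq softirq steal.
        # Guest time is already included in user/nice, so it is skipped.
        return [int(v) for v in fields[1:9]]

    delta = [b - a for a, b in zip(parse(before), parse(after))]
    total = sum(delta)
    if total <= 0:
        raise ValueError("snapshots must be taken at different times")
    idle, iowait = delta[3], delta[4]
    return {
        "busy_pct": 100.0 * (total - idle - iowait) / total,
        "iowait_pct": 100.0 * iowait / total,
    }


# Hypothetical snapshots taken a few seconds apart during a cold start.
sample_before = "cpu 100 0 100 700 100 0 0 0 0 0"
sample_after = "cpu 300 0 200 900 600 0 0 0 0 0"
print(cpu_breakdown(sample_before, sample_after))
```

A high iowait share with a low busy share suggests the disks are the limiting factor; the reverse suggests CPU. Neither reading is conclusive on its own, so it is worth confirming with per-device statistics.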

If you suspect secondary index building is a bottleneck refer to this Knowledge Base article:

Aerospike reads the disk as efficiently as possible. Our tests have shown that 1 thread per device partition is usually optimal. Spreading the namespace across multiple devices (or partitions) or files can help in some cases, because they are read in parallel.
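The effect of reading devices in parallel can be illustrated with a rough estimate. The sketch below is a simplified model, not a description of Aerospike internals: it assumes one reader per device, that every device sustains the same read throughput independently, and that CPU is not the limit. The function name and all figures are hypothetical.

```python
def estimated_scan_seconds(scan_bytes_per_device, read_bytes_per_second):
    """Rough lower bound on the time needed to read all devices.

    Devices are assumed to be read in parallel, one reader each,
    so the device with the most data to read dominates.
    """
    if read_bytes_per_second <= 0:
        raise ValueError("read throughput must be positive")
    if not scan_bytes_per_device:
        raise ValueError("at least one device is required")
    return max(scan_bytes_per_device) / read_bytes_per_second


throughput = 500_000_000  # hypothetical 500 MB/s sustained read per device

# The same 2 TB of data on one device versus spread over four devices.
one_device = estimated_scan_seconds([2_000_000_000_000], throughput)
four_devices = estimated_scan_seconds([500_000_000_000] * 4, throughput)
print(one_device, four_devices)
```

Partitions carved out of the same physical disk share its bandwidth, so the gain from splitting one disk into partitions is usually smaller than this model suggests.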

Related Knowledge Base article on abrupt jump in percentage completed during cold-start:

Running into cold-start eviction (breaching the high-water marks during the cold start) can also slow down the overall process. Refer to this KB article for details: FAQ - What options are available to speed up cold start eviction

Note: When planning to use smaller disk partitions in parallel for your namespace storage, keep in mind that partitions or files that are too small may prevent defragmentation, which can lead to a low avail percent. This can occur if the post-write queue is larger than, or close to, the size of the partition itself.
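The sizing concern in the note above can be checked with simple arithmetic. The sketch below assumes that post-write-queue is expressed as a number of write blocks per device and that each block is write-block-size bytes. The defaults used here (256 blocks of 1 MiB) are assumptions to verify against the configuration reference for your server version, and the function name is hypothetical.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def post_write_queue_fraction(partition_bytes,
                              post_write_queue_blocks=256,
                              write_block_size_bytes=1 * MIB):
    """Fraction of a partition that the post-write queue can occupy.

    Values near or above 1.0 mean the queue is close to, or larger than,
    the partition, which is the situation the note warns about.
    The default arguments are assumed values, not confirmed settings.
    """
    if partition_bytes <= 0:
        raise ValueError("partition size must be positive")
    queued_bytes = post_write_queue_blocks * write_block_size_bytes
    return queued_bytes / partition_bytes


# Hypothetical partition sizes with the assumed defaults (256 MiB queued).
for size in (100 * GIB, 1 * GIB, 256 * MIB):
    print(size // MIB, "MiB partition:", post_write_queue_fraction(size))
```

With these assumed values, a 100 GiB partition leaves the queue at a negligible share, while a 256 MiB partition would be covered entirely by the queue.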

Keywords

COLD START SPEED FAST PERFORMANCE PRIMARY SECONDARY INDEX

Timestamp

May 2021