How does Aerospike protect my data in case of an SSD failure?

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Summary

What should I do when one of a namespace's SSDs fails? Does Aerospike copy the data across the SSDs associated with a namespace? How do I replace the failed SSD and restore the data?

Resolution

How the data is recovered depends on your configuration. Aerospike does allow an unreplicated namespace (replication factor 1), but normally you should set the replication factor to at least 2, which means the cluster holds two copies of your data at any time, assuming everything is balanced. With replication factor 2, the cluster automatically copies the data between the remaining nodes. No commands are necessary to rebalance the data.
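As a minimal sketch of how the replication setting could be checked, the snippet below parses a semicolon-delimited `key=value` string of the kind an Aerospike info request returns for a namespace and tests whether the replication factor is at least 2. The helper names, the sample string, and the assumption that the response carries a `replication-factor` key are illustrative only; check the output of your own server version.

```python
def parse_info(response: str) -> dict:
    """Turn 'k1=v1;k2=v2' into a dict; fields without '=' are skipped."""
    pairs = {}
    for field in response.strip().split(";"):
        key, sep, value = field.partition("=")
        if sep:
            pairs[key.strip()] = value.strip()
    return pairs


def is_replicated(namespace_info: str, minimum: int = 2) -> bool:
    """Return True if the namespace keeps at least `minimum` copies."""
    stats = parse_info(namespace_info)
    # Assumed key name; a missing key is treated as unreplicated.
    return int(stats.get("replication-factor", "1")) >= minimum


# Hypothetical sample, not captured from a real cluster.
sample = "objects=1000;replication-factor=2"
print(is_replicated(sample))
```

Treating a missing key as unreplicated is a deliberately cautious choice: the check fails rather than assuming a second copy exists.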

With that in mind, suppose each node has three SSDs assigned to one namespace and one SSD fails. You only need to physically replace the failed SSD: stop the Aerospike server, swap the failed SSD (hot-swap it if the hardware supports that), and bring the Aerospike server back up. You do not need to clear the remaining SSDs by re-initializing them.

In Aerospike Enterprise Edition, nodes have the added feature of a fast restart, which still works when an SSD has been replaced after a failure. On fast restart, records from the failed drive are cleaned up from the index and then repopulated through migrations, so all of the data is restored once migrations complete. In Community Edition, the server goes through a cold start when it is restarted with the replaced SSD.
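Since the data is only fully restored once migrations finish, it can help to check for that before treating the node as recovered. The sketch below assumes each node's statistics arrive as a semicolon-delimited `key=value` string and that the remaining work is reported under `migrate_partitions_remaining`. The function names, node names, and sample responses are hypothetical, and the statistic name should be confirmed against your server version.

```python
def migrations_complete(statistics: str) -> bool:
    """Return True when the node reports no partitions left to migrate."""
    for field in statistics.split(";"):
        key, sep, value = field.partition("=")
        if sep and key.strip() == "migrate_partitions_remaining":
            return int(value) == 0
    # Statistic not present: do not assume the data is restored.
    return False


def cluster_restored(per_node_statistics: dict) -> bool:
    """Every node must report zero remaining partitions."""
    if not per_node_statistics:
        return False  # no responses means nothing was verified
    return all(migrations_complete(s) for s in per_node_statistics.values())


# Hypothetical responses keyed by node name.
responses = {
    "node-a": "cluster_size=3;migrate_partitions_remaining=0",
    "node-b": "cluster_size=3;migrate_partitions_remaining=12",
}
print(cluster_restored(responses))
```

The check is conservative by design: a missing statistic or an empty set of responses counts as "not restored" rather than as success.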

Instructions on adding or upgrading a device are explained in the following documentation:

For more information on SSD initialization, please see: https://www.aerospike.com/docs/operations/plan/ssd/ssd_init.html
