Cluster rebuild based on EBS snapshots

Hi!

I’m testing a scenario in which I have to fully recover from a cluster crash. We have 4 nodes with shadow devices on EBS, and we take snapshots daily.

So my scenario is like this:

  1. I create a completely new cluster.
  2. I attach EBS volumes created from the snapshots to the new nodes (as shadow devices).
  3. I start Aerospike.

Based on previous testing of a single node (stop Aerospike, overwrite the local device with zeros, then start the Aerospike server with the shadow device attached), I was expecting it to take around 30 minutes to become fully operational.

I was expecting the full cluster to recover in about the same time, but based on the data copy time it looks more like 10 hours.

I’ve tested this 3 times, destroying and recreating the cluster each time. Any ideas or thoughts on that? We’re using Enterprise Edition, and I can file a support ticket if needed.

Sounds like it’s probably limited by the speed of the EBS volume. What does iostat -xmty 10 /dev/nvmeblah look like against the EBS volume while this is happening? The enterprise support folks are awesome, though, so I’d go straight to them personally :slight_smile:
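
To put the iostat numbers in context, here is a rough back-of-the-envelope sketch in Python. It only does arithmetic: it turns a data size and a recovery time into the average throughput that recovery implies, which can then be compared (roughly) with the rMB/s column iostat reports for the EBS volume. The 500 GiB per-node data size is a made-up placeholder, since the thread does not say how much data each node holds, and the function names are just illustrative.

```python
def implied_throughput_mib_s(data_gib: float, hours: float) -> float:
    """Average MiB/s needed to copy data_gib GiB in the given number of hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return data_gib * 1024 / (hours * 3600)


def slowdown(observed_hours: float, expected_hours: float) -> float:
    """How many times longer the observed recovery is than the expected one."""
    if expected_hours <= 0:
        raise ValueError("expected_hours must be positive")
    return observed_hours / expected_hours


if __name__ == "__main__":
    data_gib = 500.0  # ASSUMPTION: hypothetical per-node data size, not from the thread
    expected = implied_throughput_mib_s(data_gib, 0.5)   # the ~30 minute single-node test
    observed = implied_throughput_mib_s(data_gib, 10.0)  # the ~10 hour full-cluster rebuild
    print(f"expected: {expected:.1f} MiB/s, observed: {observed:.1f} MiB/s")
    print(f"slowdown: {slowdown(10.0, 0.5):.0f}x")
```

If the read throughput iostat shows during the rebuild sits near the lower figure for your real data size, that would be consistent with the volume itself being the bottleneck rather than Aerospike.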
