But from the documentation it is not clear how to migrate to this setup from a previous bcache setup. Should we just use the same EBS volume from the bcache setup as the shadow device and let Aerospike find the data on it, or do we need to do something else to move the data?
Another question: I assume that at start time Aerospike always copies data from the shadow device to the primary device. Is that true?
Yes, we haven’t created such a document yet. I suspect it would be something along the lines of:
Note that this procedure hasn’t been verified.
For each node:
1. Stop Aerospike.
2. dd if=/dev/BCACHE_DEVICE of=/dev/EBS_DEVICE
3. Remove the bcache configuration.
4. Zeroize the local SSD: dd if=/dev/zero of=/dev/LOCAL_SSD
5. Reconfigure Aerospike to use device /dev/LOCAL_SSD /dev/EBS_DEVICE (see the example config after this list).
6. Start Aerospike; the local device will be synced from the shadow device.
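As a rough illustration of the reconfigure step, the shadow device is declared by listing the local (primary) device and the EBS (shadow) device on the same device line inside the namespace’s storage-engine block. The namespace name and device paths below are placeholders, and the rest of the namespace settings would stay as they are in your existing config:

    namespace test {
        # keep your existing memory-size, replication-factor, etc.
        storage-engine device {
            # primary (local SSD) first, shadow (EBS) device second
            device /dev/LOCAL_SSD /dev/EBS_DEVICE
        }
    }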
On warm start (Enterprise only), it only copies data from the shadow device if the local device doesn’t contain an Aerospike header; otherwise it assumes the disks are in sync. On cold start it will always copy from the shadow device.
Curious to know the reasoning behind zeroing the local disk. By not doing this, wouldn’t it save the need to copy data back to the local disk from the EBS device on startup (basically the final step above, where the local device is synced from the shadow device)?
Also, what is the expected behaviour if the local disk is neither zeroed nor copied to the EBS device? Will it start as a clean/new node, with data eventually migrated from the other nodes (assuming replication factor >= 2)?
BTW, on CE it will always copy the data from the shadow device to the local SSD. I updated that part of my response.
Does it check for the presence of an Aerospike header on the shadow device before copying it to the local device on startup? Otherwise, if a new node is added (with both the local and shadow devices blank), would the data still get copied from the shadow device to the local device? If so, it would slow down the startup process significantly.
I ran into one more issue while adding the shadow disk. In the test setup, I have a cluster of 3 nodes (Aerospike Community Edition build 3.9.0.3), with each node having a local SSD as the primary device. The cluster has one namespace with approx. 28M records. Starting with the 1st node, I added the blank persistent SSD as the shadow disk to the node and to the Aerospike configuration, and restarted Aerospike. However, the startup got stuck for hours. There is no error as such in the logs. All resource usage (CPU, disk and network) was also almost nil. After a few hours, the startup completed and the node joined the cluster again. Afterwards, I repeated the same on the other nodes and the behavior was not consistent: some nodes were able to start in around a minute as expected, whereas others took a few hours. I also tried adding a fresh node with a new blank shadow disk to the existing cluster and its startup also took a few hours.
Some of the nodes may (in the past) have used more blocks of storage - this will make startup take a bit longer. Also, it may be that these nodes had to do eviction during the startup process, which can make startup take much longer.
For simple restarts like this, Aerospike Enterprise has ‘Fast Restart’, which reduces multi-hour cold starts to a few seconds.
I doubt that eviction or used blocks could be the reason, as it happened on a fresh node as well (blank local SSD + blank shadow device). I even enabled the logs in debug mode but didn’t find anything unusual. Is there any way to know what the Aerospike process is doing at any moment, something like a thread dump for JVMs?
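Not Aerospike-specific, but a rough equivalent of a JVM thread dump for a native daemon like asd is to snapshot the stacks of all of its threads with standard Linux tools. The commands below are only a sketch; they assume gdb is installed, that pgrep asd matches a single Aerospike process, and that they are run as root:

    # kernel-side stack of every asd thread
    for tid in /proc/$(pgrep asd)/task/*; do echo "== $tid"; cat $tid/stack; done

    # user-space backtraces of all threads (briefly pauses the process)
    gdb -p $(pgrep asd) -batch -ex 'thread apply all bt'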
I had also attached the logs in the last comment. It would be great if you could take a look.
Can’t determine anything from the logs. Paxos was starting, which occurs after the drives load. This is a really old version of Aerospike, and these code paths have changed quite a bit. The issue seems to have resolved itself for now; I’d be happy to look further if it occurs in our latest builds.