How to detect if a shadow device needs to be zeroed?

aws

#1

Hi,

I’m setting up an Aerospike cluster on AWS on r3 instances (which have local SSDs). I’m using the local SSD as the namespace storage device, with an EBS volume of the same size as the shadow device. The cluster can be created multiple times, on demand, by other developers using terraform. I currently have ansible scripts to install and configure Aerospike (networking, namespaces) on the machines. To spawn a new cluster, one runs terraform and then ansible, and that should do it.

Normally, EBS volumes should be zeroed (dd if=/dev/zero ...) before Aerospike uses them. Is this also the case for shadow devices? And if so: how can a script recognize whether the volume contains Aerospike data or just random bytes? If I run dd if=/dev/zero ... as part of the user data, it’s executed for every ‘new’ instance, which includes instance-type upgrades, reboots due to AWS downtime, etc. I don’t want it to destroy any existing data. If there is some way to detect whether the volume needs to be zeroed, that would be very helpful.
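One simple heuristic (a sketch, not an official Aerospike mechanism) is to check whether the start of the device is all zero bytes before wiping it. The device path /dev/xvdf and the 8 MiB window are assumptions; this uses GNU cmp against /dev/zero:

```shell
# Sketch: only zero the shadow device when it looks freshly created.
# Returns success (0) if the first 8 MiB of the given device are all
# zero bytes, i.e. there is no sign of existing data at the front.
looks_blank() {
    dev="$1"
    cmp -s -n $((8 * 1024 * 1024)) "$dev" /dev/zero
}

# /dev/xvdf is a hypothetical attachment point for the shadow EBS volume.
if looks_blank /dev/xvdf; then
    echo "volume looks blank: zeroing is safe"
else
    echo "volume contains data: skipping dd"
fi
```

This can misfire if a used device happens to start with zeros, so treat it as a guard against the obvious case (re-running user data on an instance whose shadow already holds Aerospike data), not a guarantee.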

Thanks


#2

If you use an EBS volume as the shadow and the ephemeral as the primary, then on reboot, if there is no Aerospike data on the ephemeral, it will read from the shadow device to rebuild the primary index (and load data into DRAM if storage is in-memory).
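For reference, a shadow device is declared by listing it after the primary on the same device line in aerospike.conf. A minimal sketch, with placeholder device paths and sizes:

```
namespace test {
    replication-factor 2
    memory-size 8G

    storage-engine device {
        # primary (local ephemeral SSD) followed by its EBS shadow
        device /dev/nvme0n1 /dev/xvdf
        write-block-size 128K
    }
}
```

All writes go to both devices; reads are served from the primary, which is why the shadow can repopulate an empty ephemeral after a stop/start.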


#3

Hi @rbotzer, thanks for your reply.

I know that part already; that’s why I set up the shadow device. The question is about zeroing the shadow device.

Apparently the ephemeral device doesn’t need zeroing (Amazon does it before the EC2 instance is started). When an EBS volume is used as the primary device, it must be zeroed. Is this also required when the EBS volume is used as a shadow device? And if so: how do I avoid zeroing the EBS volume the second time the instance boots?


#4

The initial zeroing of the device is for pre-warming, which EBS devices need for performance.

This may not be necessary with new EBS volumes according to: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html


#5

I thought it was always required as documented per http://www.aerospike.com/docs/operations/plan/ssd/ssd_init.html

But apparently, since version 3.3.5, it’s not required anymore.

So since it’s now only needed for EBS initialization, using fio solves my problem, as it just reads the disk instead of writing zeros to it. I’ll set that up to initialize the shadow disk on boot. Thanks!
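The read-only initialization can be done with the fio invocation from Amazon's EBS initialization guide linked above; the device path /dev/xvdf is an assumption for this setup:

```shell
# Initialize (pre-warm) a restored EBS volume by reading every block once.
# This is non-destructive: --rw=read never writes to the device.
sudo fio --filename=/dev/xvdf --rw=read --bs=128k --iodepth=32 \
    --ioengine=libaio --direct=1 --name=volume-initialize
```

Reading the full volume can still take a while on large disks, so it may be worth running it in the background before adding the node to the cluster.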


#6

Actually, according to the Amazon docs, pre-warming is only required when creating an EBS volume from a snapshot. I think the ssd_init.html documentation should be updated to reflect that.


#7

AER-307 is unrelated to this. What it says is that, since that change, you can replace a failed disk (or disks) without needing to zero the remaining disks belonging to that node.

We still recommend zeroing disks before use; the page you linked isn’t Amazon-specific. I believe Amazon’s disks return zero for any unused blocks, so zeroing was only necessary here for performance. The documentation here: http://www.aerospike.com/docs/deploy_guides/aws/recommendations/#pre-warming-ebs-and-ephemeral-disks may need to be updated to reflect the new AWS recommendations. Before changing the documentation, we will first need to validate the claim by running benchmarks against zeroed versus non-zeroed EBS drives.