I am running Aerospike on EC2. I used the following link http://www.aerospike.com/docs/deploy_guides/aws/install/
I picked an i2.4xlarge. I am using the 4 x 800 GB SSDs as ephemeral instance stores (NOT EBS-backed) because I assume reads and writes will be faster.
After some writes and reads I did a reboot of the machine and to my surprise the data were still there.
However, I was under the impression that data on the ephemeral disks would be lost when the instance is stopped or rebooted.
You are not guaranteed to get the same instance back. You are probably more likely to keep your data if the instance was only briefly down but I wouldn’t depend on this. But if you were to stop the instance for a week then the data would probably be gone.
case a. I would not use EBS at all. I would use the ephemeral Instance Stores. I would spawn multiple machines with replication factor to protect against a machine going down and data loss.
case c. Turn the Instance Stores into EBS. Data would be persisted, so I wouldn’t need a replication factor if I can afford downtime. But I am concerned about read latency.
Please could you provide some thoughts on case a. vs. case b. vs. case c. ?
You do run the risk of losing data if multiple nodes fail. This risk also exists on bare metal, but on bare metal there are more situations where the data can still be recovered, whereas on ephemeral storage it would be lost.
Yes, bcache has been problematic for us. Aerospike 3.5.10, which I believe is expected to be out next week, will include a new feature for defining a shadow disk. Basically, writes will go to both the primary disk and the shadow disk, while reads will only go to the primary. So stay tuned.
That depends on how often EBS volumes fail. It also means that if a node fails, your cluster will be missing data until you are able to replace it with another node that has the appropriate EBS volume attached.
The hardware doesn’t reset unless the instance is stopped or terminated. A reboot is considered a recycle at the VM level, so the instance stays on the same underlying hardware and the ephemeral data survives.
Depending on your load, EBS can have severe performance issues.
If you want to use EBS for durability, your instances should be the largest available, with the fastest network performance, EBS Optimized enabled and Provisioned IOPS to guarantee baseline performance. You can do RAID striping across several volumes or use several devices under a single namespace to distribute the load some more.
In almost every situation, if you’re using high-end instances, the local instance storage will outperform EBS by a wide margin. The most common approach with distributed databases is to rely on replication across nodes to shield against the loss of data on a single instance.
More database nodes will give you high availability and better performance as you scale with more clients and transactions. You can always increase the replication factor to deal with multiple node failures. Multiple nodes are also the only way to get safety of in-memory data if you have it.
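To put rough numbers on how a higher replication factor protects against multiple node failures, here is a minimal Python sketch. It assumes a simplified model that is not taken from Aerospike itself: each partition's replicas sit on a uniformly random set of distinct nodes, and all failures happen before the cluster has re-replicated. The function name `fraction_partitions_lost` is made up for this illustration.

```python
from math import comb  # requires Python 3.8+


def fraction_partitions_lost(nodes: int, replication_factor: int, failed: int) -> float:
    """Expected fraction of partitions with every replica on a failed node.

    Simplified model (an assumption, not Aerospike's real placement logic):
    replicas of a partition occupy a uniformly random set of distinct nodes.
    """
    if not 1 <= replication_factor <= nodes:
        raise ValueError("replication_factor must be between 1 and nodes")
    if not 0 <= failed <= nodes:
        raise ValueError("failed must be between 0 and nodes")
    # comb(failed, rf) is 0 when fewer nodes failed than there are replicas,
    # so no partition can lose all of its copies in that case.
    return comb(failed, replication_factor) / comb(nodes, replication_factor)


if __name__ == "__main__":
    # Six-node cluster losing two nodes at the same time.
    for rf in (1, 2, 3):
        lost = fraction_partitions_lost(nodes=6, replication_factor=rf, failed=2)
        print(f"replication factor {rf}: {lost:.1%} of partitions lost")
```

Under this model, a replication factor larger than the number of simultaneously failed nodes loses nothing, which is the reason to raise it when relying on ephemeral storage alone.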
We did not release 3.5.10 as @kporter alluded to, but we did release server version 3.5.12 on May 28, which adds device shadowing functionality for persistence on network devices in addition to ephemeral storage (KVS- AER-3557). The full release notes for the 3.5.12 Aerospike server community edition are here.
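As a rough illustration of the shadow-device setup discussed in this thread, the Python sketch below renders a namespace stanza in which each ephemeral SSD is paired with an EBS volume. It rests on assumptions that should be checked against the release notes and configuration reference: that a shadow device is declared by listing it after the primary on the same `device` line, and that the device paths, namespace name, and `memory-size` value (all hypothetical here) match your instance. The helper `render_namespace` is not part of any Aerospike tool.

```python
from typing import Iterable, Optional, Tuple


def render_namespace(
    name: str,
    replication_factor: int,
    device_pairs: Iterable[Tuple[str, Optional[str]]],
    memory_size: str = "4G",  # placeholder value, tune for your data set
) -> str:
    """Render an aerospike.conf namespace stanza as text.

    Each pair is (primary_device, shadow_device_or_None). Assumed syntax:
    the shadow device follows the primary on the same `device` line.
    """
    lines = [
        f"namespace {name} {{",
        f"    replication-factor {replication_factor}",
        f"    memory-size {memory_size}",
        "    storage-engine device {",
    ]
    for primary, shadow in device_pairs:
        entry = f"        device {primary}"
        if shadow:
            entry += f" {shadow}"  # writes go to both, reads only to primary
        lines.append(entry)
    lines += ["    }", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical device names: xvdb/xvdc ephemeral SSDs, xvdf/xvdg EBS volumes.
    print(render_namespace(
        "test",
        replication_factor=2,
        device_pairs=[("/dev/xvdb", "/dev/xvdf"), ("/dev/xvdc", "/dev/xvdg")],
    ))
```

Listing several `device` lines under one namespace spreads the load across devices, as suggested earlier in the thread, while the shadow volumes keep a persistent copy in case the instance is stopped.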