In my development environment, which uses EC2 ephemeral disk + EBS software RAID1, I often get com.aerospike.client.AerospikeException: Error Code 18: Device overload. AFAIK it is a server-side status, not specific to the Java client. What exactly does it mean, and how do I troubleshoot a “device overload”? The server looks normal to me.
BTW, if the hardware is not fast enough, I’d expect an increase in latency rather than an exception.
Aerospike does not synchronously flush swb (storage write blocks) to disk. These blocks are flushed when full (or based on other tuning parameters, but let’s set those aside for now). Write transactions do get committed to memory on the master and replica(s) before returning to the client, though.
When a storage device is not keeping up, Aerospike uses a cache (configured through max-write-cache) and will try to keep up until that cache is full, at which point it throws the device overload error. Therefore you may not see much direct latency impact.
You can dynamically increase this cache from the default (64M) to a higher multiple of the write-block-size (which defaults to 128KB for SSD devices).
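For reference, a hedged sketch of where these knobs live in aerospike.conf (the namespace name, device path, and the 256M value are placeholders for illustration, not recommendations):

```
namespace test {
    replication-factor 2
    storage-engine device {
        device /dev/xvdb        # placeholder raw device
        write-block-size 128K   # SSD default
        max-write-cache 256M    # raised from the 64M default
    }
}
```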
Thanks for your info. I’ve been testing for the past two days, doing bulk data loading programmatically. My original instance was not optimized for IO and performed really badly; I switched to another instance, which improved things a lot, but I can still easily overload the device when I start more threads writing data concurrently.
It seems the max-write-cache is useful for handling burst traffic, but as long as the number of write requests sustainably exceeds the hardware throughput, the w-q will reach the max over time.
WARNING (drv_ssd): (drv_ssd.c::4260) {NAMESPACE} write fail: queue too deep: q 513, max 512
WARNING (drv_ssd): (drv_ssd.c::4260) {NAMESPACE} write fail: queue too deep: q 10242, max 10240
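A rough consistency check, assuming (as described earlier in the thread) that the maximum write-queue depth is max-write-cache divided by write-block-size: the two “queue too deep” warnings then correspond to a 64M and a 1280M write cache with 128K blocks.

```java
public class QueueDepth {
    // Assumed relation from the thread: max queue depth = max-write-cache / write-block-size.
    static long maxQueueDepth(long maxWriteCacheBytes, long writeBlockBytes) {
        return maxWriteCacheBytes / writeBlockBytes;
    }

    public static void main(String[] args) {
        long KiB = 1024, MiB = 1024 * KiB;
        System.out.println(maxQueueDepth(64 * MiB, 128 * KiB));   // 512   -> "max 512"
        System.out.println(maxQueueDepth(1280 * MiB, 128 * KiB)); // 10240 -> "max 10240"
    }
}
```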
So far, for a single AS instance, I’m able to get a sustainable throughput for a few hours of disk writes at 26-28MB/s per iotop (in CloudWatch it is slightly higher than 60MB/s), or 500 TPS, on an EC2 m3.xlarge EBS-optimized instance with a 40GB/1200 IOPS EBS volume (without any software RAID). The instance in theory supports 62.5MB/s disk write. My Java program just reads from an old AS instance and loads the data into this new instance; when it uses more than one thread to write, it basically always overloads the device.
I suppose any database has a limit. One difference compared to a traditional RDBMS (or indeed some other NoSQL stores) is that Aerospike doesn’t use an on-disk journal/write-ahead log to accept write requests, so the app design should detect an overload, or perhaps wait progressively longer between retries when a “device overload” error is encountered. I wonder what the best practice is to avoid “device overload” while trying to optimize bulk requests.
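On the retry question, here is a minimal sketch of progressive (exponential) backoff when a write is rejected with a device overload. The real write would be a client.put(...) that throws AerospikeException with result code 18 (ResultCode.DEVICE_OVERLOAD in the Java client); the call is abstracted behind an interface here so the pacing logic stands alone, and the 50 ms base and 2 s cap are arbitrary example values.

```java
import java.util.function.IntPredicate;

public class OverloadBackoff {
    // Delay before retrying attempt n (0-based): 50 ms, doubling, capped at 2 s.
    static long backoffMillis(int attempt) {
        return Math.min(50L << Math.min(attempt, 16), 2000L);
    }

    // Keep retrying a rejected write, sleeping progressively longer each time.
    // attemptWrite stands in for client.put(...): it returns true on success and
    // false when the server answers with error code 18 (device overload).
    static boolean writeWithBackoff(IntPredicate attemptWrite, int maxRetries)
            throws InterruptedException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attemptWrite.test(attempt)) {
                return true;
            }
            Thread.sleep(backoffMillis(attempt)); // let the device's write queue drain
        }
        return false; // surface the overload to the caller after maxRetries
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate a device that sheds load on the first three attempts.
        System.out.println(writeWithBackoff(attempt -> attempt >= 3, 5)); // true
    }
}
```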
w-q is clear, but swb-free is a little confusing to me. When it is about to overload, w-q + swb-free <= max-write-cache / write-block-size. But when it doesn’t reach peak load, swb-free seems not to have been allocated and remains at a low level. I would somehow expect swb-free to be at a high level even when the node is idle.
We are working on a two-pronged approach to this problem in the cloud, specifically AWS.
One is to add a specific stat for the write queue, which will help in better monitoring and in pacing the writes according to the device capability.
The other is a new solution using ephemeral and EBS devices, which should perform better than RAID. We expect to release it within a few weeks; right now it is in the final phases of docs preparation. Will let you know once it’s ready.
I will update this topic in some time with our AWS benchmarking experience on EBS and ephemerals, and how to best work around this.
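Once such a write-queue stat is exposed, a bulk loader could poll it and throttle itself before the server starts rejecting writes. A hedged sketch in Java: the semicolon/equals key-value format is what asinfo and the client Info API return, but the stat name write-q and the 75% threshold are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class WriteQueueProbe {
    // Parse Aerospike's "k1=v1;k2=v2;..." info format into a map.
    static Map<String, String> parseInfo(String line) {
        Map<String, String> out = new HashMap<>();
        for (String pair : line.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                out.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return out;
    }

    // Throttle when the write queue passes 75% of the configured maximum.
    static boolean shouldThrottle(Map<String, String> stats, int maxQueue) {
        int writeQ = Integer.parseInt(stats.getOrDefault("write-q", "0"));
        return writeQ > maxQueue * 3 / 4;
    }

    public static void main(String[] args) {
        // In practice this string would come from the Info API / asinfo.
        Map<String, String> stats = parseInfo("objects=300000;write-q=480;defrag-q=0");
        System.out.println(shouldThrottle(stats, 512)); // true: 480 > 384
    }
}
```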
I’m evaluating Aerospike for use, and we’re pre-populating it with data under a write-heavy load. All works well at first, with a steady write throughput of about 3000 TPS (AMC) and about 70M/s (iotop). However, when we reach about 300K replicated objects in the namespace, it starts pushing back with overload errors, and writes plummet to about 6M/s and 200 TPS.
Tried a bunch of different EC2 types, and they all respond the same way. We are using an EBS SSD raw device.
We tried this and it’s definitely more stable. There’s no noticeable spike and crash anymore; however, the write throughput seems to peak at around 1000 TPS per node. We’ve followed all the recommendations laid out in the documentation (we can’t use a VPC yet due to other factors, but network performance doesn’t seem to be a major issue).
We’re significantly short of the 10K TPS write performance and not sure what to do about it.
We’re using i2.4xlarge instances. Anything to do beyond what’s recommended in the article?
Are you using HVM-based images or PV images? I would suggest using HVM images.
“There’s no noticeable spike and crash anymore, however the write throughput seems to peak at around 1000 TPS per node”
What is the object size you are using? What workload are you using for testing?
We have not tested the EC2 Classic setup for performance. Is it possible for you to share a collectinfo dump of your server node while the load is running? It would help us debug your environment and set up a reproduction environment as required.
I would suggest using the latest aerospike-admin from the collectinfo2 branch over here…
You would then run it with:
sudo python asadm.py -e collectinfo
The data collected is visible in the log, so you could review it before sending to us.
I am facing an issue of AEROSPIKE_ERR_DEVICE_OVERLOAD. I have only one node in the cluster and am using the C client benchmark.
In the configuration, I am using a raw SSD device; write-block-size is set to 128K (default) and max-write-cache to 128M.
To run the C client benchmark, the following configuration is used: ./target/benchmark -h 127.0.0.1 -p 3000 -n test -k 1000000 -b 1 -o S:4096 -w I -z 8. But it gives the device overload error, and checking the Aerospike logs shows messages like:
WARNING (drv_ssd): (drv_ssd.c::4260) {NAMESPACE} write fail: queue too deep: q 513, max 512
Is there any configuration parameter I am missing?
I wanted to verify some performance numbers with Aerospike; is there any other way to do it?
It means that your drive is not keeping up with your load. Different SSDs have different capacities, and you’ve found yours. Read the capacity planning section, especially the parts about drives. If you can’t find your drive in the published list, you’ll need to run your own ACT test on it.
Just as an update: I resolved this problem by using 1M for the write-block-size property instead of 128K. I am using different SSDs. My Aerospike server is able to handle 10x more transactions than before.
It would be interesting to know if you would get the same result by keeping the write-block-size at 128KiB but increasing the post-write-queue to 2048 (8x the default of 256).
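For what it’s worth, the two shapes above hold the same number of bytes: 2048 blocks of 128 KiB and 256 blocks of 1 MiB both come to 256 MiB. A quick check:

```java
public class PostWriteQueueMath {
    static long totalBytes(long blocks, long blockBytes) {
        return blocks * blockBytes;
    }

    public static void main(String[] args) {
        long smallBlocks = totalBytes(2048, 128 * 1024L); // 128 KiB blocks, queue of 2048
        long largeBlocks = totalBytes(256, 1024 * 1024L); // 1 MiB blocks, default queue of 256
        System.out.println(smallBlocks == largeBlocks);   // true: both 268435456 bytes
    }
}
```

So the comparison would isolate the effect of block size itself (fewer, larger flushes) from the amount of recently written data kept around.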