Handling "write fail: queue too deep / Error Code 18: Device overload" on the client side

I often run into the following error: write fail: queue too deep / Error Code 18: Device overload. According to the following article, there is a workaround that suggests increasing the max-write-cache value, but in my case, where processes load data into Aerospike for quite a long time, this parameter does not help much. Also, the SSD devices cannot be replaced with more powerful ones in the short term.

Because of that, I’d like to understand whether there is a correct way to handle Error Code 18: Device overload errors on the client side.

For example, when the client catches this kind of exception, it could retry the failed write operation later, with some kind of backoff.
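To make the idea concrete, here is a minimal sketch of such a retry loop in Python. It is not tied to any real client: DeviceOverloadError is a hypothetical stand-in for whatever exception your client raises for Error Code 18 (I believe the Aerospike Python client exposes it as aerospike.exception.DeviceOverload, but please check your client version), and write_with_backoff and its parameters are names made up for illustration.

```python
import random
import time


class DeviceOverloadError(Exception):
    """Hypothetical stand-in for the client's 'Error Code 18' exception."""


def write_with_backoff(write, max_attempts=6, base_delay=0.05, max_delay=5.0,
                       sleep=time.sleep):
    """Call write() until it succeeds or max_attempts is exhausted.

    Only DeviceOverloadError is retried; any other exception propagates
    immediately. Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            write()
            return attempt
        except DeviceOverloadError:
            if attempt == max_attempts:
                # Out of retries: re-raise so the caller can park the record
                # somewhere durable instead of silently dropping it.
                raise
            # Exponential backoff capped at max_delay, with full jitter so
            # that many writers do not all retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

You would wrap each write in a callable, for example write_with_backoff(lambda: client.put(key, bins)). Repeating a plain put that overwrites the same bins is harmless, but non-idempotent operations such as increment or append should only be retried if you are sure the failed attempt was not applied. When the last attempt also fails, the exception reaches the caller, which is the place to save the record (a local file, a queue) for a later replay.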

What worries me most is how to avoid losing data when this kind of exception occurs.


The drives are unable to keep up with the load. Can you throttle the load?


@kporter I can, but what I’m interested in is when I have to throttle the load.
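One possible way to decide when to throttle is to let the errors themselves drive the rate: slow down sharply each time a device overload comes back, and speed up gently while writes succeed (additive increase, multiplicative decrease). The sketch below is only an illustration of that idea under my own assumptions; AimdThrottle, its methods, and all the default numbers are hypothetical and would need tuning against a real cluster.

```python
class AimdThrottle:
    """Tracks a target write rate (writes per second) for one loader process."""

    def __init__(self, initial_rate=1000.0, min_rate=10.0, max_rate=10000.0,
                 increase=10.0, decrease_factor=0.5):
        self.rate = float(initial_rate)
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.increase = increase
        self.decrease_factor = decrease_factor

    def on_success(self):
        # Additive increase: probe for spare capacity a little at a time.
        self.rate = min(self.max_rate, self.rate + self.increase)

    def on_overload(self):
        # Multiplicative decrease: back off quickly when the device is overloaded.
        self.rate = max(self.min_rate, self.rate * self.decrease_factor)

    def delay(self):
        # Seconds to wait between consecutive writes at the current rate.
        return 1.0 / self.rate
```

The loader would sleep for throttle.delay() before each write, call on_success() after a write that went through, and call on_overload() whenever Error Code 18 is caught, before retrying that write.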