I am facing “AEROSPIKE_ERR_DEVICE_OVERLOAD” during writes (using pipeline writes) and wondering whether there is any C client API through which the client can periodically pull “write-q” (un-flushed write buffers) information from the server, to determine that device writes are not scaling. The goal is to implement a back-pressure mechanism in the client code to slow things down once the server buffers start approaching the limit. I am using server version 4.3.8.
I have thought about back pressure too, but finally decided that it is not as simple to implement as it seems, because:

- the info protocol, as @Albot suggested, returns the write-q value on a per-device basis
- a single server may contain multiple devices
- there are multiple servers in the cluster
So imagine that there is just a single slow disk in a cluster of multiple machines and its write-q increases. How do you decide whether to stop writes or to continue, if you don’t know which device a particular record will land on?
Moreover, with the info protocol there is no guarantee that you won’t get a device overload error between two info requests.
What I’d like to understand in my question here is whether there is any possibility of losses when a client writes a lot of data with the non-blocking API, fills up the write cache (the max-write-cache config option), gets an error, then sleeps for some period of time and retries all the requests that previously led to device overload errors.
Your client write is not written to the write-block buffer if the write-q is full. The first thing that is checked, once a transaction is determined to be a write, is whether the device it is destined for (which is deterministic) has a full write-q. You can validate this by inspecting the CE code.
@Albot Yes, retry with delay will be there in case the feedback mechanism is not working as expected. We are trying to build a proactive mechanism so that the client can slow down under pressure and recover once the server settles.
@szhem We will mostly be working with a homogeneous set of disks and machines, so we hope they will operate at the same performance level. But you are right, there are variations, such as a hot spot on a disk or the background defragmenter, which can consume a certain disk’s bandwidth disproportionately. We probably need to do more math to first identify any laggard disk and then calculate overall cluster throughput assuming that all other disks operate at the same (laggard) disk speed. With this assumption, the better disks will be under-utilized.