Hi there, I’m running Aerospike 3.7.4.1 and am having some trouble tuning my read and write workloads to play nicely with each other. The cluster consists of 6 machines on Google Compute Engine with 2x NVMe SSDs apiece.
Under normal circumstances the cluster services about 15K write TPS. I got this figure by sampling the stat_write_reqs counter over time. We have a batch process that generates a high write volume for a short period. I believe the writes generated by this process touch more bins than our normal writes do, but I can't confirm this. (Is there a histogram measuring the size of each write being serviced at any given point in time?)
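For reference, here's roughly how I'm sampling the counter; a minimal sketch using the Go client (the seed host is a placeholder, and I'm assuming the client's RequestNodeInfo helper and the usual semicolon-delimited statistics format):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"

	as "github.com/aerospike/aerospike-client-go"
)

func main() {
	client, err := as.NewClient("10.0.0.1", 3000) // placeholder seed host
	if err != nil {
		panic(err)
	}
	defer client.Close()

	prev := int64(-1)
	for {
		var total int64
		for _, node := range client.GetNodes() {
			// "statistics" comes back as one semicolon-delimited key=value string.
			res, err := as.RequestNodeInfo(node, "statistics")
			if err != nil {
				continue
			}
			for _, kv := range strings.Split(res["statistics"], ";") {
				if strings.HasPrefix(kv, "stat_write_reqs=") {
					n, _ := strconv.ParseInt(strings.TrimPrefix(kv, "stat_write_reqs="), 10, 64)
					total += n
				}
			}
		}
		if prev >= 0 {
			// The counter is cumulative, so the one-second delta is the write TPS.
			fmt.Printf("cluster write TPS: %d\n", total-prev)
		}
		prev = total
		time.Sleep(time.Second)
	}
}
```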
During the batch process, the write TPS climbs to about 25K (see [1], top-left quadrant), and read requests suddenly begin to queue up (see [1], bottom-right quadrant, a graph of the batch_queue metric; we only use BatchGet() for reads). At the same time we observed a strongly correlated bump in read duration and a large jump in the 'await' I/O stat (see [2]).
So the hypothesis is this: during the write storm, write I/Os unfairly beat out queued read operations, so reads issued during the storm are delayed. We would therefore like to favor reads over writes at all times, so that a write storm cannot affect read latency. How can we rate limit total write I/O to this end?
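If there's no server-side knob for this, I assume the fallback is throttling on the client side. A sketch of the kind of wrapper I mean (entirely hypothetical, with illustrative numbers), using golang.org/x/time/rate:

```go
package main

import (
	"context"
	"log"

	as "github.com/aerospike/aerospike-client-go"
	"golang.org/x/time/rate"
)

// throttledPut is a hypothetical wrapper: it blocks until the limiter grants
// a token, capping how fast this process submits writes to the cluster.
func throttledPut(ctx context.Context, lim *rate.Limiter, c *as.Client,
	policy *as.WritePolicy, key *as.Key, bins as.BinMap) error {
	if err := lim.Wait(ctx); err != nil {
		return err
	}
	return c.Put(policy, key, bins)
}

func main() {
	client, err := as.NewClient("10.0.0.1", 3000) // placeholder seed host
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Allow 15K writes/sec with a small burst; both numbers are illustrative.
	lim := rate.NewLimiter(rate.Limit(15000), 500)

	key, _ := as.NewKey("test", "demo", "k1") // placeholder key
	if err := throttledPut(context.Background(), lim, client, nil, key,
		as.BinMap{"bin1": 1}); err != nil {
		log.Fatal(err)
	}
}
```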
My initial approach was the transaction-queues parameter, which the docs recommend setting to the number of cores on the machine. Each of our 6 cluster members has 8 cores, so I set it to 8 and ran an A/B test across members; it didn't seem to make much of a difference. Hence this post: is there another well-defined way to do this?
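For concreteness, the change I A/B tested was just this in the service context of aerospike.conf on the test members:

```
service {
    transaction-queues 8    # one queue per core, per the docs' recommendation
}
```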
PS. We also tried submitting write requests with MEDIUM priority and read requests with HIGH priority, which didn't seem to do much. I think the problem is not queueing so much as raw I/O volume on the SSDs. If we can defer writes for the sake of keeping reads performant, that's what we want; we don't care if writes take a long time to complete.
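For completeness, this is roughly how we set those priorities from the Go client (a sketch against the v1 client API, where BatchGet takes the base policy; host and key are placeholders):

```go
package main

import (
	"fmt"
	"log"

	as "github.com/aerospike/aerospike-client-go"
)

func main() {
	client, err := as.NewClient("10.0.0.1", 3000) // placeholder seed host
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Writes at MEDIUM priority.
	wp := as.NewWritePolicy(0, 0)
	wp.Priority = as.MEDIUM
	key, _ := as.NewKey("test", "demo", "k1") // placeholder key
	if err := client.Put(wp, key, as.BinMap{"bin1": 1}); err != nil {
		log.Fatal(err)
	}

	// Reads (we only use BatchGet) at HIGH priority.
	rp := as.NewPolicy()
	rp.Priority = as.HIGH
	records, err := client.BatchGet(rp, []*as.Key{key})
	fmt.Println(records, err)
}
```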
Thanks in advance for any advice that can be provided here.
[1]
[2]