Accumulated requests on client API

Original post by artursocha » Fri Mar 28, 2014 8:44 am

Hi, I was testing our application today. We have a 2-node setup with replication factor 2 and ‘return on master write’ enabled. The Aerospike dashboard showed 5k-20k (at peaks) TPS of async writes (puts, deletes) in total. 90% of writes took <8 ms (because of 1-2 secondary indexes per set).

Async client policy setup:

    asyncMaxThreads=16
    asyncMaxSelectorThreads=4
    asyncSelectorTimeout=0
    asyncMaxCommands=1000
    asyncMaxCommandAction=BLOCK
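For reference, a minimal sketch of that setup with the Aerospike Java client's AsyncClientPolicy. Note the hedges: the labels above do not all match the client's field names one-to-one, asyncMaxThreads is assumed here to correspond to the client's task thread pool, and the host, namespace, and set names are placeholders.

    import java.util.concurrent.Executors;

    import com.aerospike.client.AerospikeException;
    import com.aerospike.client.Bin;
    import com.aerospike.client.Key;
    import com.aerospike.client.async.AsyncClient;
    import com.aerospike.client.async.AsyncClientPolicy;
    import com.aerospike.client.async.MaxCommandAction;
    import com.aerospike.client.listener.WriteListener;

    public class AsyncWriteSketch {
        public static void main(String[] args) throws Exception {
            AsyncClientPolicy policy = new AsyncClientPolicy();
            policy.asyncSelectorThreads = 4;   // "asyncMaxSelectorThreads" above
            policy.asyncSelectorTimeout = 0;
            policy.asyncMaxCommands = 1000;    // at most 1000 commands in flight
            policy.asyncMaxCommandAction = MaxCommandAction.BLOCK; // callers block at the limit
            // Assumption: "asyncMaxThreads=16" maps to the callback task thread pool.
            policy.asyncTaskThreadPool = Executors.newFixedThreadPool(16);

            AsyncClient client = new AsyncClient(policy, "127.0.0.1", 3000);

            Key key = new Key("test", "demo", "key1"); // placeholder namespace/set/key
            client.put(null, new WriteListener() {
                public void onSuccess(Key k) { /* master acknowledged the write */ }
                public void onFailure(AerospikeException e) { e.printStackTrace(); }
            }, key, new Bin("value", 42));

            Thread.sleep(100); // crude wait for the async callback in this sketch
            client.close();
        }
    }

With asyncMaxCommandAction=BLOCK, once asyncMaxCommands writes are in flight every further put/delete blocks in the calling thread until a slot frees up, which matches the stall described below.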

At the beginning it worked fine, but then write operations started to accumulate and were eventually blocked (as per asyncMaxCommandAction). I suspect that the problem lies on the server side. I would appreciate any hints on which config options to check first.
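One way to confirm a server-side backlog is to sample the transaction queue depth from the statistics info command; a sketch, assuming asinfo is installed and noting that exact statistic names vary by server version:

    asinfo -v statistics -l | grep queue

A steadily growing queue value would point at the server's transaction queues rather than the client.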

Some of the server config that might be relevant:

    transaction-queues 16
    transaction-threads-per-queue 8
    proto-fd-max 15000
    migrate-xmit-hwm 200
    migrate-xmit-lwm 75
    migrate-threads 50

namespace:

    replication-factor 2
    memory-size 100G
    default-ttl 0            # never - it will be handled per record
    high-water-memory-pct 70 # evict data if memory utilization is greater than 70%
    high-water-disk-pct 60   # evict data if disk utilization is greater than 60%

    storage-engine device {
        device /dev/sdb      # raw partition; must not be mounted by the file system
        scheduler-mode noop
        write-block-size 128K
        data-in-memory true
    }

Most other settings are at their default values.

thanks, Artur

Post by young » Wed Apr 23, 2014 3:30 pm

Sorry it took some time to respond. Generally we find that the number of transaction threads and queues is optimized differently for RAM and SSD namespaces:

For RAM:

    transaction-queues 4
    transaction-threads-per-queue 4

For SSD (your case):

    transaction-queues 8
    transaction-threads-per-queue 8
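For context, both settings live in the service stanza of aerospike.conf; a sketch with the SSD values above (the proto-fd-max line is carried over from the original post, not a new recommendation):

    service {
        transaction-queues 8
        transaction-threads-per-queue 8
        proto-fd-max 15000
    }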

You may want to set the migration threads back to the default (or simply delete the setting) to start the tests. A migrate-threads value of 50 is very likely too high.
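If your server version supports it, migrate-threads can also be changed dynamically without a restart; a sketch, assuming the default of 1 migrate thread (check your version's docs before relying on this):

    asinfo -v "set-config:context=service;migrate-threads=1"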