Hi, we have a new cluster installed in AWS and are seeing high read latency on some of the nodes.
$ asmonitor -e "latency -k reads"
3 hosts in cluster: ***
====reads====
          timespan                ops/sec   >1ms   >8ms  >64ms
***:3000  20:22:38-GMT->20:22:48    217.4  90.85  88.22  66.24
***:3000  20:22:45-GMT->20:22:55    192.8   2.96   0.00   0.00
***:3000  20:22:45-GMT->20:22:55    221.5  67.36  62.75  16.79
I’ve been troubleshooting this for a while and noticed that on this page it states:
If the write counter (“wr”) is not going down, it means requests are getting backed up. Check for a growth pattern.
So, I checked and this is what I’m seeing on one of the “bad” nodes:
$ sudo tail -200f /var/log/aerospike/aerospike.log | grep trans_in_progress
Oct 07 2014 20:17:38 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 14 prox 0 wait 0 ::: q 17 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8894, 8842) : hb 3 : fab 44
Oct 07 2014 20:17:48 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 16 prox 0 wait 0 ::: q 34 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8894, 8842) : hb 3 : fab 44
Oct 07 2014 20:17:58 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 14 prox 0 wait 0 ::: q 27 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8894, 8842) : hb 3 : fab 44
Oct 07 2014 20:18:08 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 15 prox 0 wait 0 ::: q 31 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8898, 8846) : hb 3 : fab 44
And I’m seeing this on the “good” nodes:
$ sudo tail -200f /var/log/aerospike/aerospike.log | grep trans_in_progress
Oct 07 2014 19:59:45 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (15, 8480, 8465) : hb 3 : fab 44
Oct 07 2014 19:59:55 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (15, 8480, 8465) : hb 3 : fab 44
Oct 07 2014 20:00:05 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (15, 8480, 8465) : hb 3 : fab 44
Oct 07 2014 20:00:15 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (15, 8480, 8465) : hb 3 : fab 44
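To check for the growth pattern the page mentions without eyeballing raw lines, the "wr" and "q" counters can be pulled out with a short script. This is only a minimal sketch, assuming the log line format shown above. The names `parse_line` and `summarize`, and the "strictly increasing" growth test, are my own assumptions and not anything defined by Aerospike.

```python
import re

# Matches the timestamp plus the wr/prox/wait/q counters in a trans_in_progress
# line, in the format seen in the log excerpts above (assumed, not a spec).
LINE_RE = re.compile(
    r"(?P<ts>\w{3} \d{2} \d{4} \d{2}:\d{2}:\d{2}) GMT: .*"
    r"trans_in_progress: wr (?P<wr>\d+) prox (?P<prox>\d+) wait (?P<wait>\d+)"
    r" ::: q (?P<q>\d+)"
)

def parse_line(line):
    """Return (timestamp, wr, q) for a trans_in_progress line, else None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m.group("ts"), int(m.group("wr")), int(m.group("q"))

def summarize(lines):
    """Summarize the wr and q counters over a sequence of log lines."""
    samples = [s for s in map(parse_line, lines) if s is not None]
    if not samples:
        return None
    wr = [s[1] for s in samples]
    q = [s[2] for s in samples]
    return {
        "samples": len(samples),
        "wr_min": min(wr), "wr_max": max(wr),
        "q_min": min(q), "q_max": max(q),
        # Hypothetical heuristic: "growing" means every sample is strictly
        # higher than the one before it.
        "wr_growing": len(wr) > 1 and all(b > a for a, b in zip(wr, wr[1:])),
    }

# The four lines from the "bad" node above, copied verbatim.
BAD_NODE_LINES = [
    "Oct 07 2014 20:17:38 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 14 prox 0 wait 0 ::: q 17 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8894, 8842) : hb 3 : fab 44",
    "Oct 07 2014 20:17:48 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 16 prox 0 wait 0 ::: q 34 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8894, 8842) : hb 3 : fab 44",
    "Oct 07 2014 20:17:58 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 14 prox 0 wait 0 ::: q 27 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8894, 8842) : hb 3 : fab 44",
    "Oct 07 2014 20:18:08 GMT: INFO (info): (thr_info.c::4416) trans_in_progress: wr 15 prox 0 wait 0 ::: q 31 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (52, 8898, 8846) : hb 3 : fab 44",
]

print(summarize(BAD_NODE_LINES))
```

On the four "bad" node lines this reports wr between 14 and 16 and q between 17 and 34, with no strictly increasing trend in wr, so the counters look elevated but steady rather than growing over this short window.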
What’s interesting is that I’m not even doing any writes or deletes. I’m only doing reads. I’m not sure where to go from here with the troubleshooting. Any help would be appreciated.
Thanks, Ben