How to simulate non-zero values for client_read_error and client_write_error


#1

I’m testing the Aerospike metrics and would like to see non-zero values for client_read_error and/or client_write_error by any means. I tried the Aerospike-provided Java benchmark client, and by tweaking the tps and some other parameters I could see read/write errors in the CLI, but the monitoring metrics still report zeroes.

The doc says client_read_error indicates the "Number of client read transaction errors". Does this count errors raised by Aerospike itself for any reason, or could it also be triggered by a bad query from the user?


#2

Could you provide the applicable portions of the benchmark log? I would like to see what errors were being reported by the client.


#3

Sure, I’m testing this on RHEL 7.2 with the Aerospike community version.

# ./run_benchmarks -h 127.0.0.1 -p 3000 -n test -k 100000000 -b 1 -o B:1400 -w RU,50 -g 20000 -T 2 -z 50

Benchmark: 127.0.0.1 3000, namespace: test, set: testset, threads: 50, workload: READ_UPDATE
read: 50% (all bins: 100%, single bin: 0%), write: 50% (all bins: 100%, single bin: 0%)
keys: 100000000, start key: 0, transactions: 0, bins: 1, random values: false, throughput: 20000 tps
read policy:
    socketTimeout: 2, totalTimeout: 2, maxRetries: 2, sleepBetweenRetries: 0
    consistencyLevel: CONSISTENCY_ONE, replica: SEQUENCE, reportNotFound: false
write policy:
    socketTimeout: 2, totalTimeout: 2, maxRetries: 2, sleepBetweenRetries: 500
    commitLevel: COMMIT_ALL
Sync: connPoolsPerNode: 1
bin[0]: byte[1400]
debug: false
2017-06-13 14:53:43.525 INFO Thread main Add node BB9ACB2B93E16FA 127.0.0.1 3000
2017-06-13 14:53:44.900 write(tps=16595 timeouts=1891 errors=1178) read(tps=16403 timeouts=1945 errors=1059) total(tps=32998 timeouts=3836 errors=2237)
2017-06-13 14:53:45.901 write(tps=9920 timeouts=294 errors=9) read(tps=10130 timeouts=300 errors=12) total(tps=20050 timeouts=594 errors=21)
2017-06-13 14:53:46.901 write(tps=9884 timeouts=1861 errors=1156) read(tps=10169 timeouts=1891 errors=1095) total(tps=20053 timeouts=3752 errors=2251)
2017-06-13 14:53:47.903 write(tps=10034 timeouts=409 errors=98) read(tps=10102 timeouts=419 errors=88) total(tps=20136 timeouts=828 errors=186)
2017-06-13 14:53:48.903 write(tps=10134 timeouts=1035 errors=243) read(tps=9923 timeouts=998 errors=278) total(tps=20057 timeouts=2033 errors=521)
2017-06-13 14:53:49.904 write(tps=9999 timeouts=652 errors=78) read(tps=10073 timeouts=595 errors=73) total(tps=20072 timeouts=1247 errors=151)
2017-06-13 14:53:50.904 write(tps=9986 timeouts=741 errors=368) read(tps=10076 timeouts=727 errors=356) total(tps=20062 timeouts=1468 errors=724)
2017-06-13 14:53:51.904 write(tps=10124 timeouts=446 errors=164) read(tps=9924 timeouts=457 errors=213) total(tps=20048 timeouts=903 errors=377)

#4

If a read fails for a reason other than timeout or not_found, then client_read_error is incremented; otherwise either client_read_timeout or client_read_not_found is incremented.

Likewise, if a write were to fail for reasons other than timeout then client_write_error is incremented.

See Java Client error codes for a list of errors reported by the Java client.
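
The rule above can be sketched in a few lines of Python. This is only an illustration of the classification described in this post, not the server's actual code: the function names and the outcome strings are hypothetical, and the client_read_success, client_write_success and client_write_timeout counter names are assumed by analogy with the counters named above.

```python
def read_metric(outcome: str) -> str:
    """Return the counter a client read with this outcome would increment."""
    if outcome == "ok":
        return "client_read_success"  # assumed name, not mentioned in this thread
    if outcome == "timeout":
        return "client_read_timeout"
    if outcome == "not_found":
        return "client_read_not_found"
    # Any other failure is counted as a generic read error.
    return "client_read_error"


def write_metric(outcome: str) -> str:
    """Return the counter a client write with this outcome would increment."""
    if outcome == "ok":
        return "client_write_success"  # assumed name
    if outcome == "timeout":
        return "client_write_timeout"  # assumed by analogy with the read case
    # Writes have no not_found case, so everything else is an error.
    return "client_write_error"


if __name__ == "__main__":
    # Hypothetical outcome labels, used only to exercise the rule.
    for outcome in ("ok", "timeout", "not_found", "some_other_failure"):
        print(outcome, "->", read_metric(outcome), "/", write_metric(outcome))
```

Under this rule, a benchmark run whose failures are all timeouts would leave client_read_error and client_write_error at zero.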


#5

Okay, thanks very much. My timeout parameter was too small. Now I can see the metrics with non-zero values.


#6

If I follow, the server likely counted fewer timeouts than the client did, because the server had already sent the response and the client issued a client-side timeout while the response was in transit.
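
A toy model of that reasoning, assuming the client's clock covers server processing plus the network round trip while the server only measures its own processing time. The function and parameter names are hypothetical, and the timings are made up apart from the 2 ms client timeout, which matches totalTimeout: 2 in the log above.

```python
def classify_timeout(server_ms: float, network_rtt_ms: float,
                     client_timeout_ms: float, server_timeout_ms: float):
    """Return (client_timed_out, server_timed_out) for one transaction."""
    # The server only sees its own processing time.
    server_timed_out = server_ms > server_timeout_ms
    # The client also waits for the request and response to cross the network.
    client_timed_out = (server_ms + network_rtt_ms) > client_timeout_ms
    return client_timed_out, server_timed_out


if __name__ == "__main__":
    # Server answers in 1 ms, but the round trip adds 1.5 ms, so the
    # client gives up at 2 ms while the server saw a normal transaction.
    print(classify_timeout(server_ms=1.0, network_rtt_ms=1.5,
                           client_timeout_ms=2.0, server_timeout_ms=2.0))
```

In this model every server-side timeout with the same limit is also a client-side timeout, but not the other way around, which would explain the client reporting more timeouts than the server.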