Hello, I am debugging a scenario where we see a high number of Aerospike Java client timeout exceptions.
The client is a Spring Boot REST web service running on EC2 instances (EBS). The exceptions become more frequent while an EC2 instance is being added to the cluster. During that window read latency is very high, so more reads exceed the timeout threshold and this exception is thrown.
I wanted to check in this forum if the following are possible causes:
- Does the Java client have a cache warming phase? I think this is unlikely, but wanted to check.
- The REST service used to both write and read the data. To reduce the load, we moved the write operations to a Spark job on AWS EMR that writes to Aerospike. I started seeing this issue after that move. Could read latency be affected when a large dataset is loaded into Aerospike from outside the Java client?
Please give any suggestions to tackle this. The namespace configuration is:
namespace t1 {
    replication-factor 2
    memory-size 25G
    high-water-memory-pct 70
    high-water-disk-pct 60
    default-ttl 4d
    single-bin true
    partition-tree-sprigs 4096
    storage-engine memory
}
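For context on the bulk-load question, here is a rough sketch of the memory threshold this configuration implies. It assumes `high-water-memory-pct` is applied as a percentage of `memory-size` on each node, which is my understanding of the setting; the class and method names are made up for illustration.

```java
public class MemoryHeadroom {

    // Memory usage (in GiB) at which the high-water mark is reached,
    // assuming the percentage applies to the namespace's memory-size.
    static double highWaterGiB(double memorySizeGiB, int highWaterPct) {
        return memorySizeGiB * highWaterPct / 100.0;
    }

    public static void main(String[] args) {
        double memorySizeGiB = 25.0; // memory-size 25G
        int highWaterPct = 70;       // high-water-memory-pct 70

        System.out.println("High-water mark (GiB): "
                + highWaterGiB(memorySizeGiB, highWaterPct));
    }
}
```

If the Spark load pushes a node's usage past that mark, the node would presumably start evicting records while also serving reads, which is one thing I would like to rule out.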
Here is the full stack trace.
com.aerospike.client.AerospikeException$Timeout: Client timeout: timeout=30 iterations=1 lastNode=BB90B6F2699290E 10.1.99.205 3000
    at com.aerospike.client.async.NettyCommand.totalTimeout(NettyCommand.java:513)
    at com.aerospike.client.async.NettyCommand.timeout(NettyCommand.java:476)
    at com.aerospike.client.async.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:146)
    at com.aerospike.client.async.HashedWheelTimer$HashedWheelTimeout.access$700(HashedWheelTimer.java:125)
    at com.aerospike.client.async.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:186)
    at com.aerospike.client.async.HashedWheelTimer.run(HashedWheelTimer.java:118)
    at com.aerospike.client.async.ScheduleTask.run(ScheduleTask.java:40)
    at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
    at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)
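To check whether the timeouts concentrate on one node (for example the newly added one), I am tallying the `lastNode` field from the exception messages in our service logs. Below is a minimal sketch of that tally using only the JDK. The regular expression is inferred from the single message shown above, so it is an assumption about the format rather than anything documented, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimeoutsByNode {

    // Pattern inferred from the one message in this post:
    // "Client timeout: timeout=30 iterations=1 lastNode=<id> <ip> <port>"
    private static final Pattern TIMEOUT_LINE = Pattern.compile(
            "Client timeout: timeout=(\\d+) iterations=(\\d+) lastNode=(\\S+) (\\S+) (\\d+)");

    // Counts timeout messages per node, keyed as "<id> <ip>:<port>".
    // Lines that do not contain a timeout message are ignored.
    public static Map<String, Integer> countByNode(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = TIMEOUT_LINE.matcher(line);
            if (m.find()) {
                String node = m.group(3) + " " + m.group(4) + ":" + m.group(5);
                counts.merge(node, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "com.aerospike.client.AerospikeException$Timeout: Client timeout: "
                        + "timeout=30 iterations=1 lastNode=BB90B6F2699290E 10.1.99.205 3000",
                "at com.aerospike.client.async.NettyCommand.totalTimeout(NettyCommand.java:513)");
        System.out.println(countByNode(sample));
    }
}
```

If most of the counts land on a single node, that would point at that node (or at data movement involving it) rather than at the client itself.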
Thanks for reading.