Hi,
I have been running a single-node Aerospike cluster in production for a year, and recently we have started getting this error:
com.aerospike.client.AerospikeException$Timeout: Client timeout: timeout=5000 iterations=1 failedNodes=0 failedConns=0
    at com.aerospike.client.command.SyncCommand.execute(SyncCommand.java:131)
    at com.aerospike.client.AerospikeClient.get(AerospikeClient.java:485)
    at com.shuttl.base.cache.aerospike.AerospikeCache.get(AerospikeCache.java:63)
    at com.shuttl.base.cache.aerospike.AerospikeCache.get(AerospikeCache.java:52)
The load is very low, only around 500K per second.
Aerospike runs on an Amazon EC2 c4.large instance:
Disk size - 20 GB, RAM size - 4 GB, CPU - 2 cores
My Aerospike config looks like this:
# Aerospike database configuration file.

# service context definition
service {
    user root
    group root
    paxos-single-replica-limit 1
    pidfile /var/run/aerospike/asd.pid
    proto-fd-max 15000
    service-threads 4
    transaction-queues 4
    transaction-threads-per-queue 4
}

# logging context definition
logging {
    file /var/log/aerospike/aerospike.log {
        context any info
    }
}

# network context definition
network {
    service {
        address any
        port 3000
    }
    fabric {
        address any
        port 3001
    }
    info {
        address any
        port 3003
    }
    heartbeat {
        address any
        interval 150
        mode multicast
        port 9918
        timeout 10
    }
}

# namespace context: rms
namespace rms {
    default-ttl 30d
    memory-size 2G
    replication-factor 1
    storage-engine device {
        data-in-memory true
        file /opt/aerospike/data/rms1.dat
        filesize 4G
    }
}

namespace driver {
    replication-factor 1
    memory-size 512M
    default-ttl 600 # 30 days, use 0 to never expire/evict.
    storage-engine device {
        file /opt/aerospike/data/driver1.dat
        filesize 1G
        data-in-memory true # Store data in memory in addition to file.
    }
}

namespace vas_driver {
    replication-factor 1
    memory-size 512M
    default-ttl 600 # 30 days, use 0 to never expire/evict.
    storage-engine device {
        file /opt/aerospike/data/vas_driver1.dat
        filesize 1G
        data-in-memory true # Store data in memory in addition to file.
    }
}
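As a quick sanity check on the config above, here is a rough Python sketch that adds up the memory-size of each namespace so the total can be compared with the 4 GB of RAM on the box. The CONF_EXCERPT string is an abridged copy of the namespace sections, and the helper names (parse_size, namespace_memory) are made up for this example. It assumes K/M/G/T suffixes are powers of 1024.

```python
import re

# Assumption: size suffixes are binary multiples (1K = 1024 bytes).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Abridged copy of the namespace sections from the config in this post.
CONF_EXCERPT = """
namespace rms {
    default-ttl 30d
    memory-size 2G
}
namespace driver {
    memory-size 512M
    default-ttl 600 # 30 days, use 0 to never expire/evict.
}
namespace vas_driver {
    memory-size 512M
}
"""


def parse_size(text):
    """Convert a size such as '512M' or '2G' into bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNITS.get(match.group(2), 1)


def namespace_memory(conf_text):
    """Return {namespace name: configured memory-size in bytes}."""
    sizes = {}
    current = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        start = re.match(r"namespace\s+(\S+)\s*\{", line)
        if start:
            current = start.group(1)
            continue
        mem = re.match(r"memory-size\s+(\S+)", line)
        if mem and current is not None:
            sizes[current] = parse_size(mem.group(1))
    return sizes


sizes = namespace_memory(CONF_EXCERPT)
total_gib = sum(sizes.values()) / 1024**3
for name, size in sizes.items():
    print(f"{name}: {size / 1024**2:.0f} MiB")
print(f"total configured memory-size: {total_gib:.1f} GiB")
```

The total only reflects what the namespaces are allowed to use, not what the process actually uses at any moment.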
There are no errors or warnings in the Aerospike log either:
- I checked for stop_writes and hwm_breached, in case of a storage issue on any namespace. Nothing is logged.
- I also checked for "could not allocate storage". There is no such issue.
The top command also shows only 1.5 GB in use by the Aerospike process.
I debugged this further:
asinfo -v "statistics" -l
This is the command to see all the error counters. The variables whose names start with err_ should be 0 (for new ones only).
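To avoid scanning the statistics output by eye, a small Python sketch like the following can pick out the err_ counters that are not zero. It assumes the output is a list of name=value pairs, one per line with -l or separated by semicolons without it. SAMPLE_STATS is illustrative: only err_tsvc_requests=201 is a value from this post, the other names and values are placeholders.

```python
# Illustrative sample; only err_tsvc_requests=201 comes from the post.
SAMPLE_STATS = """\
cluster_size=1
err_out_of_space=0
err_tsvc_requests=201
err_duplicate_proxy_request=0
"""


def nonzero_error_counters(stats_text):
    """Return {name: count} for every err_* counter that is not zero."""
    counters = {}
    # Accept both one-pair-per-line and semicolon-separated output.
    for line in stats_text.replace(";", "\n").splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep or not name.startswith("err_"):
            continue
        try:
            count = int(value)
        except ValueError:
            continue  # skip anything that is not a plain integer
        if count != 0:
            counters[name] = count
    return counters


for name, count in sorted(nonzero_error_counters(SAMPLE_STATS).items()):
    print(f"{name} = {count}")
```

Running it on two captures taken a few minutes apart and comparing the results would show which counters are still growing.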
I found that the counter err_tsvc_requests is not 0. It is 201 for me and is continuously increasing. I then set the tsvc log context to debug level using:
asinfo -p 3000 -v "set-log:tsvc=debug"
The log then started showing this line:
Jun 13 2017 09:54:00 GMT: DEBUG (tsvc): (thr_tsvc.c:as_rw_process_result:287) write start failed: rv -1 proto result 2
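To count how often this line appears and which result codes it carries, a Python sketch along these lines can split a log line into its fields. The regular expression is inferred from the single log line above, so treating it as the general log layout is an assumption, and the function name parse_log_line is made up for this example.

```python
import re

# Layout inferred from the one log line quoted in this post (an assumption).
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3} \d{2} \d{4} \d{2}:\d{2}:\d{2} GMT): "
    r"(?P<level>\w+) \((?P<context>\w+)\): "
    r"\((?P<file>[^:]+):(?P<function>[^:]+):(?P<line>\d+)\) "
    r"(?P<message>.*)$"
)
RESULT_PATTERN = re.compile(r"rv (?P<rv>-?\d+) proto result (?P<proto>\d+)")

SAMPLE_LINE = (
    "Jun 13 2017 09:54:00 GMT: DEBUG (tsvc): "
    "(thr_tsvc.c:as_rw_process_result:287) "
    "write start failed: rv -1 proto result 2"
)


def parse_log_line(text):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    # Pull out the numeric codes when the message carries them.
    result = RESULT_PATTERN.search(fields["message"])
    if result:
        fields["rv"] = int(result.group("rv"))
        fields["proto_result"] = int(result.group("proto"))
    return fields


parsed = parse_log_line(SAMPLE_LINE)
print(parsed["context"], parsed["function"], parsed["rv"], parsed["proto_result"])
```

Feeding every line of the log file through parse_log_line and tallying proto_result would show whether the failures all share one result code.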
I have not been able to work out what the issue could be. Can someone please guide me on where I should look next?