Aerospike Client timeout Error


#1

Hi,

I have been running a single-node Aerospike cluster in production for a year, and recently we started getting this error:

com.aerospike.client.AerospikeException$Timeout: Client timeout: timeout=5000 iterations=1 failedNodes=0 failedConns=0
    at com.aerospike.client.command.SyncCommand.execute(SyncCommand.java:131)
    at com.aerospike.client.AerospikeClient.get(AerospikeClient.java:485)
    at com.shuttl.base.cache.aerospike.AerospikeCache.get(AerospikeCache.java:63)
    at com.shuttl.base.cache.aerospike.AerospikeCache.get(AerospikeCache.java:52)

The load is very low, only around 500K per second.

Aerospike runs on an Amazon EC2 c4.large instance:

Disk size - 20 GB
RAM size - 4 GB
CPU - 2 cores

My Aerospike config looks like this:

# Aerospike database configuration file.

# service context definition
service {
  user root
  group root
  paxos-single-replica-limit 1
  pidfile /var/run/aerospike/asd.pid
  proto-fd-max 15000
  service-threads 4
  transaction-queues 4
  transaction-threads-per-queue 4
}

# logging context definition
logging {
  file /var/log/aerospike/aerospike.log {
    context any info
  }
}

# network context definition
network {
  service {
    address any
    port 3000
  }

  fabric {
    address any
    port 3001
  }

  info {
    address any
    port 3003
  }

  heartbeat {
    address any
    interval 150
    mode multicast
    port 9918
    timeout 10
  }
}


# namespace context: rms
namespace rms {
  default-ttl 30d
  memory-size 2G
  replication-factor 1
  storage-engine device {
    data-in-memory true
    file /opt/aerospike/data/rms1.dat
    filesize 4G
  }
}

namespace driver {
        replication-factor 1
        memory-size 512M
        default-ttl 600 # 600 seconds (10 minutes); use 0 to never expire/evict.

        storage-engine device {
               file /opt/aerospike/data/driver1.dat
               filesize 1G
               data-in-memory true # Store data in memory in addition to file.
        }
}

namespace vas_driver {
        replication-factor 1
        memory-size 512M
        default-ttl 600 # 600 seconds (10 minutes); use 0 to never expire/evict.

        storage-engine device {
               file /opt/aerospike/data/vas_driver1.dat
               filesize 1G
               data-in-memory true # Store data in memory in addition to file.
        }
}

There are no errors or warnings in the Aerospike log either:

  1. I checked for stop_writes and hwm_breached, in case a storage issue affected a namespace; nothing was logged.
  2. I also checked for "could not allocate storage"; there is no such message.

According to top, the Aerospike process is using only 1.5 GB.

I debugged this further:

asinfo -v "statistics" -l

This command shows all the error counters. The variables whose names start with err_ should be 0 (for new ones only).
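For anyone who wants to repeat this check automatically, here is a minimal Java sketch that scans the text printed by the statistics command and keeps only the err_ counters that are not 0. It assumes the output is a list of name=value pairs, one per line with -l or separated by semicolons without it. The class name, the method name, and the sample values other than err_tsvc_requests=201 are made up for illustration and do not come from any Aerospike library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Aerospike client.
class ErrCounterCheck {

    /** Returns every err_* counter whose value is non-zero, in input order. */
    static Map<String, Long> nonZeroErrCounters(String statisticsOutput) {
        Map<String, Long> result = new LinkedHashMap<>();
        // Assumption: entries are name=value, split by newlines (-l) or ';'.
        for (String entry : statisticsOutput.split("[;\\r\\n]+")) {
            String pair = entry.trim();
            int eq = pair.indexOf('=');
            if (eq <= 0 || !pair.startsWith("err_")) {
                continue; // not an error counter
            }
            String name = pair.substring(0, eq);
            String value = pair.substring(eq + 1).trim();
            try {
                long count = Long.parseLong(value);
                if (count != 0) {
                    result.put(name, count);
                }
            } catch (NumberFormatException e) {
                // Ignore values that are not plain integers.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // err_tsvc_requests=201 is the value reported in this post;
        // the other two lines are placeholder values for the example.
        String sample = "err_tsvc_requests=201\n"
                + "err_hypothetical_counter=0\n"
                + "objects=12345";
        nonZeroErrCounters(sample)
                .forEach((name, count) -> System.out.println(name + " = " + count));
    }
}
```

Running the check at intervals and comparing the results would show which counters are still growing, rather than only which ones are non-zero.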

I found that the counter err_tsvc_requests is not 0; it is 201 for me and is continuously increasing. I then set the tsvc log level to debug using:

asinfo -p 3000 -v 'set-log:tsvc=debug'

The log then started showing this line:

Jun 13 2017 09:54:00 GMT: DEBUG (tsvc): (thr_tsvc.c:as_rw_process_result:287) write start failed: rv -1 proto result 2

I have not been able to work out what the issue could be. Can someone please guide me on how to proceed?


#2

Client and server versions?


#3

Java client version: 3.1.8

Server version: 3.7