I am running a single node cluster in my local box - 64bit ubuntu and non-SSD. The aerospike.conf is configured to store the data in the hard drive.
I hit the timeout error way too often. Almost every 2-3 mins.
Let me know if you have faced the same.
kporter
December 16, 2014, 7:02pm
2
Is a this data-in-memory namespace?
Could you provide a copy of /etc/aerospike/aerospike.conf
?`
Could you provide a link to your Aerospike logs?
Are you running your client from the same node?
How much load are you pushing?
No the data is not stored in memory primarily. The data is stored in the drive and there data-in-memory
flag enabled to true
which stores the data in memory in addition to file.
.conf
# Aerospike database configuration file.
# This stanza must come first.
service {
user root
group root
paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
pidfile /var/run/aerospike/asd.pid
service-threads 4
transaction-queues 4
transaction-threads-per-queue 4
proto-fd-max 15000
}
logging {
# Log file must be an absolute path.
file /var/log/aerospike/aerospike.log {
context any info
}
}
network {
service {
address any
port 3000
}
heartbeat {
mode multicast
address 239.1.99.222
port 9918
# To use unicast-mesh heartbeats, comment out the 3 lines above and
# use the following 4 lines instead.
# mode mesh
# port 3002
# mesh-address 10.1.1.1
# mesh-port 3002
interval 150
timeout 10
}
fabric {
port 3001
}
info {
port 3003
}
}
namespace test {
replication-factor 2
memory-size 4G
default-ttl 30d # 30 days, use 0 to never expire/evict.
storage-engine memory
}
namespace bar {
replication-factor 2
memory-size 4G
default-ttl 30d # 30 days, use 0 to never expire/evict.
#storage-engine memory
# To use file storage backing, comment out the line above and use the
# following lines instead.
storage-engine device {
file /opt/aerospike/data/bar.dat
filesize 16G
data-in-memory true # Store data in memory in addition to file.
}
}
namespace polls {
replication-factor 1
memory-size 4G
default-ttl 30d # 30 days, use 0 to never expire/evict.
#storage-engine memory
# To use file storage backing, comment out the line above and use the
# following lines instead.
storage-engine device {
file /opt/aerospike/data/polls.dat
filesize 16G
data-in-memory true # Store data in memory in addition to file.
}
}
The namspace I actively use is ‘polls’(please scroll down in the .conf file to see the configurations of the namespace ‘polls’)
Yes the client is on the same node.
I am not pushing a lot of load. Just a bunch of lists and maps that lengths up to max 10-20 items.
Thanks for your reply.
Running the client on the same node can result in increased timeouts, especially if the client process is resource intensive.
Likewise, a very low transaction throughput can also result in timeouts if the rate is so low that sockets are being closed.
What are you setting your timeout to?
What are you setting your timeout to?
Settings for timeout connections is 1000 ms.
I have increased the timeout and tried, but that doesn’t help.
Running the client on the same node can result in increased timeouts,
especially if the client process is resource intensive.
Yes, I am running the client and server on the same node(for development of course) but the load is very very less. I am not sure how this is causing such timeouts so very often.
Hm, could you provide the output of:
asadm -h HOST_IP -e "show stat; asinfo -v build"
When I run the command I get the following outout
~~~ stat ~~~
ERR: Unable to find any Aerospike nodes
~~~ -v build ~~~
ERR: Unable to find any Aerospike nodes
Aerospike version is 3.5.9
Operating System - ubuntu 14.04
Non-SSD, 64 bit machine.
Did you replace “HOST_IP” with the hosts IP?
If you did can you check if Aerospike is still running on that host?
If you are running the command on the host itself then the HOST_IP will default to localhost:
E.G. The following would seed from localhost.
asadm -e "show stat; asinfo -v build"
Sorry about that, here it is
~~~ stat ~~~ ~~test Bin Statistics~~~
NODE : 1
bin-names-quota: 32768
num-bin-names : 0
~~polls Bin Statistics~~
NODE : 1
bin-names-quota: 32768
num-bin-names : 25
~~~bar Bin Statistics~~~
NODE : 1
bin-names-quota: 32768
num-bin-names : 17
~~bar notification Set Statistics~~~
NODE : 1
n_objects : 5
ns_name : bar
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : notification
~~bar active_users Set Statistics~~~
NODE : 1
n_objects : 1
ns_name : bar
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : active_users
~~~~polls invite Set Statistics~~~~
NODE : 1
n_objects : 1
ns_name : polls
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : invite
~~~bar email_sent Set Statistics~~~
NODE : 1
n_objects : 1
ns_name : bar
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : email_sent
~polls notification Set Statistics~~
NODE : 1
n_objects : 1
ns_name : polls
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : notification
~~~~~polls feed Set Statistics~~~~~
NODE : 1
n_objects : 3
ns_name : polls
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : feed
~~~~~~bar post Set Statistics~~~~~~
NODE : 1
n_objects : 24
ns_name : bar
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : post
~~~~~polls post Set Statistics~~~~~ NODE : 1 n_objects : 5 ns_name : polls set-delete : false set-enable-xdr : use-default set-evict-hwm-count : 0 set-stop-write-count: 0 set_name : post
~~polls user_people_state Set Statistics~
NODE : 1
n_objects : 1
ns_name : polls
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : user_people_state
~polls active_users Set Statistics~~
NODE : 1
n_objects : 1
ns_name : polls
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : active_users
~~~~~~bar feed Set Statistics~~~~~~
NODE : 1
n_objects : 7
ns_name : bar
set-delete : false
set-enable-xdr : use-default
set-evict-hwm-count : 0
set-stop-write-count: 0
set_name : feed
~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : 1 batch_errors : 0 batch_initiate : 0 batch_queue : 0 batch_timeout : 0 batch_tree_count : 0 client_connections : 7 cluster_integrity : true cluster_key : D02C51D2614BF9A1 cluster_size : 1
data-used-bytes-memory : 15695 delete_queue : 0 err_duplicate_proxy_request : 0 err_out_of_space : 0 err_replica_non_null_node : 0 err_replica_null_node : 0 err_rw_cant_put_unique : 0 err_rw_pending_limit : 0 err_rw_request_not_found : 0 err_storage_queue_full : 0 err_sync_copy_null_master : 0 err_sync_copy_null_node : 0 err_tsvc_requests : 0 err_write_fail_bin_exists : 0 err_write_fail_bin_name : 0 err_write_fail_bin_not_found : 0 err_write_fail_forbidden : 0 err_write_fail_generation : 0 err_write_fail_generation_xdr : 0 err_write_fail_incompatible_type : 0 err_write_fail_key_exists : 0 err_write_fail_key_mismatch : 0 err_write_fail_not_found : 0 err_write_fail_noxdr : 0 err_write_fail_parameter : 0 err_write_fail_prole_delete : 0 err_write_fail_prole_generation : 0 err_write_fail_prole_unknown : 0 err_write_fail_record_too_big : 0 err_write_fail_unknown : 0 fabric_msgs_rcvd : 0 fabric_msgs_sent : 0 free-pct-disk : 99
free-pct-memory : 99 heartbeat_received_foreign : 0 heartbeat_received_self : 299
index-used-bytes-memory : 3200 info_queue : 0 migrate_msgs_recv : 0 migrate_msgs_sent : 0 migrate_num_incoming_accepted : 0 migrate_num_incoming_refused : 0 migrate_progress_recv : 0 migrate_progress_send : 0 migrate_rx_objs : 0 migrate_tx_objs : 0 objects : 50 ongoing_write_reqs : 0 partition_absent : 0 partition_actual : 12288 partition_desync : 0 partition_object_count : 50 partition_ref_count : 12288 partition_replica : 0 paxos_principal : BB995C9E9272D14 proxy_action : 0 proxy_in_progress : 0 proxy_initiate : 0 proxy_retry : 0 proxy_retry_new_dest : 0 proxy_retry_q_full : 0 proxy_retry_same_dest : 0 proxy_unproxy : 0 query_abort : 0 query_agg : 0 query_agg_abort : 0 query_agg_avg_rec_count : 0 query_agg_err : 0 query_agg_success : 0 query_avg_rec_count : 0 query_fail : 0 query_long_queue_full : 0 query_long_queue_size : 0 query_long_running : 0 query_lookup_abort : 0 query_lookup_avg_rec_count : 0 query_lookup_err : 0 query_lookup_success : 0 query_lookups : 0 query_reqs : 0 query_short_queue_full : 0 query_short_queue_size : 0 query_short_running : 0 query_success : 0 query_tracked : 0 queue : 0 read_dup_prole : 0 reaped_fds : 0 record_locks : 0 record_refs : 50 rw_err_ack_badnode : 0 rw_err_ack_internal : 0 rw_err_ack_nomatch : 0 rw_err_dup_cluster_key : 0 rw_err_dup_internal : 0 rw_err_dup_send : 0 rw_err_write_cluster_key : 0 rw_err_write_internal : 0 rw_err_write_send : 0
sindex-used-bytes-memory : 203070 sindex_gc_activity_dur : 0 sindex_gc_garbage_cleaned : 0 sindex_gc_garbage_found : 0 sindex_gc_inactivity_dur : 20000 sindex_gc_list_creation_time : 0 sindex_gc_list_deletion_time : 0 sindex_gc_locktimedout : 0 sindex_gc_objects_validated : 0 sindex_ucgarbage_found : 0 stat_cluster_key_err_ack_dup_trans_reenqueue : 0 stat_cluster_key_err_ack_rw_trans_reenqueue : 0 stat_cluster_key_partition_transaction_queue_count: 0 stat_cluster_key_prole_retry : 0 stat_cluster_key_regular_processed : 0 stat_cluster_key_trans_to_proxy_retry : 0 stat_cluster_key_transaction_reenqueue : 0 stat_delete_success : 0 stat_deleted_set_objects : 0 stat_duplicate_operation : 0 stat_evicted_objects : 0 stat_evicted_objects_time : 0 stat_evicted_set_objects : 0 stat_expired_objects : 0 stat_ldt_proxy : 0 stat_nsup_deletes_not_shipped : 0 stat_proxy_errs : 0 stat_proxy_reqs : 0 stat_proxy_reqs_xdr : 0 stat_proxy_success : 0 stat_read_errs_notfound : 0 stat_read_errs_other : 0 stat_read_reqs : 0 stat_read_reqs_xdr : 0 stat_read_success : 0 stat_rw_timeout : 0 stat_slow_trans_queue_batch_pop : 0 stat_slow_trans_queue_pop : 0 stat_slow_trans_queue_push : 0 stat_write_errs : 0 stat_write_errs_notfound : 0 stat_write_errs_other : 0 stat_write_reqs : 0 stat_write_reqs_xdr : 0 stat_write_success : 0 stat_xdr_pipe_miss : 0 stat_xdr_pipe_writes : 0 stat_zero_bin_records : 0 storage_defrag_corrupt_record : 0 sub-records : 0 system_free_mem_pct : 77 system_swapping : false
total-bytes-disk : 34359738368
total-bytes-memory : 12884901888 transactions : 21 tree_count : 0 tscan_aborted : 0 tscan_initiate : 0 tscan_pending : 0 tscan_succeeded : 0 udf_delete_err_others : 0 udf_delete_reqs : 0 udf_delete_success : 0 udf_lua_errs : 0 udf_query_rec_reqs : 0 udf_read_errs_other : 0 udf_read_reqs : 0 udf_read_success : 0 udf_replica_writes : 0 udf_scan_rec_reqs : 0 udf_write_err_others : 0 udf_write_reqs : 0 udf_write_success : 0 uptime : 31 used-bytes-disk : 27776
used-bytes-memory : 221965 waiting_transactions : 0 write_master : 0 write_prole : 0
~~~~~~~~~~~~~~~~~~~bar Namespace Statistics~~~~~~~~~~~~~~~~~~~
NODE : 1
allow_versions : false
available-bin-names : 32751
available_pct : 99
cold-start-evict-ttl : 4294967295
conflict-resolution-policy : generation
current-time : 169017220
data-in-memory : true
data-used-bytes-memory : 13720
default-ttl : 2592000
defrag-lwm-pct : 50
defrag-queue-min : 0
defrag-sleep : 1000
defrag-startup-minimum : 10
disallow-null-setname : false
enable-xdr : false
evict-tenths-pct : 5
evicted-objects : 0
expired-objects : 0
file : /opt/aerospike/data/bar.dat
filesize : 17179869184
flush-max-ms : 1000
free-pct-disk : 99
free-pct-memory : 99
fsync-max-sec : 0
high-water-disk-pct : 50
high-water-memory-pct : 60
hwm-breached : false
index-used-bytes-memory : 2432
ldt-enabled : false
ldt-page-size : 8192
master-objects : 38
master-sub-objects : 0
max-ttl : 0
max-void-time : 171525071
max-write-cache : 67108864
memory-size : 4294967296
min-avail-pct : 5
non-expirable-objects : 0
ns-forward-xdr-writes : false
nsup-cycle-duration : 0
nsup-cycle-sleep-pct : 0
obj-size-hist-max : 100
objects : 38
post-write-queue : 0
prole-objects : 0
prole-sub-objects : 0
read-consistency-level-override: off
repl-factor : 1
set-deleted-objects : 0
set-evicted-objects : 0
sets-enable-xdr : true
sindex-used-bytes-memory : 0
single-bin : false
stop-writes : false
stop-writes-pct : 90
sub-objects : 0
total-bytes-disk : 17179869184
total-bytes-memory : 4294967296
type : device
used-bytes-disk : 23168
used-bytes-memory : 16152
write-commit-level-override : off write-smoothing-period
~~~~~~~~~~~~~~~~~~~polls Namespace Statistics~~~~~~~~~~~~~~~~~~~
NODE : 1
allow_versions : false
available-bin-names : 32743
available_pct : 99
cold-start-evict-ttl : 4294967295
conflict-resolution-policy : generation
current-time : 169017220
data-in-memory : true
data-used-bytes-memory : 1975
default-ttl : 0
defrag-lwm-pct : 50
defrag-queue-min : 0
defrag-sleep : 1000
defrag-startup-minimum : 10
disallow-null-setname : false
enable-xdr : false
evict-tenths-pct : 5
evicted-objects : 0
expired-objects : 0
file : /opt/aerospike/data/polls.dat
filesize : 17179869184
flush-max-ms : 1000
free-pct-disk : 99
free-pct-memory : 99
fsync-max-sec : 0
high-water-disk-pct : 50
high-water-memory-pct : 60
hwm-breached : false
index-used-bytes-memory : 768
ldt-enabled : false
ldt-page-size : 8192
master-objects : 12
master-sub-objects : 0
max-ttl : 0
max-void-time : 0
max-write-cache : 67108864
memory-size : 4294967296
min-avail-pct : 5
non-expirable-objects : 0
ns-forward-xdr-writes : false
nsup-cycle-duration : 0
nsup-cycle-sleep-pct : 0
obj-size-hist-max : 100
objects : 12
post-write-queue : 0
prole-objects : 0
prole-sub-objects : 0
read-consistency-level-override: off
repl-factor : 1
set-deleted-objects : 0
set-evicted-objects : 0
sets-enable-xdr : true
sindex-used-bytes-memory : 203070
single-bin : false
stop-writes : false
stop-writes-pct : 90
sub-objects : 0
total-bytes-disk : 17179869184
total-bytes-memory : 4294967296
type : device
~~~~~~~~~~test Namespace Statistics~~~~~~~~~~
NODE : 1
allow_versions : false
available-bin-names : 32768
cold-start-evict-ttl : 4294967295
conflict-resolution-policy : generation
current-time : 169017220
data-used-bytes-memory : 0
default-ttl : 2592000
disallow-null-setname : false
enable-xdr : false
evict-tenths-pct : 5
evicted-objects : 0
expired-objects : 0
free-pct-memory : 100
high-water-disk-pct : 50
high-water-memory-pct : 60
hwm-breached : false
index-used-bytes-memory : 0
ldt-enabled : false
ldt-page-size : 8192
master-objects : 0
master-sub-objects : 0
max-ttl : 0
max-void-time : 0
memory-size : 4294967296
non-expirable-objects : 0
ns-forward-xdr-writes : false
nsup-cycle-duration : 0
nsup-cycle-sleep-pct : 0
objects : 0
prole-objects : 0
prole-sub-objects : 0
read-consistency-level-override: off
repl-factor : 1
set-deleted-objects : 0
set-evicted-objects : 0
sets-enable-xdr : true
sindex-used-bytes-memory : 0
single-bin : false
stop-writes : false
stop-writes-pct : 90
sub-objects : 0
total-bytes-memory : 4294967296
type : memory
used-bytes-memory : 0
write-commit-level-override : off
~~~ -v build ~~~ 1 (192.168.1.5) returned:
3.5.9
Polls
is the namespace I am currently using.
There aren’t any timeouts appearing in these stats, also the uptime is only 31 seconds. Could you provide the stats of the server after the issue has been observed?
Also I should have asked for the latency histograms from the logs.
The service stats should be enough, could you run and provide the output of the following after the issue has been observed?
asadm -e "show stat service"
tail -n 10000 /var/log/aerospike/aerospike.log | grep "hist.c"
Here’s the hist -
I couldnt paste the out put here as it too long.
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::137) histogram dump: reads (14848 total) msec
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::163) (00: 0000014848)
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (359 total) msec
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::163) (00: 0000000359)
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (364 total) msec
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::163) (00: 0000000364)
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec
May 14 2015 12:17:01 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count
All 14848 reads and 359 writes succeeded in <= 1 ms. But these histograms are for the successful reads and writes.
Could you provide the output of the following after the timeouts have been observed?
asadm -e "show stat service"
Also which client are you using?
Sure, I’ll do that the next time I have this error. It’s not been happening for a while now.