The interval to flush data to disk


#1

by Hanson » Wed Jul 30, 2014 10:41 pm

I’m running pure write traffic (30K TPS, 1 KB record size) with Aerospike 3.3.8 in-memory + disk persistence. iostat shows ~15 seconds of ~120 MB/s disk writes in every 60-second interval, and 0 MB/s for the remaining 60 - 15 = 45 seconds.

Are there any parameters that can be tuned to smooth out the disk activity? (i.e., spread the disk writes out to 15 * 120 / 60 = 30 MB/s instead of 120 MB/s spikes)

The purpose:

  • Reduce data loss in case of a power failure, trading some performance for the Durability in ACID.
  • In the cloud, reduce the disk I/O impact that VMs have on each other when two VMs’ block storage sits on the same physical disk.
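
For reference, the write pattern above is easy to see with a per-second iostat view, for example:

iostat -xm 1    # extended device stats in MB/s, refreshed every second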

#2

Speeding up defragmentation should help spread out the disk I/O.

Try setting these to a value of 1:

defrag-period=1

defrag-queue-priority=1

These changes can be made permanent in the aerospike.conf file.
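
For example, in aerospike.conf they would go roughly like this (just a sketch, assuming a namespace named test; whether both parameters exist and where they belong depends on your server version, see the reference link below):

service {
    defrag-queue-priority 1      # drain the defrag queue more aggressively
}

namespace test {
    storage-engine device {
        defrag-period 1          # seconds between defrag passes (assumed namespace-level)
    }
}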

Or you can dynamically test it using asinfo:

asinfo -v 'set-config:context=namespace;id=test;defrag-period=1'

asinfo -v 'set-config:context=service;defrag-queue-priority=1'

More info on these settings and others can be found at:

http://www.aerospike.com/docs/reference/configuration/


#3

There is no parameter "defrag-period" in Aerospike 3.3.8. I only found the following: defrag-queue-hwm=500;defrag-queue-lwm=1;defrag-queue-escape=10;defrag-queue-priority=1

Here is the full list:

===================================
[root@localhost]# asinfo -v 'get-config:context=service'
requested value get-config:context=service
value is transaction-queues=4;transaction-threads-per-queue=4;transaction-duplicate-threads=0;transaction-pending-limit=20;
migrate-threads=1;migrate-priority=40;migrate-xmit-priority=40;migrate-xmit-sleep=500;migrate-read-priority=10;migrate-read-sleep=500;
migrate-xmit-hwm=10;migrate-xmit-lwm=5;migrate-max-num-incoming=256;migrate-rx-lifetime-ms=60000;proto-fd-max=15000;
proto-fd-idle-ms=60000;transaction-retry-ms=1000;transaction-max-ms=1000;transaction-repeatable-read=false;dump-message-above-size=134217728;ticker-interval=10;microbenchmarks=false;storage-benchmarks=false;scan-priority=200;scan-sleep=1;batch-threads=4;
batch-max-requests=5000;batch-priority=200;nsup-period=120;nsup-queue-hwm=500;nsup-queue-lwm=1;nsup-queue-escape=10;defrag-queue-hwm=500;defrag-queue-lwm=1;defrag-queue-escape=10;defrag-queue-priority=1;nsup-auto-hwm-pct=15;nsup-startup-evict=true;
paxos-retransmit-period=5;paxos-single-replica-limit=1;paxos-max-cluster-size=32;paxos-protocol=v3;paxos-recovery-policy=manual;
write-duplicate-resolution-disable=false;respond-client-on-master-completion=false;replication-fire-and-forget=false;info-threads=16;
allow-inline-transactions=true;use-queue-per-device=false;snub-nodes=false;fb-health-msg-per-burst=0;fb-health-msg-timeout=200;
fb-health-good-pct=50;fb-health-bad-pct=0;auto-dun=false;auto-undun=false;prole-extra-ttl=0;max-msgs-per-type=-1;
pidfile=/var/run/aerospike/asd.pid;memory-accounting=false;udf-runtime-gmax-memory=18446744073709551615;
udf-runtime-max-memory=18446744073709551615;sindex-populator-scan-priority=3;sindex-data-max-memory=18446744073709551615;
query-threads=6;query-worker-threads=15;query-priority=10;query-in-transaction-thread=0;query-req-in-query-thread=0;
query-req-max-inflight=100;query-bufpool-size=256;query-batch-size=100;query-sleep=1;query-job-tracking=false;
query-short-q-max-size=500;query-long-q-max-size=500;query-rec-count-bound=4294967295;query-threshold=10
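
(The namespace context can be queried the same way; something like the following, assuming the id=test syntax from the set-config example above also applies to get-config:)

asinfo -v 'get-config:context=namespace;id=test'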

#4

by Hanson » Tue Aug 05, 2014 11:58 pm

The problem is that there is no disk writing during those 45 seconds, which means all the data changed in memory over that long a window would be lost on a power failure. That is a large amount of data with high insert traffic: 30K TPS * 45 s = 1,350K records! Could the data loss be minimized?
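
For scale, a rough estimate of the exposed data using the numbers above:

30,000 records/s * 45 s = 1,350,000 records
1,350,000 records * 1 KB/record ≈ 1.35 GB of unflushed data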