How to set read.consistency_level in the Python Aerospike client?

Hi, is there a way to set “read.consistency_level=all” in the Python client, either on the connection or per transaction?

Is there another setting to force “read.consistency_level=all” only while the cluster is replicating/synchronizing partitions?

– Kind Regards Marek Grzybowski

We are working on exposing those policies of the underlying C client in one of the next releases (depending on which branches get into master first). It’s close.

Ronen

Excellent, this feature seems to be very important. The current default Aerospike settings (write.commit_level=all, read.consistency_level=one) mean that when an SSD storage node comes back from downtime, it will return old values from SSD for all master records belonging to that node. If you have counters in Aerospike and you increment them frequently, every Aerospike node restart means data loss. AFAIK the only way to avoid this behavior is to set read.consistency_level=all globally, or to wipe the SSD disks every time a node goes down. Or maybe there is another way?

PS: We wrote some Nagios plugins to test our Aerospike cluster: https://github.com/RTBHOUSE/check_aerospike_put_get (gets and puts data to Aerospike to assess potential data loss after node failures) and https://github.com/RTBHOUSE/check_naglio_aerospike (parses output from Aerospike tools).


Hi Marek,

Thank you for your follow-up post. We are looking into your issue and will get back to you soon. Thank you for your patience.

Regards,

Maud

Hi Marek,

If a node is recovering from being partitioned, an automatic process will resolve write duplicates by assuming the copy with the most recent timestamp is the canonical one, unless you have set the config parameter write-duplicate-resolution-disable to true. If the client sends a write to the wrong node during cluster reconfiguration, because it is not yet in sync about the new owner of the record, the cluster will proxy the write to the correct node.

Both situations are rare and handled by the cluster.

See more here: http://www.aerospike.com/docs/architecture/data-distribution.html and http://www.aerospike.com/docs/reference/configuration/#write-duplicate-resolution-disable

Hi Ronen, here are the steps to reproduce my tests on a clean three-node cluster (I currently do not have spare SSD servers, so I used containers and a file backing store instead; the result is the same):

ii  aerospike-server-community            3.4.0-1                             The Aerospike distributed datastore allows fully scalable and reliable data storage with elastic server properties.
ii  aerospike-tools                       3.4.0                               Aerospike server tools.

aerospike.conf:

namespace test {
        replication-factor 2
        memory-size 2G
        default-ttl 30d # 30 days, use 0 to never expire/evict.

        storage-engine device {
                file /opt/aerospike/data/test.dat
                filesize 16G
                data-in-memory true # Store data in memory in addition to file.
        }
}
  • start a clean cluster with a namespace “test” that has persistent storage (file or SSD)

  • check the config (to make sure “write-duplicate-resolution-disable” is not enabled):

    asinfo -v ‘get-config:’ requested value get-config: value is transaction-queues=4;transaction-threads-per-queue=4;transaction-duplicate-threads=0;transaction-pending-limit=20;migrate-threads=1;migrate-xmit-priority=40;migrate-xmit-sleep=500;migrate-read-priority=10;migrate-read-sleep=500;migrate-xmit-hwm=10;migrate-xmit-lwm=5;migrate-max-num-incoming=256;migrate-rx-lifetime-ms=60000;proto-fd-max=15000;proto-fd-idle-ms=60000;transaction-retry-ms=1000;transaction-max-ms=1000;transaction-repeatable-read=false;dump-message-above-size=134217728;ticker-interval=10;microbenchmarks=false;storage-benchmarks=false;scan-priority=200;scan-sleep=1;batch-threads=4;batch-max-requests=5000;batch-priority=200;nsup-delete-sleep=0;nsup-period=120;nsup-startup-evict=true;paxos-retransmit-period=5;paxos-single-replica-limit=1;paxos-max-cluster-size=32;paxos-protocol=v3;paxos-recovery-policy=manual;write-duplicate-resolution-disable=false;respond-client-on-master-completion=false;replication-fire-and-forget=false;info-threads=16;allow-inline-transactions=true;use-queue-per-device=false;snub-nodes=false;fb-health-msg-per-burst=0;fb-health-msg-timeout=200;fb-health-good-pct=50;fb-health-bad-pct=0;auto-dun=false;auto-undun=false;prole-extra-ttl=0;max-msgs-per-type=-1;pidfile=/var/run/aerospike/asd.pid;memory-accounting=false;udf-runtime-gmax-memory=18446744073709551615;udf-runtime-max-memory=18446744073709551615;sindex-populator-scan-priority=3;sindex-data-max-memory=18446744073709551615;query-threads=6;query-worker-threads=15;query-priority=10;query-in-transaction-thread=0;query-req-in-query-thread=0;query-req-max-inflight=100;query-bufpool-size=256;query-batch-size=100;query-sleep=1;query-job-tracking=false;query-short-q-max-size=500;query-long-q-max-size=500;query-rec-count-bound=4294967295;query-threshold=10;query-untracked-time=1000000;service-address=0.0.0.0;service-port=3000;mesh-address=172.16.9.37;mesh-port=3002;reuse-address=true;fabric-port=3001;network-info-port=3003;enable-fastpath=true;heartbeat-mode=mesh;heartbeat-protocol=v2;heartbeat-address=172.16.9.37;heartbeat-port=3002;heartbeat-interval=150;heartbeat-timeout=10;enable-security=false;privilege-refresh-period=300;report-authentication-sinks=0;report-sys-admin-sinks=0;report-user-admin-sinks=0;report-violation-sinks=0;syslog-local=-1;xdr-delete-shipping-enabled=true;xdr-nsup-deletes-enabled=false;enable-xdr=false;stop-writes-noxdr=false;reads-hist-track-back=1800;reads-hist-track-slice=10;reads-hist-track-thresholds=1,8,64;writes_master-hist-track-back=1800;writes_master-hist-track-slice=10;writes_master-hist-track-thresholds=1,8,64;proxy-hist-track-back=1800;proxy-hist-track-slice=10;proxy-hist-track-thresholds=1,8,64;writes_reply-hist-track-back=1800;writes_reply-hist-track-slice=10;writes_reply-hist-track-thresholds=1,8,64;udf-hist-track-back=1800;udf-hist-track-slice=10;udf-hist-track-thresholds=1,8,64;query-hist-track-back=1800;query-hist-track-slice=10;query-hist-track-thresholds=1,8,64;query_rec_count-hist-track-back=1800;query_rec_count-hist-track-slice=10;query_rec_count-hist-track-thresholds=1,8,64

    asinfo -v ‘get-config:context=namespace;id=test’ requested value get-config:context=namespace;id=test value is ;memory-size=2147483648;high-water-disk-pct=50;high-water-memory-pct=60;evict-tenths-pct=5;stop-writes-pct=90;cold-start-evict-ttl=4294967295;repl-factor=2;default-ttl=2592000;max-ttl=0;conflict-resolution-policy=generation;allow_versions=false;single-bin=false;ldt-enabled=false;enable-xdr=falsesets-enable-xdr=trueforward-xdr-writes=false;disallow-null-setname=false;total-bytes-memory=2147483648;read-consistency-level-override=off;write-commit-level-override=off;total-bytes-disk=17179869184;defrag-lwm-pct=50;defrag-queue-min=0;defrag-sleep=1000;defrag-startup-minimum=10;flush-max-ms=1000;fsync-max-sec=0;write-smoothing-period=0;max-write-cache=67108864;min-avail-pct=5;post-write-queue=0;data-in-memory=true;file=/opt/aerospike/data/test.dat;filesize=17179869184;writethreads=1;writecache=67108864;obj-size-hist-max=100

  • put several key-value records into the cluster:

    git clone https://github.com/RTBHOUSE/check_aerospike_put_get
    ./check_aerospike_put_get/check_aerospike_put_get.py -i

  • increment the values (get, increment, put; a rough sketch of this step is included at the end of this post)

    ./check_aerospike_put_get/check_aerospike_put_get.py

  • take down a single cluster node (you can wait until the cluster finishes migrating all partitions, but it does not matter)

  • increment the values in a loop:

    for x in {1..30} ; do ./check_aerospike_put_get/check_aerospike_put_get.py ; done

  • start the node that was down while you were incrementing the values, then check the values:

    aql -c 'select * from test'
    +------------+
    | nagios-bin |
    +------------+
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 88         |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 88         |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    | 103        |
    +------------+
    30 rows in set (0.062 secs)

In my test, two of the values differ from the others.

There is a time window during which the Python client reads old values from the node that is coming back up. The more data is in the partitions, the longer the window during which the client reads old records. Writes/puts always go to the right place, and the record generation is the same for all values.
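For reference, the increment step the plugin performs (get, increment, put) looks roughly like this. This is only a minimal sketch: the host address, set and key names are placeholders, and the real check_aerospike_put_get.py may differ in detail. It shows why a stale read turns into a lost update: the put() simply overwrites whatever newer count the surviving replica held.

    import aerospike

    # Connect to the cluster (host/port are placeholders).
    client = aerospike.client({'hosts': [('127.0.0.1', 3000)]}).connect()

    # Hypothetical namespace/set/key, just for illustration.
    key = ('test', 'demo', 'nagios-key-1')

    # Read-modify-write of a counter. With the default read policy
    # (consistency level ONE), a node that just came back up can serve a
    # stale value here, and the following put() then overwrites the newer
    # count held by the other replica.
    (_, _, bins) = client.get(key)  # assumes the record was created earlier (-i step)
    counter = (bins or {}).get('nagios-bin', 0) + 1
    client.put(key, {'nagios-bin': counter})

    client.close()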

I’ll look to recreate the problem using your code. Thanks.

I will get back to reproducing your problem. Sorry about the delay.

First, though, I wanted to point out that the read consistency level and read replica policies can be controlled by passing a policy with the get() method, as seen in test/test_get.py.

  • The policy field ‘consistency’ can take on the values aerospike.POLICY_CONSISTENCY_ONE (default) or aerospike.POLICY_CONSISTENCY_ALL.
  • The ‘replica’ field can take on the values aerospike.POLICY_REPLICA_MASTER (default) or aerospike.POLICY_REPLICA_ANY.
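For example, a minimal sketch of a get() carrying these read policies (the host and key are placeholders, and the policy field names follow the description above, so they may differ between client versions):

    import aerospike

    client = aerospike.client({'hosts': [('127.0.0.1', 3000)]}).connect()

    key = ('test', 'demo', 'nagios-key-1')  # hypothetical key

    # Read at consistency level ALL instead of the default ONE,
    # and direct the read to the master partition.
    read_policy = {
        'timeout': 1000,
        'consistency': aerospike.POLICY_CONSISTENCY_ALL,
        'replica': aerospike.POLICY_REPLICA_MASTER,
    }
    (_, meta, bins) = client.get(key, read_policy)

    client.close()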

The write commit level policy can be controlled by passing a policy containing it to the put() method, as seen in test/test_put.py.

  • The policy field ‘commit_level’ can take on the values aerospike.POLICY_COMMIT_LEVEL_ALL (default) or aerospike.POLICY_COMMIT_LEVEL_MASTER.
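Similarly, a minimal sketch of a put() carrying the commit level policy (again, the key and bin are placeholders, and the field name follows the description above):

    import aerospike

    client = aerospike.client({'hosts': [('127.0.0.1', 3000)]}).connect()

    # Require the write to be applied on the master and all replicas before
    # the client is acknowledged (the stated default); use
    # aerospike.POLICY_COMMIT_LEVEL_MASTER to return after the master write only.
    write_policy = {'commit_level': aerospike.POLICY_COMMIT_LEVEL_ALL}
    client.put(('test', 'demo', 'nagios-key-1'), {'nagios-bin': 104}, policy=write_policy)

    client.close()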