How to rewind XDR for a namespace

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Context

Starting with version 5.0 of the Aerospike EE server, the Cross Datacenter Replication (XDR) subsystem can rewind and re-ship records to an XDR destination when a namespace is associated with a particular DC, either dynamically or through static configuration.

When a datacenter is added dynamically, XDR starts shipping from that point on. To ship existing records, use the rewind feature described below. Alternatively, statically configure the new DC and perform a rolling restart; upon restart, each node starts shipping all existing records. This static auto-rewind behavior is available in XDR 5, with the exception of some intermediate builds carrying an optimization that was introduced in Aerospike server version 5.1.0.3 (AER-6240) and reverted (AER-6365) in versions 5.3.0.6, 5.2.0.15, and 5.1.0.23.

Method

The rewind feature can be used to ship all existing records of a namespace. When a namespace is rewound, XDR scans the index and ships all the records for that namespace, partition by partition, by going through what is called ‘recovery mode’. If the namespace is large, the recoveries may take some time. If a high incoming throughput causes partitions to fall back into recovery over and over again, the max-recoveries-interleaved configuration parameter can be used to limit recoveries to a fixed number of partitions at a time, so that a partition does not fall back into recovery after it completes a round.
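As a rough illustration of the max-recoveries-interleaved tuning mentioned above, the sketch below only builds the set-config command string and does not run it. The helper name build_max_recoveries_cmd is hypothetical, and setting the parameter at the DC namespace level is an assumption that should be checked against the configuration reference for your server version.

```shell
# Hypothetical helper: print (do not run) the set-config command that caps how
# many partitions are recovered at a time for one DC/namespace pair.
# Assumption: max-recoveries-interleaved is set in the xdr > dc > namespace context.
build_max_recoveries_cmd() {
    local dc="$1" ns="$2" limit="$3"
    case "$limit" in
        # Reject empty or non-numeric limits before building the command.
        ''|*[!0-9]*) echo "limit must be a non-negative integer" >&2; return 1 ;;
    esac
    printf 'asinfo -v "set-config:context=xdr;dc=%s;namespace=%s;max-recoveries-interleaved=%s"\n' \
        "$dc" "$ns" "$limit"
}
```

For example, `build_max_recoveries_cmd REMOTE_DC_1 test 8` prints the command for a limit of 8 partitions; the value 8 is only a placeholder, not a recommendation.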

The rewind feature can also be used to only ship records that were last updated after a certain point in time.

Below are the relevant commands.

Re-ship all records in a new DC’s namespace:

This can be done when initially associating that namespace to a DC with the following command:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;action=add;rewind=all"

Re-ship all (or some) records in an existing DC’s namespace:

First we have to disassociate the namespace from the DC:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;action=remove"

then associate the namespace to the DC with the rewind option set to “all”:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;action=add;rewind=all"

or you can specify the number of seconds from now to go back and ship records:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;action=add;rewind=<NUMBER_SECONDS>"
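Since the rewind value is a number of seconds counted back from now, it can help to derive it from a known point in time. The following is a minimal sketch, not part of the Aerospike tooling: the helper names rewind_seconds_since and build_rewind_cmd are hypothetical, the start time is assumed to be a Unix epoch timestamp, and the command is only printed so it can be reviewed before being run.

```shell
# Hypothetical helper: seconds elapsed between a past epoch timestamp and "now".
# The optional second argument pins "now" (defaults to the current time).
rewind_seconds_since() {
    local since_epoch="$1"
    local now_epoch="${2:-$(date +%s)}"
    if [ "$since_epoch" -gt "$now_epoch" ]; then
        echo "start time is in the future" >&2
        return 1
    fi
    echo $(( now_epoch - since_epoch ))
}

# Hypothetical helper: print (do not run) the re-association command
# for a given rewind value, either "all" or a number of seconds.
build_rewind_cmd() {
    local dc="$1" ns="$2" rewind="$3"
    printf 'asinfo -v "set-config:context=xdr;dc=%s;namespace=%s;action=add;rewind=%s"\n' \
        "$dc" "$ns" "$rewind"
}
```

A possible use is `build_rewind_cmd REMOTE_DC_1 test "$(rewind_seconds_since 1590598800)"`, where 1590598800 is an arbitrary example timestamp. Adding a small safety margin to the computed value may be worthwhile, because records updated just before the chosen instant would otherwise be missed.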

Re-ship specific sets to a new namespace:

When associating (or re-associating) a namespace to a DC, shipping can be restricted to specific sets. Combined with the rewind option, this can be used in some use cases to re-ship only the records in those sets.

The first step would be to disassociate the namespace from the DC:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;action=remove"

then dynamically enable the ship-only-specified-sets option for that namespace:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;ship-only-specified-sets=true"

and specify the sets to ship:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;ship-set=<SETNAME>"

finally, (re-)associate the namespace to the DC with the rewind option you need:

asinfo -v "set-config:context=xdr;dc=<DC_NAME>;namespace=<NAMESPACE>;action=add;rewind=all"

Example:


asinfo -v "set-config:context=xdr;dc=REMOTE_DC_1;namespace=test;action=remove"

asinfo -v "set-config:context=xdr;dc=REMOTE_DC_1;namespace=test;ship-only-specified-sets=true"

asinfo -v "set-config:context=xdr;dc=REMOTE_DC_1;namespace=test;ship-set=demo"

asinfo -v "set-config:context=xdr;dc=REMOTE_DC_1;namespace=test;action=add;rewind=all"
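The four steps of this example can be wrapped in a small function so they always run in the same order. This is a sketch under stated assumptions: the function name reship_set and the DRY_RUN variable are hypothetical, dry-run is the default so the commands are printed rather than executed, and the exit status of asinfo may not reflect an error returned by the server, so the output should still be checked by hand.

```shell
# Sketch: re-ship one set of a namespace to a DC by issuing the four steps in order.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 executes them.
reship_set() {
    local dc="$1" ns="$2" set_name="$3" rewind="${4:-all}"
    local base="set-config:context=xdr;dc=${dc};namespace=${ns}"
    local step
    for step in \
        "action=remove" \
        "ship-only-specified-sets=true" \
        "ship-set=${set_name}" \
        "action=add;rewind=${rewind}"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            printf 'asinfo -v "%s;%s"\n' "$base" "$step"
        else
            # Stop if asinfo itself exits non-zero; later steps are then skipped.
            asinfo -v "${base};${step}" || return 1
        fi
    done
}
```

Calling `reship_set REMOTE_DC_1 test demo` prints the same four commands as the example above. The commands act on the node they are sent to, so the same sequence has to be applied on each node that should rewind.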

Log output:

May 27 2020 17:20:39 GMT: INFO (xdr): (dc.c:464) DC REMOTE_DC_1 disconnected
May 27 2020 17:20:39 GMT: INFO (xdr): (dc_manager.c:460) DC REMOTE_DC_1 - removed namespace test
May 27 2020 17:20:39 GMT: INFO (info): (thr_info.c:3619) config-set command completed: params context=xdr;dc=REMOTE_DC_1;namespace=test;action=remove

May 27 2020 17:22:00 GMT: INFO (info): (ticker.c:423) {test} objects: all 510000 master 255201 prole 254799 non-replica 0
May 27 2020 17:22:00 GMT: INFO (info): (dc.c:1068) xdr-dc REMOTE_DC_1: time-lag 0 unprocessed 0 outstanding 0 complete (802402,0,0,0) retries (0,0) recoveries (0,0) hot-keys 0
May 27 2020 17:22:06 GMT: INFO (info): (thr_info.c:3619) config-set command completed: params context=xdr;dc=REMOTE_DC_1;namespace=test;ship-only-specified-sets=true

May 27 2020 17:22:40 GMT: INFO (info): (dc.c:1068) xdr-dc REMOTE_DC_1: time-lag 0 unprocessed 0 outstanding 0 complete (802402,0,0,0) retries (0,0) recoveries (0,0) hot-keys 0
May 27 2020 17:22:43 GMT: INFO (info): (thr_info.c:3619) config-set command completed: params context=xdr;dc=REMOTE_DC_1;namespace=test;ship-set=demo
May 27 2020 17:22:50 GMT: INFO (info): (ticker.c:423) {test} objects: all 510000 master 255201 prole 254799 non-replica 0

May 27 2020 17:23:04 GMT: INFO (xdr-client): (cluster.c:599) starting with seed nodes for REMOTE_DC_1
May 27 2020 17:23:04 GMT: INFO (xdr): (dc_manager.c:435) DC REMOTE_DC_1 - added namespace test
May 27 2020 17:23:04 GMT: INFO (info): (thr_info.c:3619) config-set command completed: params context=xdr;dc=REMOTE_DC_1;namespace=test;action=add;rewind=all
May 27 2020 17:23:10 GMT: INFO (info): (ticker.c:423) {test} objects: all 510000 master 255201 prole 254799 non-replica 0
May 27 2020 17:23:10 GMT: INFO (info): (dc.c:1068) xdr-dc REMOTE_DC_1: time-lag 323025788 unprocessed 0 outstanding 0 complete (832956,0,0,0) retries (0,0) recoveries (35992,1858) hot-keys 29913
May 27 2020 17:23:20 GMT: INFO (info): (ticker.c:423) {test} objects: all 510000 master 255201 prole 254799 non-replica 0
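To follow a rewind in the logs, the xdr-dc ticker lines shown above can be reduced to the fields of interest. The filter below is a minimal sketch based only on the line format in this log excerpt; the function name xdr_ticker_fields is hypothetical, and other server versions may format the ticker line differently.

```shell
# Sketch: read log lines on stdin and, for each "xdr-dc" ticker line, print
# "<dc> <time-lag> <recoveries pair>" exactly as the values appear in the log.
xdr_ticker_fields() {
    sed -n 's/.*xdr-dc \([^:]*\): time-lag \([0-9]*\) .* recoveries (\([0-9,]*\)) .*/\1 \2 \3/p'
}
```

For instance, `grep xdr-dc /var/log/aerospike/aerospike.log | xdr_ticker_fields` would print one short line per ticker entry, assuming that log path applies to your installation. In the excerpt above, the recoveries values move from (0,0) to non-zero once the namespace is re-added with rewind=all.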

Notes

When using the rewind command for a specific namespace and destination DC, only writes from non-XDR clients are shipped by default. Records written by an XDR client are shipped only if the forward configuration is enabled.

When shipping only specific sets, or for large namespaces, using asbackup and asrestore, or a touch UDF to trigger normal XDR shipping, may be a better solution depending on the overall use case.

Keywords

xdr5 rewind re-ship datacenter

Timestamp

May 27, 2020

Since the XDR rewind feature was added in version 5, is there a way or best practice to “rewind” data on older versions when XDR is enabled?

The article “How to populate a remote namespace using XDR?” should have some information (the touch it describes can be tuned with filter expressions to select which records to replicate).