How to populate a remote namespace using XDR?

Context

When a new remote destination namespace is added to a cluster's XDR configuration, that destination must be populated with data. This article discusses the two methods available in Aerospike 5.x and higher to achieve this.

Method 1

The first method uses the XDR rewind functionality. Rewind can ship the whole namespace or only a specific set, and it can start from a specific timestamp. The How to Rewind XDR for a Namespace article describes how to use rewind.
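As a rough illustration of what a rewind request looks like, the sketch below builds the info command string that re-adds a namespace to a datacenter with a rewind value. The helper name rewind_command and the DC and namespace names are hypothetical, and the command syntax (action=add with rewind=all or a number of seconds) reflects the 5.x form as best understood here. Verify it against the rewind article before use.

```python
def rewind_command(dc: str, namespace: str, rewind: str = "all") -> str:
    """Build the info command that adds a namespace to a DC with a rewind.

    Assumption: `rewind` is either "all" (ship the whole namespace) or a
    positive number of seconds to go back in time.
    """
    if rewind != "all" and not (rewind.isdigit() and int(rewind) > 0):
        raise ValueError("rewind must be 'all' or a positive number of seconds")
    return (
        f"set-config:context=xdr;dc={dc};namespace={namespace};"
        f"action=add;rewind={rewind}"
    )


# Hypothetical DC and namespace names, for illustration only.
print(rewind_command("DC2", "test"))
print(rewind_command("DC2", "test", "3600"))
```

The resulting string would typically be passed to asinfo -v on each source node. If the namespace is already being shipped to that DC, it may first need to be removed from the DC before it can be re-added with a rewind.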

When using the rewind feature, XDR reduces (traverses) the namespace partitions and ships records as it encounters them. The shipping volume is therefore governed by the overall XDR velocity controls.

This implies that the following points should be considered:

  • If the XDR ship latency stays constant, i.e. close to the actual network link latency between source and destination, the throughput can be raised by increasing the XDR max-throughput parameter (if it was previously configured). The XDR latency_ms metric can be monitored directly across a cluster with the info xdr command in asadm.

  • If latency_ms starts increasing, it indicates that the network or destination is likely being overloaded. At this point, the XDR throughput should be reduced via the max-throughput configuration parameter.
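The two points above amount to a simple feedback rule, sketched here in Python. This is not an Aerospike feature: the function name, the 1.5x tolerance, and the step of 100 are assumptions chosen for illustration (the step assumes max-throughput is set in multiples of 100). The latency_ms value would come from the info xdr output, and the link latency from your own network measurements.

```python
def next_max_throughput(
    current: int,
    latency_ms: float,
    link_latency_ms: float,
    step: int = 100,         # assumed adjustment step, not an Aerospike default
    tolerance: float = 1.5,  # assumed: "close to link latency" means within 1.5x
    floor: int = 100,        # never throttle below this value in the sketch
) -> int:
    """Suggest the next max-throughput value from the observed XDR latency."""
    if latency_ms <= link_latency_ms * tolerance:
        # Latency is still near the raw link latency: room to ship faster.
        return current + step
    # Latency is climbing: the link or destination is likely overloaded.
    return max(floor, current - step)


print(next_max_throughput(1000, latency_ms=22, link_latency_ms=20))  # raise
print(next_max_throughput(1000, latency_ms=60, link_latency_ms=20))  # lower
```

In practice an operator would apply the suggested value by changing max-throughput dynamically and then observe latency_ms for a while before adjusting again, rather than reacting to a single sample.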

Method 2

The second method is to use a Scan & Touch UDF. This UDF touches either all records in the namespace, the records in a specific set, or the records updated after a certain last-update-time (LUT). It runs as a scan that traverses the whole index and applies the touch to each matching record. When a record is touched, it is automatically added to the XDR transaction queue for subsequent shipping. A touch does not affect the contents of the record, but it does update its metadata: specifically the LUT, at the record level and also at the bin level if configured through the conflict-resolve-writes parameter. Aerospike Expressions can be used with UDFs.
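To make the selection and side effects concrete, here is a small pure-Python model of a scan-and-touch pass. It is neither the UDF itself nor Aerospike client code: the Record class, the scan_and_touch function, and the use of epoch seconds for the LUT are all assumptions made for illustration. It shows that only matching records are touched, that their LUT changes, and that their bins do not.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    key: str
    set_name: str
    lut: int  # last-update-time; epoch seconds in this model
    bins: dict = field(default_factory=dict)


def scan_and_touch(records, now, set_name=None, lut_after=None):
    """Touch every record matching the optional set and LUT filters.

    Returns the keys of the touched records, i.e. the ones that would be
    queued for XDR shipping in the real system.
    """
    touched = []
    for rec in records:  # the scan visits every record in the index
        if set_name is not None and rec.set_name != set_name:
            continue
        if lut_after is not None and rec.lut <= lut_after:
            continue
        rec.lut = now  # metadata changes; bins are left alone
        touched.append(rec.key)
    return touched


records = [
    Record("a", "users", 100, {"n": 1}),
    Record("b", "users", 300, {"n": 2}),
    Record("c", "orders", 400, {"n": 3}),
]
print(scan_and_touch(records, now=1000, set_name="users", lut_after=200))
```

The model also makes the trade-off visible: after the pass, the original LUT of each touched record is gone, which matters if anything downstream relies on it.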

Using this UDF method, the XDR shipping rate can also be controlled using the max-throughput parameter. Additionally, the single-scan-threads and background-scan-max-rps configuration parameters can be used to control the pace of the scan itself.
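A minimal sketch of how the scan-side controls might be set and what they imply for duration. It assumes both parameters are dynamic namespace-level settings and that background-scan-max-rps applies per node. The helper names, the namespace name, and the values are hypothetical.

```python
import math


def scan_config_commands(namespace: str, max_rps: int, threads: int) -> list:
    """Build info commands for the two scan pacing parameters.

    Assumption: both are set in the namespace context.
    """
    base = f"set-config:context=namespace;id={namespace};"
    return [
        base + f"background-scan-max-rps={max_rps}",
        base + f"single-scan-threads={threads}",
    ]


def min_scan_seconds(records_per_node: int, max_rps: int) -> int:
    """Lower bound on scan duration on one node when capped at max_rps."""
    return math.ceil(records_per_node / max_rps)


for cmd in scan_config_commands("test", max_rps=5000, threads=4):
    print(cmd)
print(min_scan_seconds(1_000_000, 5000), "seconds at minimum per node")
```

The duration estimate is only a floor: the scan can run slower than the cap, and XDR shipping of the touched records is still bounded separately by max-throughput.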

Because the second method uses the same XDR controls as the first but can also be throttled with the scan controls described above, it offers slightly more control over shipping velocity, at the price of tampering with the record’s LUT(s).

Notes

  • It is possible to use backup and restore to create a baseline data load for the remote cluster and use XDR to migrate the delta in real time. This method is discussed in this knowledge base article.

Keywords

XDR 5.X, MIGRATE POPULATE DATACENTER ASRESTORE REWIND SCAN TOUCH

Timestamp

April 2021

© 2021 Copyright Aerospike, Inc. | All rights reserved. Creators of the Aerospike Database.