Are XDR replicated records returned in queries?

HimanshuKumarSingh · February 2, 2018, 6:53am

I have a ‘Simple Active-Active’ setup of 2 Aerospike clusters, with each cluster set up to replicate its records to the other cluster via XDR.

1. I need only the counts of the master records (non-XDR-replicated records) from each cluster. Can I do that using the Java client API or UDF aggregation?

2. This is the reverse of Q1: can I fetch the counts of the master records as well as the XDR-replicated records from the cluster?

sunil · February 2, 2018, 7:32am

A2. Let me answer your Q2 first, as it's simple and straightforward. In the logs we continuously print the master object count for each namespace. The same info can be obtained via the namespace stats too (stat name: master_objects). You just need to sum up those values across all the nodes in the cluster.

A1. We do not maintain information on whether a record was written by XDR or by a client. But we have namespace stats that give the number of writes by clients plus XDR (client_write_success) and the number of writes by XDR alone (xdr_write_success). Note that these do not represent the object count: there can be 1000 writes but only 1 object. I am not sure whether these stats will serve your purpose.
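Taking the description of the two counters above at face value, the number of writes that came from clients alone would be the difference between them. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and the sample values are made up for illustration. The result counts writes, not objects.

```java
public class WriteCounters {

    // Writes by clients alone, assuming client_write_success counts
    // client + XDR writes and xdr_write_success counts XDR writes only
    // (as described in the answer above).
    static long clientOnlyWrites(long clientWriteSuccess, long xdrWriteSuccess) {
        return clientWriteSuccess - xdrWriteSuccess;
    }

    public static void main(String[] args) {
        // Made-up sample values, not taken from a real cluster.
        long clientWriteSuccess = 1000;
        long xdrWriteSuccess = 400;
        System.out.println(clientOnlyWrites(clientWriteSuccess, xdrWriteSuccess));
    }
}
```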

HimanshuKumarSingh · February 2, 2018, 7:59am

Thanks Sunil - that was a crisp answer. For A2, can I fetch those stats using the Java API?

sunil · February 2, 2018, 1:20pm

Yes, you can use the Info API in Java. The command string is "namespace/<nsname>"; replace <nsname> with the actual namespace name. You can call request() against one, several, or all nodes of the cluster.
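As a rough sketch of the summing step, the code below parses info responses and adds up master_objects across nodes. It assumes the response is a semicolon-separated list of name=value pairs. To keep the example runnable without a cluster, the per-node strings are made-up samples; with the Java client, each string would presumably come from Info.request(node, "namespace/" + nsName) for every node returned by client.getNodes(). The class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class MasterObjectCount {

    // Extracts one numeric stat from a response of the assumed form
    // "name1=value1;name2=value2;...". Returns 0 if the stat is absent.
    static long parseStat(String infoResponse, String statName) {
        for (String pair : infoResponse.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).trim().equals(statName)) {
                return Long.parseLong(pair.substring(eq + 1).trim());
            }
        }
        return 0L;
    }

    // Sums the stat over the responses collected from every node.
    static long sumAcrossNodes(List<String> perNodeResponses, String statName) {
        long total = 0;
        for (String response : perNodeResponses) {
            total += parseStat(response, statName);
        }
        return total;
    }

    public static void main(String[] args) {
        // Sample strings standing in for real per-node responses.
        List<String> responses = Arrays.asList(
            "objects=250;master_objects=120;client_write_success=900",
            "objects=230;master_objects=110;client_write_success=850");
        System.out.println(sumAcrossNodes(responses, "master_objects"));
    }
}
```

The exact-name comparison matters: a substring match on "objects" would also hit "master_objects".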

If you want this for informational purposes only, you can use our tools to get it. In general, be advised that it's not a good idea to integrate these stats into your core application logic. It is rare, but we may change stat names or their semantics over time, so your application logic should not depend on them.

HimanshuKumarSingh · February 6, 2018, 2:03pm

Sorry, but I am back with another related problem. My problem is no longer limited to simply counting the number of records.

I have a common back-end application node that queries both clusters in my ‘Simple Active-Active’ setup, combines the results, and sends the combined results back to the API client. The problem, as you would know, is that I get two copies of the same record.

Q1) Is there any way to have my queries on each cluster consider only the master records on that cluster? Does Aerospike give me any customization hook just before the XDR copy at the source cluster, or before the write at the destination cluster, so that I can set a bin in my replicated record(s) to distinguish them from the master records? I could then filter on this bin to return only the master records.

Secondly, I am curious - for the Simple Active-Active XDR replication setup:

Q2) If, on a cluster, Aerospike does not differentiate between a record written by XDR and a record written by a client, then is it possible to update the XDR copy of a record on the XDR destination cluster?

Q3) I think it is possible to have sets with the same names, within namespaces of the same name, in both clusters. If so, is it possible that 2 records in the same set, one in each cluster, have the same PK? If that's also true, what will happen when each of these 2 records is replicated to the other cluster? Is there a possibility of a PK clash?

sunil · February 7, 2018, 7:51am

I am not sure why your application is reading replicated data from 2 clusters and merging the data. In general, this idea does not look good to me. The application should know which data is replicated and which is not. I can understand if you are merging unreplicated data. You can configure XDR to replicate only some sets and not others. So your application can organise data in sets and deal with replicated vs non-replicated sets differently. I am not fully aware of your requirements, but I feel that it's worth revisiting your design. You should look at XDR as a data sync mechanism rather than as master and slave/replica copies.
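One way to picture the "treat replicated and non-replicated sets differently" idea at the application layer is sketched below: for a set that XDR replicates, the application reads from one cluster only (so no duplicates arise), and for a set that is not replicated it reads from all clusters and merges. Everything here is hypothetical, including the cluster labels and set names; which sets are actually replicated depends on your XDR configuration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetRouting {
    private final String localCluster;        // hypothetical label, e.g. "dc1"
    private final List<String> allClusters;   // every cluster the app can query
    private final Set<String> replicatedSets; // sets that XDR is configured to ship

    public SetRouting(String localCluster, List<String> allClusters,
                      Set<String> replicatedSets) {
        this.localCluster = localCluster;
        this.allClusters = allClusters;
        this.replicatedSets = replicatedSets;
    }

    // Replicated sets hold the same data everywhere, so one cluster is enough.
    // Non-replicated sets differ per cluster, so all clusters must be queried.
    public List<String> clustersToQuery(String setName) {
        if (replicatedSets.contains(setName)) {
            return Collections.singletonList(localCluster);
        }
        return allClusters;
    }

    public static void main(String[] args) {
        SetRouting routing = new SetRouting(
            "dc1",
            Arrays.asList("dc1", "dc2"),
            new HashSet<>(Arrays.asList("profiles")));
        System.out.println(routing.clustersToQuery("profiles"));   // replicated set
        System.out.println(routing.clustersToQuery("local_logs")); // non-replicated set
    }
}
```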

Here are answers specific to your questions.

Q1. No hooks are provided to manipulate data before replication.

Q2. Yes, it's like any normal record. Your application can read/update the record on the destination.

Q3. Yes. We call it a write-write conflict. In an active-active setup, if the same record is updated in both clusters at the same time, this conflict may arise. We do not resolve the situation automatically. Depending on the timing, either the last record version to be shipped will survive on both clusters, or two different copies may be shipped to each other. The common workaround is to avoid write-write conflicts at the application layer by giving each key affinity to one cluster.
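Key affinity can be implemented in several ways (by user region, by tenant, and so on). As one illustrative option, not something prescribed in this thread, the sketch below hashes the user key to pick a single "owner" cluster, and the application then sends all writes for that key to that cluster only. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class KeyAffinity {

    // Maps a user key to the index of the cluster that should take its writes.
    // The same key always maps to the same index, so two clusters never
    // accept concurrent writes for one key.
    static int ownerIndex(String userKey, int clusterCount) {
        CRC32 crc = new CRC32();
        crc.update(userKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is in range.
        return (int) (crc.getValue() % clusterCount);
    }

    public static void main(String[] args) {
        String[] clusters = {"dc1", "dc2"}; // hypothetical cluster labels
        String key = "user-42";
        System.out.println(key + " is written via " + clusters[ownerIndex(key, clusters.length)]);
    }
}
```

Reads can still go to either cluster, since XDR ships the owner's writes to the other side; only the write path needs to respect the affinity.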

HimanshuKumarSingh · February 7, 2018, 9:04am

It's a PoC kind of application right now. I will design the real application in a way that is consistent with this discussion. Thanks for your inputs.

system · February 13, 2018, 9:04am

This topic was automatically closed 6 days after the last reply. New replies are no longer allowed.
