Not all records are returned on secondary index query during migration

jin_dev · November 10, 2016, 1:41pm

I am running a few experiments with secondary indexes and data migration. I have about 800k records in a 3-node cluster with a replication factor of 2. One integer bin has a secondary index. I am running Aerospike CE 3.10.0.3. My observations are as follows:

When a node leaves the cluster (cluster size becomes 2): the secondary index query returns the expected number of records, even though the admin console shows that data migration is still in progress while I run the query.
When the node rejoins the cluster (cluster size becomes 3): I wait for the first migration to complete and then start the 3rd node again. The admin console again shows that data migration is in progress. But when I run the same secondary index query, the number of records matching the query criteria is lower than expected. The number of matching records then grows over time and finally reaches the expected value.
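
The growth described in the second observation can be tracked with a small polling loop. This is only a minimal sketch: `run_query` is a hypothetical callable that runs the secondary index query through whatever client is in use and returns the number of matching records, and `expected`, `interval_s`, and `max_polls` are illustrative parameters, not Aerospike settings.

```python
import time
from typing import Callable, List


def track_query_count(
    run_query: Callable[[], int],   # hypothetical: runs the query, returns the match count
    expected: int,                  # count seen when the cluster is stable
    interval_s: float = 1.0,        # pause between polls
    max_polls: int = 60,            # give up after this many polls
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Poll the query until the count reaches `expected` or polls run out.

    Returns every count observed, in order, so the growth can be inspected.
    """
    counts: List[int] = []
    for _ in range(max_polls):
        n = run_query()
        counts.append(n)
        if n >= expected:
            break
        sleep(interval_s)
    return counts
```

Logging the returned list next to the migration progress shown in the admin console makes it easier to see whether the count converges only once migration finishes.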

I found an old discussion that explains this over here. It talks about the qnode and the designated master. But that discussion also says that qnodes are deprecated as of version 3.7.0.1. Can someone explain the expected behaviour? And is there any way to control it? I have some specific questions:

When a node leaves or rejoins, when does the master node of a record change? Does this change also wait for the secondary index tree to finish building?
Since a secondary index query goes to every node, does each node reply with only the matching records for which it is the designated master?
How does the Aerospike client discover the change in ownership? I have read about the “info message”; is that what is used?

anushree · May 3, 2018, 9:20pm

The different scenarios are listed in the table in this documentation (under ‘Cluster State Changes’).

https://www.aerospike.com/docs/architecture/secondary-index.html#distributed-queries

jin_dev · May 7, 2018, 7:40am

Thanks for the reference. According to the scenarios listed in that table, secondary index query behaviour while a node is joining is “Best Effort”. The explanation says the possible impact during node joining is duplicate data or stale data. Is it also possible for the query to return fewer records than actually exist? Moreover, for the node leaving scenario, the table says “Consistent Copy”. Does this mean a secondary index query works without any issue in that case?
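
For the duplicate and stale data mentioned above, one possible client-side mitigation is to collapse the result set by record digest. The sketch below is an assumption, not documented Aerospike behaviour: it treats each result as a hypothetical `(digest, generation, bins)` tuple and assumes that a higher generation means a newer copy, which ignores generation wrap-around. It does nothing for records that are missing from the result.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical result shape: (digest, generation, bins)
Record = Tuple[str, int, dict]


def dedupe_latest(records: Iterable[Record]) -> List[Record]:
    """Keep one copy per digest, preferring the highest generation seen."""
    latest: Dict[str, Record] = {}
    for digest, generation, bins in records:
        kept = latest.get(digest)
        # Assumption: a higher generation is the newer copy of the record.
        if kept is None or generation > kept[1]:
            latest[digest] = (digest, generation, bins)
    return list(latest.values())
```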
