Not all records are returned on secondary index query during migration

I am running a few experiments with secondary indexes and node migration. I have about 800k records in a 3-node cluster with a replication factor of 2. There is an integer bin with a secondary index. I am running Aerospike CE 3.10.0.3. These are my observations:

  1. When a node leaves the cluster (cluster size becomes 2): the expected number of records is returned when I run the query on the secondary index, even though the admin console shows that data migration is still ongoing while the query runs.

  2. When the node rejoins the cluster (cluster size becomes 3): I wait for the migration from step 1 to complete and then start the 3rd node again. The admin console again shows that data migration is in progress. But when I run the same query on the secondary index, the number of records matching the query criteria is lower than expected. This number then grows over time and finally reaches the expected value.
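
The observation in point 2 can be reproduced with a small polling loop. What follows is only a minimal sketch, not a definitive test: `poll_counts` and `make_aerospike_counter` are hypothetical helper names, and the host, namespace, set, bin name and value in the usage note are placeholders. It assumes the Aerospike Python client and an equality query on the indexed integer bin.

```python
import time
from typing import Callable, List


def poll_counts(run_query: Callable[[], int], expected: int,
                interval_s: float = 1.0, max_polls: int = 60) -> List[int]:
    """Run the query repeatedly and record how many records each run returns.

    Stops early once a run returns at least the expected count.
    """
    counts = []
    for _ in range(max_polls):
        count = run_query()
        counts.append(count)
        if count >= expected:
            break
        time.sleep(interval_s)
    return counts


def make_aerospike_counter(hosts, namespace, set_name, bin_name, value):
    """Build a run_query callable backed by a secondary index equality query.

    Assumes the `aerospike` Python client is installed; every argument is a
    placeholder for your own cluster and schema.
    """
    import aerospike
    from aerospike import predicates

    client = aerospike.client({"hosts": hosts}).connect()

    def run_query() -> int:
        query = client.query(namespace, set_name)
        query.where(predicates.equals(bin_name, value))
        return len(query.results())

    return run_query
```

For example, `poll_counts(make_aerospike_counter([("127.0.0.1", 3000)], "test", "demo", "age", 30), expected=1000)` started right after the 3rd node rejoins should show the counts starting low and climbing as migration proceeds, if the behaviour is as described above.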

I found an old discussion that explains this over here. It talks about qnodes and the designated master. But that discussion also says that qnodes are deprecated as of version 3.7.0.1. Can someone explain the expected behaviour? And is there any way to control it? I have some specific questions:

  1. When a node leaves or rejoins, when does the master node of a record change? Does this change also wait for the secondary index tree to finish building?
  2. Since a query on a secondary index goes to every node, does each node reply only with the matching records for which it is the designated master?
  3. How does the Aerospike client discover a change in ownership? I have read about the “info message”; is that what is used?

The different scenarios are listed in the table in this documentation (under ‘Cluster State Changes’).

https://www.aerospike.com/docs/architecture/secondary-index.html#distributed-queries


Thanks for the reference. According to the table, secondary index query behaviour while a node is joining is “Best Effort”, and the explanation lists duplicate data and stale data as the possible impact. Is it also possible for a query to return fewer records than actually match? Moreover, for the node-leaving scenario, the table says “Consistent Copy”. Does this mean a secondary index query works without any issue in that case?
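
If duplicates were the only impact, I could filter them on the client side. A minimal sketch of what I have in mind, assuming the Python client's record shape of `(key, meta, bins)` with the digest as the fourth element of `key` and a `gen` field in `meta`; `dedupe_by_digest` is a hypothetical helper, and using the generation to pick the newer copy is my own assumption rather than documented behaviour. It would not help with missing records, which is the case I am actually seeing.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed record shape: (key, meta, bins), where key is
# (namespace, set, user_key, digest) and meta contains "gen".
Record = Tuple[tuple, dict, dict]


def dedupe_by_digest(records: Iterable[Record]) -> List[Record]:
    """Keep one record per digest, preferring the highest generation."""
    best: Dict[bytes, Record] = {}
    for record in records:
        key, meta, _bins = record
        digest = bytes(key[3])  # convert to bytes so it can be a dict key
        current = best.get(digest)
        if current is None or meta.get("gen", 0) > current[1].get("gen", 0):
            best[digest] = record
    return list(best.values())
```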