I want to understand secondary indexes a little better. I have already read this and this and this and this, but I still have some questions.
Say the replication factor of the data is 3 in a 5-node cluster. In the steady state of the cluster, 3 nodes will own each partition (one primary and 2 secondary, I guess?). In this case, when a client runs a query on a secondary index, the query is forwarded to all the nodes (unlike in the case of the primary index).
a. Will only the nodes that are the primary owner of the data build the secondary index, or will each node build the secondary index for all the data it owns?
b. If it is the former, when the request is forwarded to the other nodes, is each node told which partitions to return data for? Or will a node only return data for partitions it is the primary owner of? Can a secondary owner of a partition serve secondary index queries? If a partition is unassigned from a node, will the GC thread take care of the secondary index entries created on the previously owned data?
c. If it is the latter, would the indexes for a partition be unavailable for a brief period of time whenever a partition rebalance happens?
d. If a partition rebalance (migration) is in progress, how would this change? The docs linked above mention that some data may not be returned or may be returned twice. Is it possible to get an exception in these scenarios rather than whatever is available? We have a use case where, if the data is present, we want the secondary index query to return it, and otherwise fail the request. Is it possible to run a query like this?
e. In what other scenarios will a secondary index query say a record is not there when the record is actually present? We had an outage because of this, where secondary index queries were not returning records. All we know for sure is that one node was unhealthy during that period. We were on an AS 4.x version, and we are not sure whether the bug mentioned here impacted us.
With a recent client and server, both primary and secondary index queries are sent to nodes on a ‘per partition’ basis.
a. All replicas (master or not) build the secondary index so that they can ‘take over’ if the master goes down.
b. For basic queries (meaning not background / aggregation), yes, the client requests specific partitions from the nodes that claim master ownership. But a node that is not master but has all the data for a partition will honor the query and respond with the data it has. In terms of GC, yes, when a partition is dropped, the GC goes through and cleans up the secondary index entries for that partition. This may change and be optimized to drop the secondary index for that partition directly (as the secondary index is also split by partition).
c. All replicas have the secondary index and keep it up to date.
d. This is not the case as of Aerospike version 6. Since the client requests data on a ‘per partition’ basis, if a partition is missing because its ownership changed while the query was in flight and the query hit the nodes at just the wrong time, the client will chase it down by retrying that specific partition against the node that now owns it. Yes, for older versions there is a fail on cluster change policy, but again, it is better to use a more recent version, which will not miss any partition. (See this old KB article: Aerospike Customer)
e. If you are using the Enterprise Edition, the best option would be to open a Support case, but that version is out of the maintenance/support window, so the best course may be to look at upgrading.
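To make the ‘per partition’ retry in (d) a bit more concrete, here is a rough toy model in Python. It is not the Aerospike client or server code: `Node`, `PartitionUnavailable`, `query` and `partition_map` are all hypothetical names made up for illustration. It only sketches the idea that each partition is requested separately and that only the partitions that failed are retried against the refreshed owner.

```python
from dataclasses import dataclass, field


class PartitionUnavailable(Exception):
    """Toy error: the node no longer holds the requested partition."""


@dataclass
class Node:
    name: str
    # partition id -> {indexed value -> set of record keys}
    # (a per-partition secondary index, hypothetical layout)
    sindex: dict = field(default_factory=dict)

    def query_partition(self, pid, value):
        if pid not in self.sindex:
            raise PartitionUnavailable(f"{self.name} does not hold partition {pid}")
        return sorted(self.sindex[pid].get(value, set()))


def query(partition_map, pids, value, max_retries=3):
    """Query each partition separately and retry only the ones that failed.

    partition_map is a callable returning the current {pid: Node} owner map,
    so every retry round sees ownership changes.
    """
    results = []
    pending = list(pids)
    for _ in range(max_retries + 1):
        if not pending:
            break
        owners = partition_map()
        still_pending = []
        for pid in pending:
            try:
                results.extend(owners[pid].query_partition(pid, value))
            except PartitionUnavailable:
                still_pending.append(pid)
        pending = still_pending
    if pending:
        # Fail loudly instead of silently returning a partial result.
        raise RuntimeError(f"partitions still unavailable: {pending}")
    return sorted(results)


# Partition 1 moved from node A to node B while the query was in flight.
a = Node("A", {0: {"red": {"k1"}}})
b = Node("B", {0: {"red": {"k1"}}, 1: {"red": {"k2"}}})
maps = iter([{0: a, 1: a}, {0: a, 1: b}])  # stale map first, refreshed map second
result = query(lambda: next(maps), [0, 1], "red")
print(result)
```

In this sketch partition 0 is answered once by A and never re-requested, so its record is not returned twice, and partition 1 is picked up from B on the second round, so nothing is missed.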
Thanks @meher for the detailed response.
I do understand things are fixed in AS version 6.x. Also, can we assume secondary index responses are consistent by default in AS 6.x?
Can you please help us understand what to expect from AS versions prior to 6.x, specifically AS 4.2?
In 6.x, queries will not miss any partitions due to cluster changes. In earlier versions, the failOnClusterChange policy can be used to get an error back instead of having a query potentially miss partitions during a cluster change (migrations when adding/removing nodes in a cluster).
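For the pre-6.x trade-off, a small toy model may help show what ‘error instead of partial result’ means in practice. This is only a sketch and not the Aerospike client API: `legacy_query`, `ClusterChangedError`, `fetch` and `cluster_generation` are hypothetical names, and the before/after comparison is an assumption chosen for illustration, not a description of how the real policy is implemented.

```python
class ClusterChangedError(Exception):
    """Toy error: the cluster changed while the query was running."""


def legacy_query(fetch, cluster_generation, fail_on_cluster_change=False):
    """Run fetch() and optionally fail if the cluster changed meanwhile.

    fetch: callable returning the records the nodes handed back.
    cluster_generation: callable returning a value that changes whenever
        the cluster membership/ownership changes (hypothetical).
    """
    before = cluster_generation()
    records = fetch()
    after = cluster_generation()
    if fail_on_cluster_change and before != after:
        # The caller gets an error and can retry, rather than trusting
        # a result that may be missing partitions.
        raise ClusterChangedError("cluster changed during query; result may be incomplete")
    return records


# A cluster change happens mid-query: generation goes from 1 to 2.
gens_default = iter([1, 2])
partial = legacy_query(lambda: ["k1"], lambda: next(gens_default))
print(partial)  # possibly incomplete result, returned without any warning

gens_strict = iter([1, 2])
try:
    legacy_query(lambda: ["k1"], lambda: next(gens_strict), fail_on_cluster_change=True)
    outcome = "returned"
except ClusterChangedError:
    outcome = "failed"
print(outcome)
```

With the flag off, the caller cannot tell a complete result from a partial one. With it on, the ‘return the data if present, otherwise fail the request’ behaviour asked about earlier becomes possible, at the cost of having to retry the whole query.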