Secondary index with low cardinality or huge fanout

helper_bro · March 24, 2023, 12:42pm

We are modelling user wishlist on Aerospike (just for an example to explain my usecase). I will have records like userID → productid in aerospike.

whenever a product comes in stock, we need to send notification to all the users who have wishlisted a product. To cater to this use case we are planning to have a secondary index on productid. but there can be certain products that are very hot and have millions of users wishing for it.

Are there any considerations we have to do? or just creating a secondary index will work? the key metrics here is time taken in fetching all the userIds from AS.

This is more of a data modelling problem, but i did not find any resource which talks about this.

neelp · March 27, 2023, 4:39am

A secondary index with low selectivity is generally not recommended. With low selectivity, a secondary index query can still be faster than the alternative of a full scan, however, the cost of additional memory and index updates need to be considered.

Also, a query returning millions of results will be slower even if selectivity is relatively high, for example, if the total number of users are in the hundreds of millions.

Some factors to consider include:

unique products on wish lists at a time
are hot products the norm or most products have a small number of interested people
the fraction of the total number of users choosing a hot product
max size limit on a user’s wish list

If most products are not hot and most queries do not involve hot products, a secondary index makes sense. For scenarios that are not clear cut, it is best to try it out in a real or representative environment. Hope this helps.

system · March 26, 2024, 4:39am

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
When not to use a secondary Index index	13	1509	June 29, 2018
Query by secondary index performance Query & Indexing secondary , index	4	5683	August 5, 2015
Storing queryable tags against a user profile Data Modeling	8	3699	January 14, 2016
Secondary Index query performance Query & Indexing query , scan , index	7	211	April 11, 2024
Query optimization for multiple filter query , secondary , index	2	1780	June 20, 2017

Secondary index with low cardinality or huge fanout

Related topics