I have previously used Cassandra, where it was advised not to use a secondary index on high-cardinality columns. Does the same advice apply to Aerospike as well?
In my set, I have a bin that holds either 0 or 1. Two bins of the set (along with the PK) are involved in my API.
Bin 1 contains the flag (an integer) on which I want to create the secondary index. Bin 2 contains the data (of type bytes); I'm storing a C++ structure in it.
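To make the Bin 2 layout concrete, here is a minimal sketch of how a trivially copyable C++ structure could be packed into a byte buffer for a bytes bin and read back. The `Payload` struct, its fields, and the `pack`/`unpack` helpers are hypothetical illustrations, not the actual structure or any Aerospike API. A raw copy like this also assumes that the writer and the reader share the same compiler layout and endianness.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical payload; the real structure stored in Bin 2 is not shown here.
struct Payload {
    std::uint32_t id;
    std::uint16_t state;
    std::uint16_t retries;
};

// A raw byte copy is only well-defined for trivially copyable types.
static_assert(std::is_trivially_copyable<Payload>::value,
              "Payload must be trivially copyable to be stored as raw bytes");

// Copy the struct into a byte buffer that could be written to a bytes bin.
std::vector<std::uint8_t> pack(const Payload& p) {
    std::vector<std::uint8_t> buf(sizeof(Payload));
    std::memcpy(buf.data(), &p, sizeof(Payload));
    return buf;
}

// Rebuild the struct from bytes read back from the bin.
// Throws if the blob does not have the expected size.
Payload unpack(const std::vector<std::uint8_t>& buf) {
    if (buf.size() != sizeof(Payload)) {
        throw std::invalid_argument("unexpected blob size");
    }
    Payload p;
    std::memcpy(&p, buf.data(), sizeof(Payload));
    return p;
}
```

Checking the blob size on read is a cheap guard against records written with an older or different version of the structure.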
The use case is that I have to update the data in Bin 2 if it meets certain conditions. The purpose of the flag is to distinguish the records that contain data in Bin 2 from the records that don't have any data. (Please note that the number of records in the set is very high.)
So my API has to fetch the PK and Bin 2 of only those records whose flag is set. After that, I'll do further processing. I'll be updating at most 2 bins in this API (not the whole record).
Is it fine if I index Bin 1? Are there any other recommendations for optimal performance?