Your requirements seem to be quite relaxed.
The simpler approach, and the one you should evaluate/benchmark first IMHO, is to use secondary indices with the built-in query() feature. A query only benefits from one secondary index filter, so the highest possible selectivity on that filter is desirable. You can apply more filters later on, but the initial stream of records read from storage is determined only by the set and at most one secondary index filter. (Source: http://www.aerospike.com/launchpad/query_multiple_filters.html, section “Discussion”.)
From what you said, it might be possible for you to have one “set” (a.k.a. table) per client (up to 1024 sets are allowed with AS) plus one further drill-down via a secondary index filter like last_seen. If that gets you from 3M down to 500k candidate records, it’s a huge gain (partitioning ftw). I’d tend to go with a time value for the secondary index, but only you know what’s best for your app.
Assuming you are running on SSDs, you might be able to speed this approach up further by configuring “data-in-memory”, which keeps a copy of your data in RAM that the server can use for that secondary filtering stage. With 200M records of 1K size (?) you’ll end up at roughly 200 GiB. If you keep secondary indices for the 2-3 most commonly used filter bins, that will add another 25 GiB or so. Doable, but it depends on how big your customer records really are.
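For orientation, “data-in-memory” is a per-namespace storage setting. A minimal sketch of what the relevant aerospike.conf stanza could look like (device path and sizes are placeholders, not recommendations; adjust them to your hardware and server version):

    namespace yourNamespace {
        replication-factor 2
        memory-size 256G          # must hold the primary index, secondary indices and the in-memory data copy
        storage-engine device {
            device /dev/nvme0n1   # raw SSD device (placeholder)
            data-in-memory true   # keep a full copy of the data in RAM, persist to SSD
            write-block-size 128K
        }
    }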
However, you’ll have to benchmark this approach with the different configurations and see whether it can fit your time window of 2-4 seconds. I remember once hearing a rule of thumb that queries can evaluate around 200k entries/sec per node on bare metal, but please don’t trust this number at all.
To sum it up, your records would look like this (pseudo names - might be too long!):
Set: ClientX_CustomerSet
Key: Customer_1234567
Bins: {
    "lastseenbin": 123456789,  // each set gets its own secondary index tree on this value
    "agebin": 30,              // ...and maybe another one on this bin too; every likely query should be able to use at least one useful index
    "locationbin": "UK",
    "coat_sizebin": "XL",
    // ... dynamic set of tags and values as bins (per customer)
    // ... and your normal customer data (or look it up in another record via a 'FK' here):
    "name": "John Doe",
    "email": "johndoe@example.org"
}
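For illustration, here is a rough sketch with the Aerospike Python client of how such a record and the secondary index on lastseenbin could be created. All namespace, set, bin and index names are just the placeholders from above:

    import aerospike

    # Connect to the cluster (host is a placeholder).
    config = {'hosts': [('127.0.0.1', 3000)]}
    client = aerospike.client(config).connect()

    # One integer secondary index on the most selective filter bin, per set.
    client.index_integer_create('yourNamespace', 'ClientX_CustomerSet',
                                'lastseenbin', 'ClientX_lastseen_idx')

    # Write one customer record, including the dynamic tag bins.
    key = ('yourNamespace', 'ClientX_CustomerSet', 'Customer_1234567')
    bins = {
        'lastseenbin': 123456789,
        'agebin': 30,
        'locationbin': 'UK',
        'coat_sizebin': 'XL',
        'name': 'John Doe',
        'email': 'johndoe@example.org',
    }
    client.put(key, bins)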
Pseudoquery:
    Aggregate yourFurtherFilterFunction(query params like age=30, location=UK) on yourNamespace.ClientX_CustomerSet where lastseenbin between 120000000 and 199999999
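With the Python client, the same pseudoquery could look roughly like this. yourFurtherFilterFunction would be a Lua stream UDF; I’m assuming here it is registered in a module called filter_udfs, and all other names are placeholders:

    import aerospike
    from aerospike import predicates as p

    config = {'hosts': [('127.0.0.1', 3000)]}
    client = aerospike.client(config).connect()

    # The secondary index narrows the stream to the last_seen range...
    query = client.query('yourNamespace', 'ClientX_CustomerSet')
    query.where(p.between('lastseenbin', 120000000, 199999999))

    # ...and the stream UDF applies the remaining filters server-side.
    query.apply('filter_udfs', 'yourFurtherFilterFunction',
                [{'agebin': 30, 'locationbin': 'UK'}])

    for customer in query.results():
        print(customer)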
Don’t be scared by the Lua code examples in the article linked above. You’ll see it is a perfect fit once you get familiar with the philosophy behind this powerful feature.
An alternative approach is to keep your own kind of index with large ordered lists (“LList”), but this is a lot more work, even though it might perform better in the end. It requires your app to do the bookkeeping manually. You keep records like “ClientX_CustomersAge30”, with one bin that contains a list of all customer IDs (record digests or self-assigned auto-incremented integer UIDs) having that value. When you want to perform a query, you fetch (“scan” or “find_N”) such a list and e.g. merge it with the entries of the list representing all customers with coat size XL. You get the idea. After the filtering has taken place, you batch_get the details of all customers you want to display on page 1. Ranging over lists is possible too, but the sorted large ordered lists need unique keys, so you might have to get creative (like fusing an age value (8 bits) with 56 random bits).
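To make the read path concrete, here is a rough sketch of the intersection and batch-read step in Python. I’m ignoring how the ID lists are maintained and simply assuming each manual index record holds its customer IDs in a plain list bin called 'ids' (all names are made up):

    import aerospike

    config = {'hosts': [('127.0.0.1', 3000)]}
    client = aerospike.client(config).connect()

    def load_id_set(index_record_name):
        # Read one manually maintained index record, e.g. 'ClientX_CustomersAge30'.
        _, _, bins = client.get(('yourNamespace', 'ClientX_Indices', index_record_name))
        return set(bins.get('ids', []))

    # Intersect the per-value ID lists in the app (this is the manual bookkeeping part).
    matching_ids = load_id_set('ClientX_CustomersAge30') & load_id_set('ClientX_CustomersCoatXL')

    # Page 1: batch-read the full customer records for the first 20 matches.
    page = sorted(matching_ids)[:20]
    keys = [('yourNamespace', 'ClientX_CustomerSet', 'Customer_%d' % cid) for cid in page]
    records = client.get_many(keys)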
This approach is somewhat limited in throughput since you get write amplification (you might have to touch 50 lists per change in your case). That’s why I asked about the expected rate of changes… It works better for fairly static data sets. It’ll be a pain to manage from the app logic, but it might allow drastically faster responses and higher query throughput. Not sure when we will have a chance to implement this ourselves and comment on its performance…
But the general answer is: it’s possible to do such things, but it’ll require some effort/creativity with any NoSQL DB whose design hasn’t been fully optimized for these kinds of queries.
Hope this input is helpful. The team might be able to tell you more about the expected throughput of the first approach, or whether I got something wrong from all the doc material out there. With your own approach I see a problem: you’d be mixing secondary index values from all bins/dimensions (?). The biggest issue is that a secondary index can contain either integer or string values, not both at the same time. You’d also have to create a record for every value per customer, which wastes primary index memory (64 bytes per record!); at, say, 8-16 indexed values per customer you’d easily spend 512 B-1 KiB per customer just on index records. Or did I get your idea wrong here?
Cheers,
Manuel