Hi all Aerospikers ;-) I’m in the process of choosing a NoSQL DB for a new project, and I’m looking for suggestions/ideas on how best to model my problem with Aerospike, which so far seems the best solution.
I mainly have a large number (25 million) of small entries (around 150 GB in total) that reside in RAM (the complete scenario uses a cluster). I already have a working prototype where I index these entries via a secondary index and pipe the results to a map-reduce algorithm (written in Lua + C) to filter them further.
The filtered entries (let’s say less than 1%) are used to retrieve additional data stored on an SSD, to narrow the search even more (applying an algorithm similar to the one used in the first step).
My scenario is 100% read (writes happen only at startup). The first step is a linear search on the secondary index in RAM plus map/reduce; the second step is a (random) lookup on SSD of the data for the keys retrieved in the first step.
So far I’ve been able to model the first namespace as a RAM table and a second namespace as a DEVICE (SSD) table. The first step seems fast enough for my goal. Is there a good way to speed up the second step (reading additional data randomly from SSD), maybe by sending more than one request in parallel? Or maybe by having a single namespace with one table in RAM and another table on SSD? Is that possible? It seems not!
Right now I simply loop over the first-step results, taking each primary key from RAM and using it to retrieve the additional data from the second table, which resides on SSD.
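To make the question concrete, here is a minimal Python sketch of the difference between my current one-by-one loop and the kind of parallel lookup I have in mind. Everything in it is illustrative: `SSD_TABLE` is an in-memory dict standing in for the SSD namespace, and `fetch_additional` is a hypothetical placeholder for a single-record read, not an actual Aerospike client call.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the SSD-backed table: a plain dict (assumption, not real storage).
SSD_TABLE = {f"key-{i}": {"extra": i * 10} for i in range(1000)}

def fetch_additional(primary_key):
    """Hypothetical single-record read; returns None when the key is missing."""
    return SSD_TABLE.get(primary_key)

def fetch_sequential(primary_keys):
    """Current approach: one read at a time, in a loop."""
    return [fetch_additional(pk) for pk in primary_keys]

def fetch_parallel(primary_keys, max_workers=16):
    """Issue the reads concurrently; results keep the order of the input keys."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_additional, primary_keys))

# Keys as they might come out of the first (RAM) step.
first_step_keys = [f"key-{i}" for i in range(0, 1000, 100)]
assert fetch_parallel(first_step_keys) == fetch_sequential(first_step_keys)
```

Whether a thread pool like this, or some batch read offered by the client, is the better fit for random SSD reads is part of what I’d like to understand.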
Is there an entry point somewhere in the documentation that addresses my problem?
Thank you! Angelo