Sorting a Very Large set repeatedly

izmaya01 · June 17, 2017, 9:16pm

Hi,

I’m developing an application that might potentially store billions of records in a single set on our Aerospike cluster.

A very common use case will be fetching the top n most popular records from the DB. (We can expect such calls almost every second.)

From what I understand, the only way to achieve this is to use a UDF that sorts by popularity and limits the result.

So, my question is what will be the impact on performance? Might this cause Aerospike to hang when sorting so many records repeatedly every few seconds? If yes, is there a better alternative available?

Thanks.

Albot · June 18, 2017, 1:43am

How many different categories of top n records do you need? I’m sure you have already seen http://www.aerospike.com/blog/top-10-aerospike-aggregations/, but I wonder whether you could run that aggregation periodically and store the results back into a map record. Then, when a user wants the current top results, they can retrieve the map record, which is refreshed on a schedule. That way the consumers of the top results don’t each have to run the aggregation; only one program does.
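To make the idea concrete, here is a rough Python sketch of the pattern, not actual Aerospike client code. It assumes popularity is a numeric value on each record. The names `top_n`, `TopNCache` and `fake_scan` are made up for illustration, and the in-memory cache merely stands in for the map record; in a real deployment the refresh step would be the aggregation job writing its result back to Aerospike.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

# Hypothetical record shape for this sketch: (record key, popularity score).
Record = Tuple[str, int]


def top_n(records: Iterable[Record], n: int) -> List[Record]:
    """Return the n most popular records, most popular first.

    heapq.nlargest makes a single pass and holds only n items at a time,
    so the full set is never sorted or loaded into memory.
    """
    return heapq.nlargest(n, records, key=lambda rec: rec[1])


class TopNCache:
    """Stand-in for the 'map record' that holds the precomputed result."""

    def __init__(self, scan: Callable[[], Iterable[Record]], n: int) -> None:
        self._scan = scan      # assumption: yields every record in the set
        self._n = n
        self._result: List[Record] = []

    def refresh(self) -> None:
        # Called by the single periodic job; this is the only expensive step.
        self._result = top_n(self._scan(), self._n)

    def get(self) -> List[Record]:
        # Called by every consumer; a cheap read that never triggers a scan.
        return list(self._result)


def fake_scan() -> List[Record]:
    # Placeholder data instead of a real scan of the set.
    return [("a", 5), ("b", 9), ("c", 1), ("d", 7)]


if __name__ == "__main__":
    cache = TopNCache(fake_scan, n=2)
    cache.refresh()     # run on a schedule by one program
    print(cache.get())  # served to all consumers
```

The trade-off is staleness: consumers see results as of the last refresh, so the refresh interval should match how fresh the top n list needs to be.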
