Exists in set query


#1

I need to design a system that checks whether a key exists. I am currently using Aerospike for this, but my use case is not entirely satisfied by it. Each key is a group of 4 integers:

For example:

1-105-3-1133441
222-1000-1-1891893

(clientid-campaignid-nodeid-userid)

I just need to know whether this key exists in the set. If it exists, return 1; otherwise set the key and return 0, so that the next query for it returns 1. This has to be a centralized service, like an Aerospike server, because the checks/sets will happen from multiple servers.
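To make the semantics concrete, here is a minimal single-process sketch of the check-and-set behaviour described above. It is only an illustration, not a replacement for a centralized server: the names `ExistsSet`, `check_and_set` and `parse_key` are made up for this example, and the lock stands in for whatever atomicity the real service would provide.

```python
import threading


class ExistsSet:
    """In-memory set with 'return 1 if present, else store and return 0' semantics."""

    def __init__(self):
        self._keys = set()
        self._lock = threading.Lock()

    def check_and_set(self, client_id, campaign_id, node_id, user_id):
        key = (client_id, campaign_id, node_id, user_id)
        # The lock makes the membership test and the insert one atomic step,
        # so two concurrent callers cannot both get 0 for the same key.
        with self._lock:
            if key in self._keys:
                return 1
            self._keys.add(key)
            return 0


def parse_key(text):
    """Split 'clientid-campaignid-nodeid-userid' into four integers."""
    client_id, campaign_id, node_id, user_id = (int(p) for p in text.split("-"))
    return client_id, campaign_id, node_id, user_id
```

Calling `check_and_set(*parse_key("1-105-3-1133441"))` twice returns 0 the first time and 1 the second time.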

If a campaign is live, the queries will be very dense (up to 20k/s), so the key lookup has to be in memory. But after that, for example if there has been no query in the last 5 minutes, all the data should be kept on disk only, so that memory can be freed up for other campaigns. I do not expect more than 20-30 campaigns to be live.

The next time an old campaign gets a query, the server should bring it back into memory and serve requests from memory. The first query will be slow because it has to kick off a read from disk, but from the next query onwards it will be very fast.
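The behaviour I have in mind could be sketched roughly like this: keep one in-memory set per live campaign, write a campaign to disk once it has been idle for 5 minutes, and reload it on the next query. This is a toy sketch under several assumptions: `CampaignCache`, `evict_idle` and the pickle file layout are hypothetical, it is single-threaded, and keys added since the last eviction would be lost on a crash.

```python
import os
import pickle
import time


class CampaignCache:
    """Per-campaign key sets that move to disk when idle and reload on demand."""

    def __init__(self, data_dir, idle_seconds=300, clock=time.monotonic):
        self.data_dir = data_dir
        self.idle_seconds = idle_seconds
        self.clock = clock          # injectable so the idle timeout can be tested
        self.live = {}              # campaign_id -> set of (client_id, node_id, user_id)
        self.last_access = {}       # campaign_id -> time of the last query

    def _path(self, campaign_id):
        return os.path.join(self.data_dir, f"campaign_{campaign_id}.pkl")

    def _load(self, campaign_id):
        # Slow path: the first query for a cold campaign reads its keys from disk.
        path = self._path(campaign_id)
        if os.path.exists(path):
            with open(path, "rb") as f:
                return pickle.load(f)
        return set()

    def check_and_set(self, client_id, campaign_id, node_id, user_id):
        if campaign_id not in self.live:
            self.live[campaign_id] = self._load(campaign_id)
        self.last_access[campaign_id] = self.clock()
        keys = self.live[campaign_id]
        member = (client_id, node_id, user_id)
        if member in keys:
            return 1
        keys.add(member)
        return 0

    def evict_idle(self):
        """Write campaigns idle for idle_seconds or longer to disk and free their memory."""
        now = self.clock()
        evicted = []
        for campaign_id in list(self.live):
            if now - self.last_access[campaign_id] >= self.idle_seconds:
                with open(self._path(campaign_id), "wb") as f:
                    pickle.dump(self.live[campaign_id], f)
                del self.live[campaign_id]
                del self.last_access[campaign_id]
                evicted.append(campaign_id)
        return evicted
```

`evict_idle()` would have to be called periodically, for example from a timer. A real service would also need locking and a more compact on-disk format than a pickled set for 50 million keys.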

I am using Aerospike for now, but the problem is that Aerospike stores all keys in memory, and it is practically useless to keep old campaigns in memory unless they get triggered again.

The total number of keys is practically unlimited. But let us consider 50 million keys per campaign, and I may need to store data for 100k campaigns, of which fewer than 50 will be live at any given point in time. The amount of old campaign data is constrained only by disk space.
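A rough back-of-the-envelope calculation shows why keeping every key in memory does not work at this scale. The 64 bytes per key is an assumption based on Aerospike's documented primary index entry size; it ignores the replication factor and any record data, so treat the results as lower bounds.

```python
KEYS_PER_CAMPAIGN = 50_000_000
TOTAL_CAMPAIGNS = 100_000
LIVE_CAMPAIGNS = 50
INDEX_BYTES_PER_KEY = 64  # assumption: one primary index entry per key


def index_bytes(campaigns):
    """Index memory needed to hold every key of the given number of campaigns."""
    return campaigns * KEYS_PER_CAMPAIGN * INDEX_BYTES_PER_KEY


total_keys = TOTAL_CAMPAIGNS * KEYS_PER_CAMPAIGN   # 5 * 10**12 keys overall
live_gb = index_bytes(LIVE_CAMPAIGNS) / 1e9        # 160 GB for 50 live campaigns
all_tb = index_bytes(TOTAL_CAMPAIGNS) / 1e12       # 320 TB for all 100k campaigns
```

So the live campaigns alone would fit in memory on a small cluster, while holding the index for all campaigns would not be practical.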


#2

You are storing a composite key with clientid-campaignid-nodeid-userid (a-b-c-d). So the total number of keys is a × b × c × d… correct? What, if any, is the data associated with this key?

What is the expected total number of each of a, b, c and d individually in your problem domain?

When you refer to 100K campaign data, I assume you mean the total number of campaignids = 100,000, of which only 50 are active at a given time. Correct?