Bulk read and filtering capabilities

Sriram · November 24, 2015, 4:26pm

We are evaluating Aerospike for a mobile advertising use case and have some questions about the C client APIs.

1. There is no real capability to get a partial response. I want to be able to attach SLAs to different batch invocations (a number of keys from different sets within a namespace) and run with whatever I could get within that time. This doesn’t seem to be possible. Every batch invocation has a single global timeout (which can be tweaked), so it is all or nothing. There seems to be no streaming support for responses, although batch invocations take a callback to propagate responses.
2. The Lua capabilities seem to be highly restrictive. Lua modules can be specified when queries are invoked (aerospike_key_apply/as_query_apply etc.). However:

   - Queries can select bins but accept only a single where clause (we want to be able to say select some_bins from ns.set where rate <> 1 and status = 1).
   - The where clause predicate is limited: equality and range checks are the only ones supported, and ‘IN’ clauses, which are useful for us, are not. For instance, select some_bins from ns.set where ids in (a, b, c…)
   - Keys can’t be specified together with a Lua filter in a batch request, so the client has no way of knowing where to direct the request. Hence, it is fanned out to the whole cluster.
   - Data types in predicates can only be strings or ints.
   - Lua modules can be specified when single keys are queried. However, making one invocation per key and filtering just a single-row response each time isn’t scalable.

Our use case is something like this:

a. Pass a set of entity IDs to the Aerospike cluster and get back the fraction of them that satisfies some condition (i.e. bin_1 == something && bin_2 == some_other_thing).

b. Using this set, fetch some more information from another set where bin_1 contains (something from a set that clients will pass) and bin_2 does not contain (something from a set that clients will pass), etc.

We can fetch all this data and do the filtering on the client. However, the sheer volume of data coming back simply saturates the network.
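To make steps (a) and (b) concrete, here is a minimal sketch in plain C of the filtering as it would run on the client today, after the records have already crossed the network. It does not call the Aerospike client at all: `record_t`, `filter_ids`, `in_set` and `filter_by_sets` are hypothetical names, and the bins are assumed to be simple integers (the real bins in step (b) may well be collections).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for a record returned by a batch read. */
typedef struct {
    int64_t id;     /* entity id (the record key) */
    int64_t bin_1;  /* assumed to be an integer bin */
    int64_t bin_2;  /* assumed to be an integer bin */
} record_t;

/* Step (a): write the ids of records with bin_1 == want_1 && bin_2 == want_2
 * into `out` (which must hold at least n entries) and return how many passed. */
size_t filter_ids(const record_t *recs, size_t n,
                  int64_t want_1, int64_t want_2, int64_t *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].bin_1 == want_1 && recs[i].bin_2 == want_2) {
            out[kept++] = recs[i].id;
        }
    }
    return kept;
}

/* 'IN'-style membership test against a client-supplied set of values. */
bool in_set(int64_t value, const int64_t *set, size_t set_len)
{
    for (size_t i = 0; i < set_len; i++) {
        if (set[i] == value) {
            return true;
        }
    }
    return false;
}

/* Step (b): keep records whose bin_1 is in `include` and whose bin_2 is
 * not in `exclude`; returns the number of ids written to `out`. */
size_t filter_by_sets(const record_t *recs, size_t n,
                      const int64_t *include, size_t n_include,
                      const int64_t *exclude, size_t n_exclude,
                      int64_t *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (in_set(recs[i].bin_1, include, n_include) &&
            !in_set(recs[i].bin_2, exclude, n_exclude)) {
            out[kept++] = recs[i].id;
        }
    }
    return kept;
}
```

The predicates themselves are trivial; the cost is that every record has to be shipped to the client before any of them can be discarded, which is the network saturation described above.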

wchu · November 30, 2015, 9:38pm

Your assessment is correct: batch read currently supports neither filtering nor UDFs. The alternative is to make single-record reads, where UDF filtering can be done.
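One way the single-record route could be combined with the SLA requirement from the question is to loop over the keys and stop once a time budget is spent, keeping whatever has been collected so far. The sketch below is only an illustration in plain C: `read_until_deadline` and the two callback types are hypothetical and not part of the Aerospike C client. In real code the `read_one` callback would wrap a single-record UDF call such as aerospike_key_apply, and the clock callback would wrap a monotonic clock. The `fake_*` helpers exist only so the sketch can be exercised without a cluster.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks; neither is part of the Aerospike C client.
 * read_one: reads one record and returns true if it passed the filter.
 * now:      returns the current time in milliseconds. */
typedef bool    (*read_and_filter_fn)(int64_t key, void *ctx);
typedef int64_t (*now_ms_fn)(void *ctx);

/* Issue single-record reads until the keys run out or budget_ms has elapsed.
 * Ids that passed the filter go to `out` (capacity n_keys); the return value
 * is their count. `attempted` (optional) receives the number of keys tried. */
size_t read_until_deadline(const int64_t *keys, size_t n_keys,
                           int64_t budget_ms,
                           read_and_filter_fn read_one, now_ms_fn now,
                           void *ctx, int64_t *out, size_t *attempted)
{
    int64_t start = now(ctx);
    size_t kept = 0;
    size_t i = 0;
    for (; i < n_keys; i++) {
        /* The budget is checked between reads, so one slow read can still
         * overrun it; each read would also need its own timeout. */
        if (now(ctx) - start >= budget_ms) {
            break;
        }
        if (read_one(keys[i], ctx)) {
            out[kept++] = keys[i];
        }
    }
    if (attempted != NULL) {
        *attempted = i;
    }
    return kept;
}

/* Demo stubs: a fake cluster where every read costs a fixed time
 * and only even keys pass the filter. */
typedef struct {
    int64_t now_ms;   /* simulated clock */
    int64_t cost_ms;  /* simulated latency of one read */
} fake_cluster_t;

int64_t fake_now_ms(void *ctx)
{
    return ((fake_cluster_t *)ctx)->now_ms;
}

bool fake_read_even(int64_t key, void *ctx)
{
    fake_cluster_t *c = (fake_cluster_t *)ctx;
    c->now_ms += c->cost_ms;
    return key % 2 == 0;
}
```

This gives partial results under a deadline, but it is sequential and pays one round trip per key, so it does not address the scalability concern raised in the question.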
