Chaining multiple filters


#1

Recently I started moving some Java code to Aerospike, and I began by reviewing the query-with-filters sample.

stmt.setFilters(Filter.equal("username", Value.get("Mary")));

resultSet = client.queryAggregate(null, stmt,
    "profile", "check_password", Value.get("ghjks"));

I assume that the client-side filter (username) is executed before the UDF filter (password).

I have a similar case in my code. The filters needed are:

Sequence: Filter1 -> Filter2 -> ResultSet

Filter1: ValueA == “” (matches many records relative to the whole set)

Filter2: ValueB between 0 and 10 (only a small part of the Filter1 result would NOT match)

What would be the best method of chaining such filters? Should the client (Filter.range/equal) or the UDF filter do the largest reduction of the result set (Filter1)? What sequence would offer the best performance?

Or is it possible to apply all filters in the UDF, without a client filter in the queryAggregate call, similar to the following code?

return stream : filter(filter_1) : filter(filter_2) : map(map_record)

Thanks in advance!


#2

There are two filters under discussion here. I will call them (1) the Predicate Filter and (2) the Aggregation Filter.

The Predicate Filter should be the filter that eliminates the most records for the namespace/set in question. Its bin needs a secondary index, and a query that uses that index to narrow down to the appropriate data set is very quick.

The Aggregation Filter is then applied to the result set of the Predicate Filter.

For example, if one is looking for “all females aged 20-23”, it would be better to:

  • Have a secondary index on “age”.
  • Then have the Predicate Filter be “range between 20 and 23” (assuming persons of all ages exist, and that persons aged 20-23 are a small portion of the whole data set).
  • Then have an Aggregation Filter on “gender”, which returns only records with “gender=female”.
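
To make the ordering in the example concrete, here is a minimal sketch in plain Java. It is only a simulation and does not use the Aerospike client: the class TwoStageFilterSketch, the Person type, and the methods predicateFilter and aggregationFilter are hypothetical names, and a TreeMap stands in for the secondary index on “age”. It shows the index lookup touching only the matching ages, with the gender check running on that smaller candidate list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java simulation of the two stages; no Aerospike client calls are used.
class TwoStageFilterSketch {

    // Hypothetical record holding the two bins from the example.
    static final class Person {
        final String name;
        final int age;
        final String gender;

        Person(String name, int age, String gender) {
            this.name = name;
            this.age = age;
            this.gender = gender;
        }
    }

    // Stand-in for a secondary index on "age": age -> records with that age.
    private final TreeMap<Integer, List<Person>> ageIndex = new TreeMap<>();

    void put(Person p) {
        ageIndex.computeIfAbsent(p.age, k -> new ArrayList<>()).add(p);
    }

    // Stage 1 (Predicate Filter): inclusive range lookup on the index.
    // Records outside the range are never visited.
    List<Person> predicateFilter(int low, int high) {
        return ageIndex.subMap(low, true, high, true).values().stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // Stage 2 (Aggregation Filter): runs only on the stage 1 candidates
    // and returns the names of the records that match the gender.
    static List<String> aggregationFilter(List<Person> candidates, String gender) {
        return candidates.stream()
                .filter(p -> p.gender.equals(gender))
                .map(p -> p.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        TwoStageFilterSketch set = new TwoStageFilterSketch();
        set.put(new Person("Ann", 21, "female"));
        set.put(new Person("Bob", 22, "male"));
        set.put(new Person("Cid", 45, "male"));
        set.put(new Person("Dee", 23, "female"));
        set.put(new Person("Eve", 30, "female"));

        List<Person> candidates = set.predicateFilter(20, 23); // Ann, Bob, Dee
        System.out.println(aggregationFilter(candidates, "female")); // prints [Ann, Dee]
    }
}
```

In this sketch the second stage inspects three records instead of five. The more selective the indexed range is, the less work is left for the per-record filter.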

Hope this helps.