How to retrieve keys that best match a search query

query

#1

I want to be able to retrieve records based on how well the key matches a query string. For example, if my keys are full names (John Paul Smith) and the query string is “John Paul”, I would like to retrieve all records whose keys contain the words “John” and “Paul”. As an added extra, it would be great if the results were ordered by best match: for the query string “John Paul”, “John Paul Smith” would come before “Paul John Jones” because the former better matches the order of the words in the query string.

Is there any native support for searching for records based on which keys best match a query string?


#2

Two items to clarify - (1) Aerospike does not save the record’s key by default. Keys are only saved when the client uses the policy “KEY_SEND” on write. Keys cannot be queried. They are only returned on full database scans.

(2) It is possible to query by value, by setting up a secondary index on a bin. A secondary index on string data supports only equality queries. It does not support text range queries.


#3

Aerospike does not support partial matches on strings natively (without a filter). It’s simply a query we haven’t gotten around to - and yes, I know it’s pretty basic functionality. We continue to move forward… Are you talking specifically about strings, or are you talking about values in general?

Aerospike DOES support range queries on integers. Should be easy to find in TFM (look for “secondary index” and “query”). As wchu mentioned, not on the primary key - on column (bin) values.

Regarding sorting - remember that it’s a distributed system, so each server is returning a response. Building a sorted response requires a running merge sort among all the servers, which is nice from a programming perspective, but involves a lot of latency. Our queries tend to return responses as soon as we get them, as many cases don’t require sorted responses. We’ll build that merge-sort functionality into clients at some point.
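Until then, the merge can be done by hand on the client. Here is a minimal Python sketch of a running merge over per-server results, assuming each server's results already arrive sorted. The node lists and the `merge_sorted_responses` helper are hypothetical and not part of any Aerospike client API; only the standard library's `heapq.merge` is used.

```python
import heapq


def merge_sorted_responses(per_node_results, key=None):
    """Lazily merge result streams that are each already sorted.

    per_node_results: an iterable of sorted iterables, one per server node.
    key: optional sort key, which must match the order used on each node.
    """
    return heapq.merge(*per_node_results, key=key)


# Hypothetical per-node results, each sorted by name on its own node.
node_a = ["John Paul Smith", "Paul John Jones"]
node_b = ["Adam Paul", "John Paul Young"]
node_c = ["Mary John Paul"]

merged = list(merge_sorted_responses([node_a, node_b, node_c]))
print(merged)
```

The merge itself is cheap. The latency comes from the fact that the next item cannot be emitted until every server has produced at least its first result.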

Now - the good news. Aerospike supports any arbitrary query by using scan + filter (or query + filter). In that case you build a UDF for your filter so it executes in-server during the query. Here is an example project with code:

http://www.aerospike.com/launchpad/query_multiple_filters.html
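
For the “John Paul” case specifically, the filter and ranking logic could look roughly like the sketch below. This is plain Python meant to show the idea only: a real in-server filter would be a UDF written in Lua, and the `match_score` and `search` names and the three-level scoring scheme are my own assumptions, not anything Aerospike provides.

```python
def match_score(name, query):
    """Score how well a name matches a query, or return None for no match.

    None: at least one query word is missing from the name.
    2: the query words appear as a contiguous phrase, in order.
    1: the query words appear in order, but with other words in between.
    0: all query words are present, but not in query order.
    """
    name_words = name.lower().split()
    query_words = query.lower().split()
    if not all(word in name_words for word in query_words):
        return None
    n = len(query_words)
    # Contiguous phrase match.
    for i in range(len(name_words) - n + 1):
        if name_words[i:i + n] == query_words:
            return 2
    # In-order (subsequence) match: each lookup consumes the iterator,
    # so later query words must appear after earlier ones.
    remaining = iter(name_words)
    if all(word in remaining for word in query_words):
        return 1
    return 0


def search(names, query):
    """Keep only matching names and order them by best match first."""
    hits = [(match_score(name, query), name) for name in names]
    hits = [(score, name) for score, name in hits if score is not None]
    hits.sort(key=lambda hit: -hit[0])  # stable sort keeps ties in input order
    return [name for _, name in hits]


names = ["Paul John Jones", "John Paul Smith", "John Smith", "John Adam Paul"]
results = search(names, "John Paul")
print(results)
```

Here “John Smith” is dropped because it lacks “Paul”, and “John Paul Smith” ranks ahead of “Paul John Jones”. The filter part is what you would push into the UDF; the final ordering still has to happen on the client once the servers have responded.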

Hope that helps