How does Aerospike scan data efficiently using a range filter?


#1

One key design choice in Aerospike is hashing the key to distribute requests evenly across all nodes in the cluster. As an HBase user, I think this is a good way to avoid hot spots. But with this design, keys are not physically stored in order, so a scan has to go through all the keys of a set.
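
To make the ordering point concrete, here is a toy sketch of hash-based placement. Aerospike hashes the key into a digest with RIPEMD-160 and maps it to one of 4096 partitions; the sketch below substitutes SHA-1 (available in the Java standard library) and a made-up choice of digest bytes, so the class name, method name, and partition numbers are illustrative assumptions, not what a real cluster computes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Toy model only: shows that consecutive keys do not land in consecutive partitions.
class HashPartitionSketch {
    static final int PARTITIONS = 4096;

    // Hypothetical helper: SHA-1 stands in for the real digest function,
    // and using the first two digest bytes is an assumption of this sketch.
    static int partitionOf(String userKey) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(userKey.getBytes(StandardCharsets.UTF_8));
            int firstTwoBytes = ((digest[0] & 0xFF) << 8) | (digest[1] & 0xFF);
            return firstTwoBytes % PARTITIONS;
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Keys that sort next to each other are printed with their partition ids.
        for (int i = 1; i <= 5; i++) {
            String key = "user" + i;
            System.out.println(key + " -> partition " + partitionOf(key));
        }
    }
}
```

Because the partition depends only on the hash, nothing about the order of the original keys survives, which is why a key range cannot be served by reading one contiguous region.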

But in the Java client API, I found range filter methods like:

  • range(String name, long begin, long end)
  • range(String name, Value begin, Value end)

I'm wondering how a range filter is supported in a hash-keyed store cluster.

  • What do start and end mean here? Do they refer to the original key of the record? How are keys compared? (They need to be compared and sorted into a sequence so that start and end can be used, right?)
  • And how efficient is this range filter? How can it work on a hash-distributed key set?

Thanks!


#2

You’re probably looking for the Query feature: http://www.aerospike.com/docs/guide/query.html

Aerospike supports secondary indexes, which you define on a bin and can then use to run equality queries (int, string) or range queries (int) on bin values.
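
As a rough illustration of why a secondary index makes a range query possible, the sketch below models an index as a sorted map from bin value to record keys. It is a toy model under stated assumptions, not Aerospike's implementation: the class and method names are hypothetical, the data is made up, and the inclusive treatment of both bounds is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: a sorted structure over bin values, independent of how
// the records themselves are placed by primary-key hash.
class SecondaryIndexSketch {
    // bin value -> keys of the records holding that value, sorted by bin value
    private final TreeMap<Long, List<String>> index = new TreeMap<>();

    // Register a record's bin value in the index.
    void put(String recordKey, long binValue) {
        index.computeIfAbsent(binValue, v -> new ArrayList<>()).add(recordKey);
    }

    // Return keys whose bin value lies in [begin, end], both ends inclusive
    // (an assumption of this sketch). Requires begin <= end.
    List<String> range(long begin, long end) {
        List<String> result = new ArrayList<>();
        for (List<String> keys : index.subMap(begin, true, end, true).values()) {
            result.addAll(keys);
        }
        return result;
    }

    public static void main(String[] args) {
        SecondaryIndexSketch ageIndex = new SecondaryIndexSketch();
        ageIndex.put("alice", 31);
        ageIndex.put("bob", 25);
        ageIndex.put("carol", 42);
        ageIndex.put("dave", 31);
        System.out.println(ageIndex.range(30, 40)); // prints [alice, dave]
    }
}
```

The range lookup touches only the index entries between the two bounds, so the comparison happens on bin values and never on the hashed primary keys.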

There are also LDT or Large-Data-Type records that can support range queries on the List LDT. More about that here: http://www.aerospike.com/docs/guide/ldt.html


#3

Got it. So start and end apply to the secondary index, that is, to the bin values, not to the primary index.

Thanks for your answer!


#4