Hello, I am looking at the best option to apply SQL query (translated to API format) on the JSON documents stored in an Aerospike bin. The SQL query is dynamic and input by users.
The fields with in the JSON document can be different per record and could grow to around 100 in number. Across all records, the total number of distinct fields (keys) would be in the order of 10Ks.
Example json document)
{ “country” : “USA”, “state” : “California”, “pincode” : “98765” }
A sample SQL query is as given below.
Select * from set where country = “USA” and state = “California” and pincode in (“94912”, “98723”).
Note that the query could use multiple fields from the json document in the where clause.
The set itself could contain millions of records and we are looking at an average response time of 5-10 ms from Aerospike for the select query execution, for around 15K RPS at peak.
A typical response would have around 400 records in the result set.
I understand that secondary indexes can be created on the bin (keys of the Map). We are using Java client and would like to translate the SQL query as it is (if possible) to the API requirements as needed. If we have any examples which solve the above cases in a performant fashion, please share the same. Appreciate if you could also share the storage format, SQL query translation details as well.
If direct query execution (from client machine) is not a viable option, would a UDF based option work for the above performance expectations…? Please let me know.
Thank you!