AQL select command encounters a timeout when a scan job on a different set is running


#1

I hit this issue on Aerospike Community Edition 3.8.3.

An AQL select command encounters a timeout when another scan job, which is scanning a different set, is running.

The issue:

aql> select userId from test.dealer
Error: (9) Timeout: timeout=1000 iterations=1 failedNodes=0 failedConns=0

But if I select using a secondary index, as below, it works fine:

aql> select userId from test.dealer where isDel=0
+-----------+
| userId    |
+-----------+
| "aAAAADs" |
| "cwAAAEc" |
| "ZgAAADg" |
| "bAAAAD8" |
| "RAAAAAY" |
+-----------+

5 rows in set (0.001 secs)

I did some simple testing and found that the following conditions together trigger the issue:

  • another scan job (one scanning a different set) is running and has not completed;
  • a select-all on a different set is issued via AQL.

However, querying via a secondary index or the primary key in AQL works fine. If I stop the other scan job, it works fine too, and scanning the same set also works.

If I raise the timeout setting, the select sometimes succeeds, but the performance is not acceptable: this set holds fewer than 10 records, so one second should be more than enough to finish the query.
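For reference, the per-request timeout can be raised from inside the aql shell with the standard `set timeout` session setting (value in milliseconds; 5000 here is just an illustrative value). This only works around the queueing behind the other scan, it does not fix it:

```
aql> set timeout 5000
aql> select userId from test.dealer
```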

So I'm turning to the community to help resolve this issue. Thanks.


#2

Scans can be slow. If waiting for one is not an option, use a secondary index.


#3

But it seems multiple scans on different sets can block each other even when one of the sets has very few records. Is that normal behavior, or do I have a configuration issue? How can I improve it?


#4

Scans have to traverse the entire namespace and all its records; a query does not. You can try increasing scan threads, but scans will still queue up and be processed one at a time. If your intent is to fetch all the records in a set, I have found a query that matches all records to be much faster: in my specific use case, over 33,000 times faster, because I was fetching 23 records out of a namespace with about a billion records. The scan has to iterate through all 1 billion records; the query does not.
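If you do want to experiment with scan parallelism, the service-level `scan-threads` setting on servers of that era could be changed dynamically with asinfo. Treat the parameter name and the value 8 here as assumptions to verify against your server version's configuration reference:

```
asinfo -v 'set-config:context=service;scan-threads=8'
```

Even so, as noted above, scans still queue and run one at a time, so a secondary-index query remains the better fix for small sets.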


#5

One cheap way, if you don't already have a secondary index you could use to grab all records with a filter, is to add a single bin with the integer value 1 to all records. Then you can query that numeric bin for the value 1. If you already have an integer index, just query that bin with a range from Long.MIN_VALUE to Long.MAX_VALUE.
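A sketch of that approach in aql, assuming an added integer bin named `flag` and an index named `flag_idx` (both names are illustrative, not from the thread):

```
aql> CREATE INDEX flag_idx ON test.dealer (flag) NUMERIC
aql> SELECT userId FROM test.dealer WHERE flag = 1
aql> SELECT userId FROM test.dealer WHERE flag BETWEEN -9223372036854775808 AND 9223372036854775807
```

The equality form returns every record that carries the constant bin; the BETWEEN form covers the full signed 64-bit range (Long.MIN_VALUE to Long.MAX_VALUE) and suits a pre-existing integer bin whose values vary.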


#6

Thanks for the clear explanation and suggestions; they are valuable for my project.


#7

No problem. Happy Aerospiking :slight_smile: