AQL select command times out when a scan job for a different set is running

I hit this issue on Aerospike Community Edition 3.8.3.

While another scan job (which is scanning a different set) is running, a plain AQL select times out:

aql> select userId from test.dealer
Error: (9) Timeout: timeout=1000 iterations=1 failedNodes=0 failedConns=0

But if I select using a secondary index, as follows, it works fine:

aql> select userId from test.dealer where isDel=0
+-----------+
| userId    |
+-----------+
| "aAAAADs" |
| "cwAAAEc" |
| "ZgAAADg" |
| "bAAAAD8" |
| "RAAAAAY" |
+-----------+

5 rows in set (0.001 secs)

I did some simple testing and found that the following conditions together cause this issue:

  • another scan job (scanning a different set) is running and has not completed;
  • a select-all on a different set is issued via AQL.

But if I query by secondary index or primary key via AQL, it works fine. If I stop the scan job, it also works fine, and scanning the same set works fine too.

If I increase the timeout setting, the select can succeed, but the performance is not acceptable: this set holds fewer than 10 records, so 1 second should be more than enough to finish the query.

So I'm turning to the community for help. Thanks.

Scans can be slow. If waiting for one is not an option, use a secondary index.

But it seems multiple scans on multiple sets can block each other even when one of the sets has very few records. Is this normal behavior, or do I have a configuration issue? How can I improve it?

Scans have to traverse the entire namespace and all records; a query does not. You can try increasing scan threads, but scans will still queue up and be processed one at a time. If your intent is to fetch all the records in a set, I have found a query that matches all records to be much faster: for my specific use case, over 33,000 times faster. That was because I was fetching 23 records out of a namespace with about a billion records. The scan has to iterate through all 1 billion records; the query does not.
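For reference, on 3.x servers the scan-thread count was a dynamic service parameter that could be raised with asinfo. This is a sketch only: the value 8 is arbitrary, and the parameter name should be verified against the configuration reference for your exact server version:

```
asinfo -v "set-config:context=service;scan-threads=8"
```

As noted above, this adds parallelism within scan processing but does not stop queued scans from being served one at a time.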

One cheap way, if you don't already have a secondary index you could use to grab all records with a filter, is to add a single bin to all records holding the integer "1". Then you can query against this numeric bin for the value "1". If you already have an integer index, just query that bin with a range of Long.MIN_VALUE to Long.MAX_VALUE.
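As a concrete sketch of that workaround in AQL, using the questioner's namespace and set (the bin name flag, the index name allrec_idx, and the pre-existing integer bin someIntBin are made-up names for illustration):

```
CREATE INDEX allrec_idx ON test.dealer (flag) NUMERIC
SELECT userId FROM test.dealer WHERE flag = 1
SELECT userId FROM test.dealer WHERE someIntBin BETWEEN -9223372036854775808 AND 9223372036854775807
```

The range in the last query spans Long.MIN_VALUE to Long.MAX_VALUE, so any integer value in the bin matches; both query forms go through the secondary index instead of queuing behind the running scan.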

Thanks for your clear explanation and suggestions; they are valuable to my project.


No problem. Happy Aerospiking :slight_smile: