Scan vs query with specified rps, filterExp and includeBinData = false

Hi there!

We are using Aerospike and need to return all user keys (about 2 million are expected) that match a filter (cityName = London), then process them in Java code. I can see two ways to do it:

  1. via scan
ScanPolicy scanPolicy = new ScanPolicy(aerospikeClient.getQueryPolicyDefault());
scanPolicy.filterExp = Exp.build(Exp.eq(Exp.stringBin("cityName"), Exp.val("London")));
scanPolicy.includeBinData = false;
scanPolicy.recordsPerSecond = 50000;

aerospikeClient.scanAll(scanPolicy, aerospikeTemplate.getNamespace(), "users-set", (key, record) -> executeSomeLogic(key.userKey));
  2. via query
QueryPolicy queryPolicy = new QueryPolicy(aerospikeClient.getQueryPolicyDefault());
queryPolicy.filterExp = Exp.build(Exp.eq(Exp.stringBin("cityName"), Exp.val("London")));
queryPolicy.includeBinData = false;

Statement statement = new Statement();
statement.setNamespace(aerospikeTemplate.getNamespace());
statement.setSetName("users-set");

RecordSet result = aerospikeClient.query(queryPolicy, statement);

result.forEach(keyRecord -> executeSomeLogic(keyRecord.key.userKey));

Do these two approaches have the same effectiveness and efficiency? Will the entire namespace be scanned in both cases? Is there any difference between them?

They should be the same, unless there's some subtle code difference I'm not seeing or you have a secondary index (SI). A query that doesn't use an SI is simply executed as a scan. It's a little misleading, isn't it? As for your second question: every record in that namespace will be examined unless you have a set index (see "Set indexes" in the Aerospike documentation). A set index is probably only worth having if you need to scan a very small set in a large namespace (say the set you want has 1K records but the entire namespace holds 1G).
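
To make the set-index trade-off a bit more concrete, here is a toy in-memory model of the two traversal strategies. It is not Aerospike code and says nothing about how the server is implemented internally; the class, the method names, and the sizes (1,000,000 entries, 1,000 of them in the small set) are all made up for illustration. It only counts how many entries get examined with and without a per-set index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model, NOT Aerospike code; every name here is hypothetical.
class SetIndexSketch {

    /** One index entry; only the owning set matters for this model. */
    static final class Entry {
        final int id;
        final String setName;
        Entry(int id, String setName) { this.id = id; this.setName = setName; }
    }

    /** How many entries a traversal examined and how many it returned. */
    static final class Result {
        final long visited;
        final long matched;
        Result(long visited, long matched) { this.visited = visited; this.matched = matched; }
    }

    /** Namespace with `inSmallSet` entries in "small-set" and the rest in "other-set". */
    static List<Entry> buildNamespace(int total, int inSmallSet) {
        List<Entry> entries = new ArrayList<>(total);
        for (int i = 0; i < total; i++) {
            entries.add(new Entry(i, i < inSmallSet ? "small-set" : "other-set"));
        }
        return entries;
    }

    /** Groups entries by set name, standing in for a per-set index. */
    static Map<String, List<Entry>> buildSetIndex(List<Entry> namespace) {
        Map<String, List<Entry>> index = new HashMap<>();
        for (Entry e : namespace) {
            index.computeIfAbsent(e.setName, k -> new ArrayList<>()).add(e);
        }
        return index;
    }

    /** No set index: every entry in the namespace is examined. */
    static Result scanWithoutSetIndex(List<Entry> namespace, String setName) {
        long visited = 0;
        long matched = 0;
        for (Entry e : namespace) {
            visited++;
            if (e.setName.equals(setName)) {
                matched++;
            }
        }
        return new Result(visited, matched);
    }

    /** With a set index: only the requested set's entries are examined. */
    static Result scanWithSetIndex(Map<String, List<Entry>> setIndex, String setName) {
        List<Entry> entries = setIndex.getOrDefault(setName, Collections.emptyList());
        return new Result(entries.size(), entries.size());
    }

    public static void main(String[] args) {
        List<Entry> namespace = buildNamespace(1_000_000, 1_000);
        Result full = scanWithoutSetIndex(namespace, "small-set");
        Result indexed = scanWithSetIndex(buildSetIndex(namespace), "small-set");
        System.out.println("without set index: visited=" + full.visited + ", matched=" + full.matched);
        System.out.println("with set index:    visited=" + indexed.visited + ", matched=" + indexed.matched);
    }
}
```

With these made-up numbers, the first traversal examines 1,000,000 entries to return 1,000, while the second examines only the 1,000 entries of the small set. When the set you want makes up most of the namespace, the two counts are close and the extra index buys little.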

Many thanks for your reply! Yes, as you correctly noticed, there is no SI in the query.