Simultaneous Scan of Set

scan

#1

I am trying to scan a set (~3 million entries) from two different threads (for some reason I need to scan the same set twice), but one of the scans fails while the other succeeds. In the log I can see the line: May 10 2017 12:12:27 GMT: WARNING (scan): (scan.c::390) send error - fd 61 sz 370656 rv 192092

For a simultaneous scan of a set with ~100K entries, both scans completed.

Is there a way to make sure the two scans go through smoothly?

I am using Aerospike server 3.6.0 and Java Client 3.1.1


#2

In older versions of scan, if a client is too slow in processing results, so that the scan's socket write sits idle for too long (10 seconds), the scan can be terminated.

With the latest server and client, the write socket timeout is now configurable.

[AER-5510] - (SCAN) Write idle-time-out now configurable from client.


#3

@wchu thanks!

Would you know if there is any way the scan policy’s scanPercent option can help scan the whole set in chunks? Is it possible to scan the set incrementally in batches, say 5 batches of 20%, such that no record appears in more than one batch?


#4

This can be achieved via the recently introduced predicate filtering capability, with a “digest modulo” filter.

http://www.aerospike.com/docs/guide/predicate.html
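To show why a digest modulo gives non-overlapping batches, here is a minimal standalone Java sketch. It makes no Aerospike client calls. The class DigestModuloDemo and its methods are hypothetical names for illustration, SHA-1 is only a stand-in for the server's own record digest, and reducing the last four digest bytes is an assumption about how the modulo is taken. The point is just that every record has exactly one remainder, so 5 passes filtering on remainders 0 to 4 cover the whole set with no record seen twice.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Illustration only: no Aerospike client calls are made here.
final class DigestModuloDemo {

    // ASSUMPTION: SHA-1 over set name + key stands in for the server's record digest.
    public static byte[] digest(String setName, String userKey) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(setName.getBytes(StandardCharsets.UTF_8));
            md.update(userKey.getBytes(StandardCharsets.UTF_8));
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // ASSUMPTION: the batch number is the last four digest bytes, read as an
    // int, reduced modulo the batch count (floorMod keeps it non-negative).
    public static int batchOf(byte[] digest, int batches) {
        int n = digest.length;
        int tail = ((digest[n - 4] & 0xFF) << 24)
                 | ((digest[n - 3] & 0xFF) << 16)
                 | ((digest[n - 2] & 0xFF) << 8)
                 |  (digest[n - 1] & 0xFF);
        return Math.floorMod(tail, batches);
    }

    // Assigns every key to exactly one batch, so batches cannot overlap.
    public static List<List<String>> partition(String setName, List<String> keys, int batches) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < batches; i++) {
            out.add(new ArrayList<>());
        }
        for (String key : keys) {
            out.get(batchOf(digest(setName, key), batches)).add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add("user-" + i);
        }
        List<List<String>> parts = partition("demo", keys, 5);
        int total = 0;
        for (int b = 0; b < parts.size(); b++) {
            System.out.println("batch " + b + ": " + parts.get(b).size() + " keys");
            total += parts.get(b).size();
        }
        System.out.println("total: " + total);
    }
}
```

Note that the batches come out roughly equal rather than exactly 20% each, since the split depends on how the digests happen to fall.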

A C unit test example can be found here -

HTH


#5

Thanks @wchu, I’ll try this out.