Monitoring Scans

How should I monitor a scan job?

To list scan jobs across all nodes, use this command:

aql -c "show scans"

For a single node, run this command:

asinfo -v "scan-list"
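
To post-process the single-node output in a script, a minimal Python sketch follows. It assumes the info response lists jobs separated by ';', with each job given as colon-separated key=value fields; check this against the output of your server version before relying on it. The function name parse_scan_list and the sample string are illustrative only, with values borrowed from the example rows in this article.

```python
def parse_scan_list(raw):
    """Split an info response into one dict per scan job.

    Assumed layout (verify on your server version):
    jobs separated by ';', fields by ':', each field as key=value.
    """
    jobs = []
    for chunk in raw.strip().split(";"):
        if not chunk:
            continue
        job = {}
        last_key = None
        for field in chunk.split(":"):
            key, sep, value = field.partition("=")
            if sep:
                job[key] = value
                last_key = key
            elif last_key is not None:
                # A fragment without '=' is treated as the tail of a value
                # that itself contained ':' (for example host:port).
                job[last_key] += ":" + field
        jobs.append(job)
    return jobs


# Hypothetical sample string, not captured from a real server.
sample = (
    "trid=1191023538124365945:job-type=basic:ns=test:set=set5:"
    "status=active(ok):job-progress=98.00:rps=50000:from=172.17.0.2:32950;"
    "trid=8854145309018774112:job-type=basic:ns=test:set=set5:"
    "status=done(abandoned-response-timeout):job-progress=100.00:rps=50000"
)

for job in parse_scan_list(sample):
    print(job["trid"], job["status"])
```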

Here is an example of an active scan job:

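+----------------+--------+-------------+----------------+--------------------+------------------------+--------------+--------+----------+----------+--------+----------------+--------------------+------------------------------------+----------+--------------+---------+-----------------+----------------+--------------------+
| active-threads | ns     | recs-failed | recs-succeeded | recs-filtered-bins | trid                   | job-progress | set    | priority | job-type | module | recs-throttled | recs-filtered-meta | status                             | run-time | net-io-bytes | rps     | time-since-done | socket-timeout | from               |
+----------------+--------+-------------+----------------+--------------------+------------------------+--------------+--------+----------+----------+--------+----------------+--------------------+------------------------------------+----------+--------------+---------+-----------------+----------------+--------------------+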
| "4"            | "test" | "0"         | "195843"       | "0"                | "1191023538124365945"  | "98.00"      | "set5" | "0"      | "basic"  | "scan" | "195843"       | "0"                | "active(ok)"                       | "254526" | "4502992"    | "50000" | "0"             | "30000"        | "172.17.0.2:32950" |

For a finished scan job:

| "0"            | "test" | "0"         | "200000"       | "0"                | "1191023538124365945"  | "100.00"     | "set5" | "0"      | "basic"  | "scan" | "200000"       | "0"                | "done(ok)"                         | "265691" | "4602750"    | "50000" | "5085"          | "30000"        | "172.17.0.2:32950" |

For a failed scan job where the client-side transaction was too slow or was terminated, the status is abandoned-response-timeout:

| "0"            | "test" | "0"         | "34593"        | "0"                | "8854145309018774112"  | "100.00"     | "set5" | "0"      | "basic"  | "scan" | "34593"        | "0"                | "done(abandoned-response-timeout)" | "6372"   | "791398"     | "50000" | "973"           | "30000"        | "172.17.0.2:34598" |
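
The three statuses shown above can be told apart in a script. The following Python sketch parses '|'-delimited rows like the ones in this article and labels each job. The helper names (parse_row, rows_to_jobs, classify) are illustrative, the header is trimmed to a few columns for brevity, and treating every status other than active(...) and done(ok) as a failure is an assumption.

```python
def parse_row(line):
    """Split one '|'-delimited table line into cells, dropping quotes."""
    cells = line.strip().strip("|").split("|")
    return [cell.strip().strip('"') for cell in cells]


def rows_to_jobs(header_line, row_lines):
    """Pair each row's cells with the column names from the header."""
    names = parse_row(header_line)
    return [dict(zip(names, parse_row(row))) for row in row_lines]


def classify(status):
    """Return 'active', 'ok', or 'failed' for a status string.

    Assumption: anything other than active(...) or done(ok) is a failure.
    """
    if status.startswith("active"):
        return "active"
    if status == "done(ok)":
        return "ok"
    return "failed"


# Columns trimmed for brevity; values taken from the rows above.
header = '| ns | recs-succeeded | trid | job-progress | status |'
rows = [
    '| "test" | "195843" | "1191023538124365945" | "98.00"  | "active(ok)" |',
    '| "test" | "200000" | "1191023538124365945" | "100.00" | "done(ok)" |',
    '| "test" | "34593"  | "8854145309018774112" | "100.00" | "done(abandoned-response-timeout)" |',
]

for job in rows_to_jobs(header, rows):
    print(job["trid"], job["status"], classify(job["status"]))
```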

In that situation the client may still have the transaction running, only too slowly. It then gets a java.io.EOFException because the server has already closed the connection. This happens when the client sets its socketTimeout to 0, which causes the server to override it with its default of 10 seconds.

What does an rps of 0 mean?

That means there is no throttling at all. The scan uses as many threads as it can, up to the configured single-scan-threads limit. If no rps is specified, it defaults to 0 and the scan runs as quickly as possible.
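
As a rough illustration of what a non-zero rps implies, the Python sketch below computes the minimum time a throttled scan needs. It assumes the limit applies to the record count you pass in (for example, the records held by one node); the function name min_scan_seconds is hypothetical.

```python
def min_scan_seconds(records, rps):
    """Lower bound on scan duration implied by an rps limit.

    Returns None when rps is 0: the scan is not throttled, so the
    limit implies no minimum duration.
    """
    if records < 0 or rps < 0:
        raise ValueError("records and rps must be non-negative")
    if rps == 0:
        return None
    return records / rps


# 200000 records at rps=50000 cannot finish in under 4 seconds.
print(min_scan_seconds(200000, 50000))
# rps=0 means unthrottled, so no lower bound is implied.
print(min_scan_seconds(200000, 0))
```

This is only a lower bound: the factors listed in the next answer can make the actual run time much longer.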

Why does the scan appear to be slow even when rps is 0?

There are many factors that can contribute to a slow running scan:

  • Disk I/O bottleneck.
  • A slow client that cannot consume the returned data fast enough slows down the scan. For example, you can artificially slow down the callback function in Java and monitor the scan job.
  • The data being scanned is too sparse (for example, when scanning a small set in a large namespace).
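
To check whether a scan is limited by its rps setting or by one of the factors above, you can compare the observed record rate with the configured limit. The Python sketch below does this with the recs-succeeded, run-time, and rps values from the finished job shown earlier. It assumes run-time is reported in milliseconds; the function names are illustrative.

```python
def observed_rps(recs_succeeded, run_time_ms):
    """Average records per second over the job's run time.

    Assumption: run_time_ms is the run-time column, in milliseconds.
    """
    if run_time_ms <= 0:
        raise ValueError("run_time_ms must be positive")
    return recs_succeeded * 1000.0 / run_time_ms


def limit_utilization(recs_succeeded, run_time_ms, rps_limit):
    """Fraction of the rps limit actually reached, or None if rps is 0."""
    if rps_limit == 0:
        return None
    return observed_rps(recs_succeeded, run_time_ms) / rps_limit


# Values from the finished job: 200000 records, run-time 265691, rps 50000.
rate = observed_rps(200000, 265691)
share = limit_utilization(200000, 265691, 50000)
print(f"observed rate: {rate:.1f} records/s")
print(f"share of rps limit used: {share:.1%}")
```

An observed rate far below the limit, as in this example, suggests that the throttle is not what is slowing the scan.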

Keywords

SCAN JOB STATUS RPS

Timestamp

July 2020
