FAQ - How can the performance of queries be monitored?
Within Aerospike there are macro-benchmarks which are written to the aerospike.log in histogram form by default. Within these macro-benchmarks there is a histogram called Query which gives performance of queries running within the cluster at a high level. In certain circumstances, it may be desirable to look at the performance of queries in more detail. What are the options for doing this?
There is a specific set of query microbenchmarks that can be switched on at will. The command to do this is:
asinfo -v 'set-config:context=service;query-microbenchmark=true'
From version 3.9 onwards, microbenchmarks can be enabled at the namespace level. The available histograms are listed here: http://www.aerospike.com/docs/operations/monitor/latency#histograms
For example, the following command enables the read benchmarks for a namespace:
asinfo -v 'set-config:context=namespace;id=<namespaceName>;enable-benchmarks-read=true'
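Both commands above pass a `set-config` string to `asinfo -v`. As a minimal sketch, the helper below builds that string for either context, so that the same setting can be switched back off once the analysis is complete. The function name `set_config_payload` is hypothetical and not part of the Aerospike tooling, the namespace name `test` is a placeholder, and the sketch assumes the parameters accept `false` in the same way they accept `true`.

```shell
#!/bin/sh
# set_config_payload is a hypothetical helper, not an Aerospike tool.
# Usage: set_config_payload <param> <value> [namespace]
# With a namespace the string targets the namespace context,
# otherwise it targets the service context.
set_config_payload() {
    param=$1
    value=$2
    namespace=${3:-}
    if [ -n "$namespace" ]; then
        printf 'set-config:context=namespace;id=%s;%s=%s\n' "$namespace" "$param" "$value"
    else
        printf 'set-config:context=service;%s=%s\n' "$param" "$value"
    fi
}

# Example (requires a running node; "test" is a placeholder namespace name):
#   asinfo -v "$(set_config_payload query-microbenchmark false)"
#   asinfo -v "$(set_config_payload enable-benchmarks-read false test)"
```

Building the string separately from the `asinfo` call keeps the quoting in one place, which avoids the unbalanced-quote mistakes that are easy to make when typing the command by hand.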
Once these are switched on, the following histograms will be written to aerospike.log.
|query_txn_q_wait_us||Histogram to track the time spent in the transaction queue if query_in_transaction is true.|
|query_query_q_wait_us||Histogram to track the time spent waiting in the queue for a query thread. If the query queue is backing up and the CPU is not fully utilized, it may be worth increasing the number of query threads.|
|query_prepare_batch_us||Histogram to track the time spent preparing query batches. Latency here could imply that the secondary index is slow. It could also mean that the query batch (see Notes section) is too big.|
|query_batch_io_q_wait_us||Histogram to track the time spent waiting in the queue for worker threads.|
|query_batch_io_us||Histogram to track the time spent doing I/O per batch. This includes the priority-based sleep after n units of work. When latency is seen here, more than 2 query worker threads are busy and the system is not I/O bound, try increasing the priority. The query threads may be yielding too much.|
|query_net_io_us||Histogram to track the time spent sending results to the client. Latency here implies a network issue or a slow client.|
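To pull only these histograms out of a busy log, a filter along the following lines may help. This is a sketch rather than an official tool: `filter_query_hists` is a hypothetical helper, and the log path in the usage comment is an assumption that should be replaced with the path configured on the node.

```shell
#!/bin/sh
# Histogram names copied from the table above.
QUERY_HIST_PATTERN='query_(txn_q_wait|query_q_wait|prepare_batch|batch_io_q_wait|batch_io|net_io)_us'

# filter_query_hists is a hypothetical helper: it reads log lines on stdin
# and prints only those that mention a query microbenchmark histogram.
filter_query_hists() {
    grep -E "$QUERY_HIST_PATTERN"
}

# Example (the log path is an assumption; use the path configured on the node):
#   filter_query_hists < /var/log/aerospike/aerospike.log | tail -n 20
```

If the bucket counts are logged on the lines that follow the histogram name, adding grep's `-A` option to the helper would include them in the output.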
Notes
- When a query returns n records, it does not process them sequentially but in batches. Each batch is a unit of work with a default size of 100 (controlled by the query-batch-size parameter). Histograms that refer to a batch measure the time taken to prepare the batch, or the time the batch spends waiting in queues or on I/O.
- query-microbenchmark is a dynamic-only parameter: it cannot be included in aerospike.conf and must be set with asinfo as shown above.
- Any histogram whose name ends in _us measures in microseconds, not in milliseconds as is more common in benchmark histograms.
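Because of the unit difference, figures from a _us histogram need dividing by 1000 before they are compared with millisecond histograms. The sketch below shows that arithmetic. Both helper names are hypothetical, and `bucket_upper_us` additionally assumes power-of-two buckets, where bucket n has an upper bound of 2^n units; that layout is an assumption to check against the histogram documentation linked earlier.

```shell
#!/bin/sh
# Both helpers are hypothetical and only illustrate the unit arithmetic.

# Convert a duration in microseconds to milliseconds (three decimals).
us_to_ms() {
    awk -v us="$1" 'BEGIN { printf "%.3f\n", us / 1000 }'
}

# ASSUMPTION: bucket n has an upper bound of 2^n units (power-of-two buckets).
# For a _us histogram the unit is one microsecond.
bucket_upper_us() {
    awk -v n="$1" 'BEGIN { printf "%d\n", 2 ^ n }'
}

# Example: upper bound of bucket 10 in a _us histogram, shown in milliseconds.
#   us_to_ms "$(bucket_upper_us 10)"
```

Under that assumption, the same bucket index covers durations 1000 times shorter in a _us histogram than in a millisecond histogram, so bucket numbers from the two kinds of histogram cannot be compared directly.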
QUERY-MICROBENCHMARKS ANALYSE QUERY