Analyzing Writes performance


#1

Synopsis

These are notes regarding write performance troubleshooting through microbenchmarks analysis. This is for server versions prior to 3.9.

For versions 3.9 and above, microbenchmarks can be enabled at the namespace level. Refer to the Monitor Latency page for details.

Details

First, get an idea of the overall writes latency by running the asloglatency tool over a large time window to get an idea of the writes latency fluctuations:

sudo asloglatency -l aerospike.log -f head -e 1 -n 10 -h writes_master

At the bottom, you will get a summary providing the average and maximum for each slice as well as for the throughput. For example:

slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
-------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
04:20:55    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00      0.0
04:21:05    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00      0.0
-------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
     avg         0.07   0.01   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00      5.0
     max        22.32   3.26   0.99   0.40   0.22   0.00   0.00   0.00   0.00   0.00   8535.5

In the above example, there was a burst of writes peaking at 8.5k TPS with latencies going over 1ms for 22.32% of the write transactions. The average was very low (this example is taken from a specific short ran test in a development environment - steady production loads should have avg. and max. fairly aligned with each other).

Note that this histogram is for the time taken for the transaction only on server side. This does not include network time (data over wire) and client processing time.

You can then narrow down on the time window where the high latencies were happening to correlate with specific events at the system or at the application level. (Looking at the log file may be helpful as well, for potential warnings or specific build up on some of the internal queues, or corresponding start/end of cyclic events such as nsup).

The next step would be to enable microbenchmarks and analyze the histograms relevant for writes transactions:

  • Turn on microbenchmarks: microbenchmarks

    asinfo -h [host ip] -v "set-config:context=service;microbenchmarks=true"
    

    Note: Make sure to turn those off after the investigation as the extra logging this will generate may impact performance as well.

  • Analyze a specific time slice, histogram by histogram and identify histograms having high latencies. Let’s take a real example and walk through the histograms. For each histogram, run the following command against a log file where microbenchmarks have been turned on, with the same time window showing high latencies:

    sudo asloglatency -l aerospike.log -f 'Mar 01 2015 22:00:00' -d 3600 -e 1 -n 10 -h <hist_name>
    

    Note: This assumes high latencies between the hours of 22 and 23 (GMT) on March 1st 2015, and we will focus for this example on the 30 seconds around 22:10:00 GMT on that day.

    • writes_master: Overall time for writes transaction to complete. This does show high latencies as there is a large % number of transactions taking more then 1ms to be written, which on this particular workload should not be the case. (Of course, for some use cases, this may be acceptable).
    slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
    -------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
    22:10:07    10  25.55  15.90   5.33   0.40   0.00   0.00   0.00   0.00   0.00   0.00     99.4
    22:10:17    10  22.68  13.28   3.35   0.00   0.00   0.00   0.00   0.00   0.00   0.00     92.6
    22:10:27    10  30.19  22.55   7.16   0.12   0.12   0.12   0.12   0.00   0.00   0.00     83.8
    22:10:37    10  28.52  21.09   9.77   0.67   0.11   0.11   0.11   0.00   0.00   0.00     90.1
    
    • q_wait: Time spent in the main queue (q). In this example, there are no latency related issues. The throughput is much higher then the read throughput as this will also account for other transactions (for example read transactions).
    slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
    -------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
    22:10:07    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   2829.2
    22:10:17    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   3122.1
    22:10:27    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   2713.3
    22:10:37    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   2858.1	
    
    • writes_internal: Time spent executing the transaction on this node (master). For an ‘update’ write operation, this will include the time to read the record before updating it. In this example, the throughput matches the writes_master and again it all seems fine except a 0.11% tail in the 22:10:37 slice. This may be something to look at but not the cause of the large latency numbers observed in the writes_master histogram.
    slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
    -------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
    22:10:07    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     99.4
    22:10:17    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     92.6
    22:10:27    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     83.9
    22:10:37    10   0.11   0.11   0.11   0.11   0.11   0.11   0.11   0.00   0.00   0.00     90.0
    
    • write_storage_close: Time spent writing in the swb (streaming write buffer) in memory. Everything looks fine here.
    slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
    -------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
    22:10:07    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     99.4
    22:10:17    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     92.6
    22:10:27    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     83.9
    22:10:37    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     90.0    
    
    • wt_master_wait_prole: Time spent waiting for the prole write to happen, including network. This histogram pretty much matches the writes_master histogram and points at where the latency is coming from: waiting for prole writes to happen. This could be due to network latencies or internal issues on the nodes handling the prole writes. To check on potential issues on the node(s) handling the prole writes, check the writes_prole for the same time duration on this(those) node(s).
    slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
    -------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
    22:10:07    10  25.15  15.79   5.33   0.40   0.00   0.00   0.00   0.00   0.00   0.00     99.4
    22:10:17    10  22.46  13.28   3.24   0.00   0.00   0.00   0.00   0.00   0.00   0.00     92.6
    22:10:27    10  30.19  22.32   7.04   0.12   0.12   0.12   0.12   0.00   0.00   0.00     83.8
    22:10:37    10  28.19  20.75   9.66   0.55   0.00   0.00   0.00   0.00   0.00   0.00     90.1
    
    • writes_net: After everything is done, time spent to wrap up and put the response back on the wire for the client. Everything looks fine here.
    slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
    -------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
    22:10:07    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     99.4
    22:10:17    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     92.6
    22:10:27    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     83.8
    22:10:37    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     90.1
    
    • writes_prole: On the prole side, time spent doing the write (network not included). This would be for writes transactions initiated on a different node then this one. Everything looks fine here. Unless all other nodes have specific issues (for example at the hardware level), this now strongly suggests the issues as being on the network side, between the nodes.
    slice-to (sec)      1      2      4      8     16     32     64    128    256    512  ops/sec
    -------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ --------
    22:10:07    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     94.4
    22:10:17    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     87.4
    22:10:27    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     87.6
    22:10:37    10   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00     85.0
    
  • In this example, we did narrow down to network issues between the cluster’s node (or at least some of them, and for sure enough to, on average, have a big impact in latencies). Not covered in this article are storage benchmarks which would then provide insight on latencies when accessing the underlying storage (at each device level).


Aerospike Performance issue and Understanding asloglatency tool
Reading Microbenchmarks
Aerospike is not horizontally scaling
"EOF" error and server performance degradation
Microbenchmarks Deprecated? What do I use instead