How to enable XDR tracing dynamically


#1

How to enable XDR tracing dynamically

Context

Aerospike Cross DataCenter Replication (XDR) is one of the most powerful features of Aerospike Enterprise Edition. It allows geographically separate datacenters to replicate records giving near real time access to data on a global basis. XDR already logs aggregate information into log files in a configurable manner, however, at times it might be necessary to see exactly what is happening on a record by record basis. To do this the facility to trace the progress of records through the various sub-systems that make up XDR exists. This is XDR tracing. XDR tracing can be used as of version 3.10 for debugging purposes or to provide useful tuning information.

Method

The command to enable XDR tracing dynamically is as follows:

$ asinfo -v 'set-config:context=xdr;trace-fraction=10000'

In the example above XDR will pick 1 out of every 10,000 records at random and print detailed tracing information into the logs for those records. Sample output would look as follows.

Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5d75: received digest message at 12837
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5cb9: popped off queue at 12837
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5cb9: wrote to disk at 12839
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5cb8: read from disk at 12844
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5cb8: dispatching (1) at 12844
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5d95: scheduling transaction at 12844
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5d95: reading (start) at 12844
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5d95: reading (status: 1) at 12844
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5d95: queueing for REMOTE at 12844
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5d95: shipping to REMOTE at 12844
Sep 28 2016 04:31:01 GMT: INFO (xdr): (xdr.c:210) Tracing record
a872f878 on 5cba: shipped to REMOTE, status 0 at 12850

A key point to note is that the first 4 bytes of the digest of the record (a872f878) is printed within the log line. It is possible that messages for the various records to interleave, the digest can be used as an argument to the grep command to extract log lines for a specific record meaning the timeline can be shown as above.

In the above example, logging starts at time 12,837, which is a millisecond timestamp. This is when the XDR subsystem gets the record digest from the write subsystem. At 12,839 (2 ms later), the digest and some other metadata is written into the digest log. At 12,844 (7 ms later), the digest log entry is read from the digest log and as the entries continue the full process is exposed. Ultimately, at 12,850,the remote cluster has responded that the record was successfully written. From the log entries it can be seen that the whole processing took 12,850 - 12,837 = 13 ms.

To turn this feature off, use the following command:

$ asinfo -v 'set-config:context=xdr;trace-fraction=0'

Notes

  • Info command reference

http://www.aerospike.com/docs/reference/info

Keywords

XDR TRACE DYNAMIC

Timestamp

9/29/16