How To identify read hotkeys from tcpdump


#1

How To identify read hotkeys from tcpdump

Context

Read hotkeys could be difficult to determine as there are no stats available to track concurrent read request for the same key under normal circumstances. There are planned enhancements to enable detailed logging to facilitate the tracking of hot keys, but this article focuses on the use of TCPdump to look for the Digest Id of a transaction along with linux tools like sort and uniq to get the highest used digest keys.

Capturing the 20-bytes transaction DIGEST ID with tcpdump in real-time:

Based on the pattern 00001504 and converting the tcpdump into HEX we can use the following script to capture the transaction digest in realtime , output to stdout and halt it with a CTRL-C:


sudo tcpdump -n -K -vvv -XX -s0 port 3000 and not port 22 -i any|egrep -v 'IP|length'|cut -d ' ' -f 3-10|xargs|sed -e 's/ //g'|grep -oE '\00001504[[:alnum:]]{40}'

Example:


vagrant@vagrant-ubuntu-trusty-64:~$ sudo tcpdump -n -K -vvv -XX -s0 port 3000 and not port 22 -i any|egrep -v 'IP|length'|cut -d ' ' -f 3-10|xargs|sed -e 's/ //g'|grep -oE '\00001504[[:alnum:]]{40}'
 
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
 
00001504f59124986e96ad175b374c9487945bbcad537b74
 
00001504f59124986e96ad175b374c9487945bbcad537b74

00001504f59124986e96ad175b374c9487945bbcad537b74
 
00001504f59124986e96ad175b374c9487945bbcad537b74
 
00001504f59124986e96ad175b374c9487945bbcad537b74
 
00001504f59124986e96ad175b374c9487945bbcad537b74

Output can also be saved to a file using command line timeout (60 seconds) and tee to a file (hotkeys.txt):

timeout 60 sudo tcpdump -n -K -vvv -XX -s0 port 3000 and not port 22 -i any|egrep -v 'IP|length'|cut -d ' ' -f 3-10|xargs|sed -e 's/ //g'|grep -oE '\00001504[[:alnum:]]{40}'|tee hotkeys.txt

The final step would be to sort the output and use uniq command line tool to get the digest of the keys that have been hit the most:

cat hotkeys.txt|sort|uniq -c|sort -n

Example:

    114 00001504463a09ea5b836a85aa6d3549b6bdbecc44e9fcf2
    114 0000150484c403a3a6a5e4796147200e539818d8e5f31ce5
    114 00001504daf9ff56e8e5df215fbb18c3977c94086fe61eea
    114 00001504ec72093b6a4f8d80ed63b1cb57bfc0ff1c4d8adc
    116 00001504122de28b8362f0c4e88b40b0a0b1307889020a69
    117 00001504a7263d75adcbf319f7d0a98623704cbf7bbd06a9
    123 00001504a7dadfe745cf6163369e84acfcb3a13207950963
    1008 0000150413c7017bbc4fbe2dd25a2ef7aa996e56cfcf9913

Capturing the 20-bytes transaction DIGEST ID from a .pcap file:

The following script can read a tcpdump pcap file in a directory and create an output file containing only the DIGEST ID captured:

Save all pcaps for affected server in a sub-directory and cd to that directory to run the folowing script:

for i in `ls|grep -v dlog`; do tcpdump -XX -s0 -r ${i}|grep -v '>'|cut -d ' ' -f 3-10|tr "\n" " "|sed -e 's/ //g'|grep -oE '\00001504[[:alnum:]]{40}' >> digest.dlog; done;

Use Wireshark and filter by the most used 20 bytes Digest key:

Use the following filter: data.data contains <XX:YY:ZZ:…:… >

where <XX:YY:ZZ:…:… > is a 20 bytes DIGEST

Check the timestamps and confirm that these requests occured in the same time range.

Example:

for i in `ls|grep -v dlog`; do tcpdump -XX -s0 -r ${i}|grep -v '>'|cut -d ' ' -f 3-10|tr "\n" " "|sed -e 's/ //g'|grep -oE '\00001504[[:alnum:]]{40}' >> digest_case_XXXX.dlog; done;
reading from file test.pcap, link-type LINUX_SLL (Linux cooked)
 
   1132 00001504f59124986e96ad175b374c9487945bbcad537b74
   

DIGEST should be the 20 bytes after 00001504.

Use wireshark and filter by the most used key to confirm rate of transactions for :

data.data contains f5:91:24:98:6e:96:ad:17:5b:37:4c:94:87:94:5b:bc:ad:53:7b:74

Notes

Above scripts can be modified to use specific ports and interface. It is recommended to check the digest in Wireshark and confirm the frequency of the transactions qualifies as a hotkey.

Keywords

Read hotkey tcpdump latency

Timestamp

05/15/2017