Analysing Logs with the Aerospike Admin Tool

Aerospike_Knowledge · May 6, 2016, 11:01pm

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Install

To install Aerospike Admin either install the Aerospike-Tools package which is bundled in our Server Packages or clone and install from the Github Repository.

Starting asadm with log analyzer mode.

Run asadm -l to get into log analyzer mode.

Description.

This document is for analysing server logs for asadm 0.1.7 (pre-3.12.0 tools) or earlier. In general, you should use the latest asadm which should work on all versions of server logs.

The syntax has been changed. For latest documentation, refer to our official web page: http://www.aerospike.com/docs/tools/asadm/user_guide/log_analyser_mode_guide

1. Adding server logs from a directory.

Admin> add server node /media/5255/
2 server logs added for server analysis.

Admin> list

*************************** CLUSTER ***************************
*************************** SERVER ***************************
1  : node0	/media/5255/tmp/as_log_1460944686.36/aerospike.log
2  : node1	/media/5255/tmp2/as_log_1460945660.84/aerospike.log

Notice how logs from within 5255/tmp/* are added. Doesn’t have to be complete path. Tool is smart enough to pick only the aerospike logs.

You can add logs from 2 different directories in different paths.

Admin> add server day1 '/media/5255/' log '/media/5285/'
2 server logs added for server analysis.
5 server logs added for server analysis.

Admin> list
*************************** CLUSTER ***************************
*************************** SERVER ***************************
1  : bb930a336c90200	/media/5285/bdclogs-03122016/aerospike.log.s01235bdc
2  : bb96152cb05be24	/media/5285/bdclogs-03122016/aerospike.log.s01238bdc
3  : bb99152cb05be24	/media/5285/bdclogs-03122016/aerospike.log.s01236bdc
4  : bb9b152cb05be24	/media/5285/bdclogs-03122016/aerospike.log.s01237bdc
5  : bb9e152cb05be24	/media/5285/bdclogs-03122016/aerospike.log.s01239bdc
6  : day10	/media/5255/tmp/as_log_1460944686.36/1aerospike.log
7  : day11	/media/5255/tmp2/as_log_1460945660.84/aerospike2.log

For log file of server (version >=3.7.1), asadm fetches node id from log and set it as a display name.

2. Listing added server logs.

Admin> list

*************************** CLUSTER ***************************
*************************** SERVER ***************************
1  : node0	/media/5255/tmp/as_log_1460944686.36/aerospike.log
2  : node1	/media/5255/tmp2/as_log_1460945660.84/aerospike.log

The first column is the log file number.

3. Removing the server logs.

Admin> remove server -a '1,2'
Admin> list

*************************** CLUSTER ***************************
*************************** SERVER ***************************

Removes all server logs from the list.

4. Selecting specific log files to analyse.

Admin> select server 1
Admin> list selected

*************************** CLUSTER ***************************
*************************** SERVER ***************************
1  : node0	/media/5255/tmp/as_log_1460944686.36/1aerospike.log

You can select individual server logs to work upon once added. By default operations will be on all server logs unless selected.

5. Grep for strings.

Admin> loggrep server show -i -s 'warning'

************************** cluster **************************
[node0]->::Apr 23 2016 21:38:16 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 12.34.56.78:3002: timed out
[node0]->::Apr 23 2016 21:38:16 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 12.34.56.78:3002: timed out
[node0]->::Apr 23 2016 21:38:27 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 12.34.56.78:3002: timed out

Notice how grep is only working on node0 and not on node1. This is because node1 is not selected.

Once node1 is also selected.

Admin> loggrep server show -i -s 'warning'

************************** cluster **************************
[node0]->::Apr 23 2016 21:41:58 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 12.34.56.78:3002: timed out
[node1]->::Apr 23 2016 21:41:58 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 12.34.56.78:3002: timed out
[node0]->::Apr 23 2016 21:42:08 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 12.34.56.78:3002: timed out
[node1]->::Apr 23 2016 21:42:08 GMT: WARNING (cf:socket): (socket.c::371) Error in delayed connect() to 12.34.56.78:3002: timed out

Example-1: Grep for multiple strings. (OR operation):


Admin> loggrep server show -i -s 'proto' 'clustersize'

************************** cluster **************************
[node0]->::Apr 24 2016 03:12:11 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 26 ::: objects 45473451 ::: sub_objects 0
[node0]->::Apr 24 2016 03:12:11 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (189, 5995, 5806) : hb (50, 50, 0) : fab (354, 354, 0)
[node1]->::Apr 24 2016 03:12:12 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 1 , 0 ) ::: ClusterSize 26 ::: objects 44224075 ::: sub_objects 0
[node1]->::Apr 24 2016 03:12:12 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (193, 6035, 5842) : hb (50, 50, 0) : fab (354, 354, 0)

This will grep for either of the strings proto OR ClusterSize. You can give multiple arguments like this. The number of strings to grep is unlimited.

Example-2: Grep for multiple strings. (AND operation using -a)

Admin>loggrep server show -i -a -s 'trans' 'proto'

************************** cluster **************************
[node1]->::Apr 24 2016 02:59:42 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (188, 5186, 4998) : hb (50, 50, 0) : fab (354, 354, 0)
[node0]->::Apr 24 2016 02:59:51 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (184, 5174, 4990) : hb (50, 50, 0) : fab (354, 354, 0)
[node1]->::Apr 24 2016 02:59:52 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (189, 5199, 5010) : hb (50, 50, 0) : fab (354, 354, 0)
[node0]->::Apr 24 2016 03:00:01 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (183, 5192, 5009) : hb (50, 50, 0) : fab (354, 354, 0)

You can use multiple strings here too.

Example-3: Grep for multiple Unique strings. (using -u)

Admin> loggrep server show -i -u -s 'warning' 'stacktrace'

************************** cluster **************************
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: found 7 frames
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: frame 0: /usr/bin/asd(as_sig_handle_segv+0x59) [0x47edd8]
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: frame 1: /lib64/libc.so.6() [0x36344326a0]
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: frame 2: /lib64/libc.so.6(memcpy+0x15b) [0x36344897cb]
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: frame 3: /usr/bin/asd(cf_queue_pop+0x93) [0x538379]
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: frame 4: /usr/bin/asd(thr_tsvc_slow+0xca) [0x4cdb18]
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: frame 5: /lib64/libpthread.so.0() [0x36348079d1]
[node1]->::Apr 24 2016 00:15:03 GMT: WARNING (as): (signal.c::162) stacktrace: frame 6: /lib64/libc.so.6(clone+0x6d) [0x36344e88fd]

Example-4: Counting the number of occurrences for the grepped string.

Admin> loggrep server count -i -s 'nsup start'

************************** cluster **************************
node0 : Count of ['nsup start'] is : 233
node1 : Count of ['nsup start'] is : 241

Example-5: Grep strings from a specific time-window.

Admin> loggrep server show -i -s 'migrates' -f 'Apr 17 2016 18:11:10' -d 360

************************** cluster **************************
[node0]->::Apr 17 2016 18:11:10 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 4 ::: objects 43549392 ::: sub_objects 0
[node1]->::Apr 17 2016 18:11:10 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 4 ::: objects 43549392 ::: sub_objects 0
[node0]->::Apr 17 2016 18:11:20 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 4 ::: objects 43534319 ::: sub_objects 0
...
...
...
[node0]->::Apr 17 2016 18:17:10 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 4 ::: objects 43395723 ::: sub_objects 0
[node1]->::Apr 17 2016 18:17:10 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 4 ::: objects 43395723 ::: sub_objects 0

-f & -d can take multiple formats. Use help to know more about it.

This can be used with multiple strings too.

Admin> loggrep server show -i -s 'proto' 'clustersize' -f '-01:00:00' -d 3600

************************** cluster **************************
[node0]->::Apr 24 2016 03:18:11 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 26 ::: objects 45443849 ::: sub_objects 0
[node0]->::Apr 24 2016 03:18:11 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (302, 7132, 6830) : hb (50, 50, 0) : fab (354, 354, 0)
[node1]->::Apr 24 2016 03:18:12 GMT: INFO (info): (thr_info.c::4804)  migrates in progress ( 0 , 0 ) ::: ClusterSize 26 ::: objects 44195907 ::: sub_objects 0
[node1]->::Apr 24 2016 03:18:12 GMT: INFO (info): (thr_info.c::4828)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (339, 7959, 7620) : hb (50, 50, 0) : fab (354, 354, 0)

6. Getting diff of consecutive occurrences for a value.

Example-1: Rate of opening and closing connections.

Admin> loggrep server diff -i -s 'fds - proto' -t 10

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0                           .                         |   node1                           .                               
.                   :   Total                        Diff                         |   Total                        Diff                         |
Apr 17 2016 18:11:10:   [2005, 18505616, 18503611]   [2005, 18505616, 18503611]   |   [2005, 18505616, 18503611]   [2005, 18505616, 18503611]   |
Apr 17 2016 18:11:20:   [2005, 18505616, 18503611]   [0, 0, 0]                    |   [2005, 18505616, 18503611]   [0, 0, 0]                    |
Apr 17 2016 18:11:30:   [2005, 18505616, 18503611]   [0, 0, 0]                    |   [2005, 18505616, 18503611]   [0, 0, 0]                    |
Apr 17 2016 18:12:20:   [2004, 18505616, 18503612]   [-1, 0, 1]                   |   [2004, 18505616, 18503612]   [-1, 0, 1]                   |

Example-2: For Objects in the cluster.

Admin> loggrep server diff -i -s 'objects' -t 10

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0        .            |   node1        .            |
.                   :   Total        Diff         |   Total        Diff         |
Apr 17 2016 18:11:10:   [43549392]   [43549392]   |   [43549392]   [43549392]   |
Apr 17 2016 18:11:20:   [43534319]   [-15073]     |   [43534319]   [-15073]     |
Apr 17 2016 18:11:30:   [43533736]   [-583]       |   [43533736]   [-583]       |
...
...

By default slice interval is 60 secs. If not specified, the output will look like below.

Admin> loggrep server diff -i -s 'objects'

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0         .             |   node1         .              |
.                   :   Total         Diff          |   Total         Diff           |
Apr 17 2016 18:11:10:   [261180762]   [261180762]   |   [261180762]   [261180762]    |
Apr 17 2016 18:12:10:   [261027933]   [-152829]     |   [261027933]   [-152829]      |
Apr 17 2016 18:13:10:   [260877103]   [-150830]     |   [260877103]   [-150830]      |
Apr 17 2016 18:14:10:   [260723664]   [-153439]     |   [260723664]   [-153439]      |

This might seem odd because it is adding object count for 60 seconds in this o/p. Please use -t 10 in case you need absolute values. This can further be optimised to view every nth output by using -k option.

Admin> loggrep server diff -i -s 'objects' -t 10 -k 6

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0        .            |   node1        .            |
.                   :   Total        Diff         |   Total        Diff         |
Apr 17 2016 18:11:10:   [43549392]   [43549392]   |   [43549392]   [43549392]   |
Apr 17 2016 18:12:10:   [43522897]   [1834]       |   [43522897]   [1834]       |
Apr 17 2016 18:13:10:   [43498933]   [1853]       |   [43498933]   [1853]       |
Apr 17 2016 18:14:10:   [43471733]   [1857]       |   [43471733]   [1857]       |

This is will give the output from every (tk) interval. In this case 106=60 secs slice.

Example:3 Getting diff from a specific time-window.

Admin> loggrep server diff -i -s 'fds - proto' -t 10 -k 6 -f 'Apr 17 2016 18:11:10' -d 360

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0                        .                            |   node1                        .                            | 
                    :   Total                        Diff                         |   Total                        Diff                         |
Apr 17 2016 18:11:10:   [2005, 18505616, 18503611]   [2005, 18505616, 18503611]   |   [2005, 18505616, 18503611]   [2005, 18505616, 18503611]   |
Apr 17 2016 18:12:10:   [2005, 18505616, 18503611]   [0, 0, 0]                    |   [2005, 18505616, 18503611]   [0, 0, 0]                    |
Apr 17 2016 18:13:10:   [2004, 18505617, 18503613]   [0, 0, 0]                    |   [2004, 18505617, 18503613]   [0, 0, 0]                    |
Apr 17 2016 18:14:10:   [2004, 18505621, 18503617]   [0, 2, 2]                    |   [2004, 18505621, 18503617]   [0, 2, 2]                    |
Apr 17 2016 18:15:10:   [2004, 18505622, 18503618]   [0, 1, 1]                    |   [2004, 18505622, 18503618]   [0, 1, 1]                    |
Apr 17 2016 18:16:10:   [2004, 18505623, 18503619]   [0, 0, 0]                    |   [2004, 18505623, 18503619]   [0, 0, 0]                    |

7. Using loglatency on server logs.

Example-1: Basic.

Admin> loglatency -h writes_master

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0    .        .         .         |   node1    .        .         .         |
.                   :   % >1ms   % >8ms   % >64ms   ops/sec   |   % >1ms   % >8ms   % >64ms   ops/sec   |
Apr 17 2016 18:11:30:   100.00   0.03     0.00      647.6     |   100.00   0.03     0.00      647.6     |
Apr 17 2016 18:11:40:   100.00   0.03     0.00      675.6     |   100.00   0.03     0.00      675.6     |
Apr 17 2016 18:11:50:   100.00   0.00     0.00      676.6     |   100.00   0.00     0.00      676.6     |
Apr 17 2016 18:12:00:   100.00   0.00     0.00      685.3     |   100.00   0.00     0.00      685.3     |

Example-2: Changing the slice interval to every hour.

Admin> loglatency -h writes_master -t 3600

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0    .        .         .         |   node1    .        .         .         |
.                   :   % >1ms   % >8ms   % >64ms   ops/sec   |   % >1ms   % >8ms   % >64ms   ops/sec   |
Apr 17 2016 19:11:20:   100.00   0.01     0.00      480.8     |   100.00   0.01     0.00      480.8     |
Apr 17 2016 20:11:20:   100.00   0.01     0.01      156.3     |   100.00   0.01     0.01      156.3     |
Apr 17 2016 21:11:20:   100.00   0.01     0.00      238.1     |   100.00   0.01     0.00      238.1     |

Example-3: Changing the slice interval to every hour starting from a specific time.

Admin> loglatency -h writes_master -t 3600 -f 'Apr 17 2016 18:11:10'

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0    .        .         .         |   node1    .        .         .         |
.                   :   % >1ms   % >8ms   % >64ms   ops/sec   |   % >1ms   % >8ms   % >64ms   ops/sec   |
Apr 17 2016 19:11:20:   100.00   0.01     0.00      480.8     |   100.00   0.01     0.00      480.8     |
Apr 17 2016 20:11:20:   100.00   0.01     0.01      156.3     |   100.00   0.01     0.01      156.3     |
Apr 17 2016 21:11:20:   100.00   0.01     0.00      238.1     |   100.00   0.01     0.00      238.1     |

Example-4: Changing the number of buckets -b option. Default is 3.

Admin> loglatency -h writes_master -t 900 -d 3600 -b 4

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0    .        .         .          .         |   node1    .        .         .          .         |
.                   :   % >1ms   % >8ms   % >64ms   % >512ms   ops/sec   |   % >1ms   % >8ms   % >64ms   % >512ms   ops/sec   |
Apr 17 2016 18:26:20:   100.00   0.01     0.00      0.00       629.1     |   100.00   0.01     0.00      0.00       629.1     |
Apr 17 2016 18:41:20:   100.00   0.03     0.00      0.00       522.1     |   100.00   0.03     0.00      0.00       522.1     |
Apr 17 2016 18:56:20:   100.00   0.01     0.00      0.00       421.8     |   100.00   0.01     0.00      0.00       421.8     |

Example-4:Show 0-th then every e-th bucket. Using -e option

Admin> loglatency -h writes_master -t 900 -d 3600 -b 4 -e 1

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cluster~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                :   node0    .        .        .        .         |   node1    .        .        .        .         |
.                   :   % >1ms   % >2ms   % >4ms   % >8ms   ops/sec   |   % >1ms   % >2ms   % >4ms   % >8ms   ops/sec   |
Apr 17 2016 18:26:20:   100.00   100.00   0.15     0.01     629.1     |   100.00   100.00   0.15     0.01     629.1     |
Apr 17 2016 18:41:20:   100.00   100.00   0.20     0.03     522.1     |   100.00   100.00   0.20     0.03     522.1     |

Topic	Replies	Views
Aerospike Tools Release 3.8.0 (April 15, 2016) Releases (Server, Client & Tools)	1292	April 21, 2016
Aerospike Tools Release 3.6.2.2 (October 9, 2015) Releases (Server, Client & Tools)	1160	October 16, 2015
Aerospike Tools Release 3.6.2 (October 9, 2015) Releases (Server, Client & Tools)	1465	October 16, 2015
Aerospike Tools Release 3.19.0 (April 17, 2019) Releases (Server, Client & Tools)	578	April 18, 2019
Aerospike Tools Release 3.7.3 (February 4, 2016) Releases (Server, Client & Tools)	1272	April 21, 2016