Question regarding 'asadm show latency' missing some stats from a few nodes

Hi, on aerospike server version 4.9.0.30, we noticed that for some of the db-nodes, the asadm latency command is missing batch-index metrics. In the aerospike log, there doesn’t seem to be a message about the metric being collected either.

The aerospike config from a working node and non working node appears to be the same.

Output from asadm command:

[root@npcvps1pcar-oame-1 ~]# /opt/tpa/aerospike/bin/asadm -h ${HOSTNAME/oame/db} -e 'show latency'
Seed:        [('npcvps1pcar-db-1', 3000, None)]
Config_file: /root/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~batch-index Latency (2024-02-23 03:02:54 UTC)~~~~~~~~~~~~~~~~
                 Node                 Time   Ops/Sec   %>1Ms   %>8Ms   %>64Ms
                    .                 Span         .       .       .        .
npcvps1pcar-db-2:3000   03:02:38->03:02:48       0.0     0.0     0.0      0.0
Number of rows: 1

~~~~~~~~~~~~~~~~~~~query Latency (2024-02-23 03:02:54 UTC)~~~~~~~~~~~~~~~~~~~
                 Node                 Time   Ops/Sec   %>1Ms   %>8Ms   %>64Ms
                    .                 Span         .       .       .        .
npcvps1pcar-db-1:3000   03:02:39->03:02:49       1.8     0.0     0.0      0.0
npcvps1pcar-db-2:3000   03:02:38->03:02:48       1.8     0.0     0.0      0.0
npcvps1pcar-db-3:3000   03:02:43->03:02:53       1.6     0.0     0.0      0.0
Number of rows: 3

Any info’s appreciated on why this might be happening

The other two nodes just likely wouldn’t have had any batch transactions processed on them since they were last restarted.

Hi, The concern is why it does not have any batch transactions processes. We restarted the db node but still issue is same .

As I checked, batch-index logs are populated in aerospike.log , If it is not present there then “show latency” will not have output to that particular db node in batch-index section.

Now concern is how these logs (batch-index) are printed in aerospike.log file and if it is not printing there where does we can check it to update.

Restarting the node would clear all the statistics, but if you are sure you had batch transactions hit all the nodes, asadm should show the latencies. If you have an Enterprise license, I would suggest reaching out to Support and it should be something easy to get to the bottom of, otherwise, maybe try sharing the log files of the three nodes and the asadm ‘info command’ output along with the output of the show latency (or show latencies which would be the latest, but seems your version still has the ‘singular’ keyword supported).

Thanks. All db nodes are on Aerospike Enterprise Edition build 4.9.0.30. Sample of show latency is shared by [iandntran] . Basically following types of logs are not getting printed in problematic nodes.

Working node

[root@SPS221-db-0 logs]# cat aerospike.log|grep -i batch-index|tail
Mar 11 2024 11:24:43 GMT: INFO (info): (ticker.c:408)    batch-index: batches (800,0,0) delays 0
Mar 11 2024 11:24:43 GMT: INFO (info): (hist.c:240) histogram dump: batch-index (800 total) msec
Mar 11 2024 11:24:53 GMT: INFO (info): (ticker.c:408)    batch-index: batches (800,0,0) delays 0
Mar 11 2024 11:24:53 GMT: INFO (info): (hist.c:240) histogram dump: batch-index (800 total) msec
Mar 11 2024 11:25:03 GMT: INFO (info): (ticker.c:408)    batch-index: batches (800,0,0) delays 0
Mar 11 2024 11:25:03 GMT: INFO (info): (hist.c:240) histogram dump: batch-index (800 total) msec
Mar 11 2024 11:25:13 GMT: INFO (info): (ticker.c:408)    batch-index: batches (800,0,0) delays 0
Mar 11 2024 11:25:13 GMT: INFO (info): (hist.c:240) histogram dump: batch-index (800 total) msec
Mar 11 2024 11:25:23 GMT: INFO (info): (ticker.c:408)    batch-index: batches (800,0,0) delays 0
Mar 11 2024 11:25:23 GMT: INFO (info): (hist.c:240) histogram dump: batch-index (800 total) msec

Show latency on Working node from LAB (showing all DB nodes in command)

[root@SPS221-oame-1 ~]# /opt/tpa/aerospike/bin/asadm -h ${HOSTNAME/oame/db} -e 'show latency'
Seed:        [('SPS221-db-1', 3000, None)]
Config_file: /root/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~batch-index Latency (2024-03-11 11:28:49 UTC)~~~~~~~~~~~~~~
            Node                 Time   Ops/Sec   %>1Ms   %>8Ms   %>64Ms
               .                 Span         .       .       .        .
SPS221-db-0:3000   11:28:33->11:28:43       0.0     0.0     0.0      0.0
SPS221-db-1:3000   11:28:38->11:28:48       0.0     0.0     0.0      0.0
SPS221-db-2:3000   11:28:39->11:28:49       0.0     0.0     0.0      0.0
SPS221-db-3:3000   11:28:30->11:28:40       0.0     0.0     0.0      0.0
Number of rows: 4

Whereas in problematic node these are not printed in show latency and aerospike.log file

[root@npcvps1pcar-oame-1 ~]# /opt/tpa/aerospike/bin/asadm -h ${HOSTNAME/oame/db} -e 'show latency'
Seed:        [('npcvps1pcar-db-1', 3000, None)]
Config_file: /root/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~batch-index Latency (2024-02-23 03:02:54 UTC)~~~~~~~~~~~~~~~~
                 Node                 Time   Ops/Sec   %>1Ms   %>8Ms   %>64Ms
                    .                 Span         .       .       .        .
npcvps1pcar-db-2:3000   03:02:38->03:02:48       0.0     0.0     0.0      0.0
Number of rows: 1

So I am not sure how to check if batch transaction hit is received or not. Also in Lab We do not have traffic …so here there should be no batch transaction hits.

Your views are very much appriciated.

Thanks for the input. I took the liberty of re-formatting a bit for better readability.

Just to be clear, the following log lines are showing that the node has processed 800 batch-index transactions since it was last restarted but is not processing any during the time covered by this log extract…

Mar 11 2024 11:24:43 GMT: INFO (info): (ticker.c:408)    batch-index: batches (800,0,0) delays 0
Mar 11 2024 11:24:43 GMT: INFO (info): (hist.c:240) histogram dump: batch-index (800 total) msec
...
...
Mar 11 2024 11:25:23 GMT: INFO (info): (ticker.c:408)    batch-index: batches (800,0,0) delays 0
Mar 11 2024 11:25:23 GMT: INFO (info): (hist.c:240) histogram dump: batch-index (800 total) msec

Are you sure the other nodes have received transactions since they were started?

Hi Meher,

Thanks you so much for the valuable inputs provided Indeed it is found correct. I have tested it in my lab and found the same behaviour shared by you.

Just one query…Is there any way we can check that the node has received any batch transaction request hit or not ?

Yes, both those lines would only be printed the first time a batch-index transaction hits a node after it has been restarted.

You can also directly get the batch-index statistic (that would likely be more convenient I guess):

Admin> show stat like batch_index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics (2024-03-12 22:44:00 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~
Node                               |jupyter-aerospike-2dexamp-2dctive-2dnotebooks-2dlrihps4b:3000
batch_index_complete               |53                                                           
batch_index_created_buffers        |10                                                           
batch_index_delay                  |0                                                            
batch_index_destroyed_buffers      |0                                                            
batch_index_error                  |0                                                            
batch_index_huge_buffers           |0                                                            
batch_index_initiate               |53                                                           
batch_index_proto_compression_ratio|1.0                                                          
batch_index_proto_uncompressed_pct |0.0                                                          
batch_index_queue                  |0:0,0:0                                                      
batch_index_timeout                |0                                                            
batch_index_unused_buffers         |10                                                           
Number of rows: 13

Admin> 

Or with asinfo:

sandbox:~$ asinfo -v statistics -l | grep batch_index
batch_index_initiate=53
batch_index_queue=0:0,0:0
batch_index_complete=53
batch_index_error=0
batch_index_timeout=0
batch_index_delay=0
batch_index_unused_buffers=10
batch_index_huge_buffers=0
batch_index_created_buffers=10
batch_index_destroyed_buffers=0
batch_index_proto_uncompressed_pct=0.000
batch_index_proto_compression_ratio=1.000
sandbox:~$

Thank you so much Meher.