Aerospike latency issues


We have Aerospike running on SoftLayer bare-metal machines in a 2-node cluster. Our average profile size is 1.5 KB, and at peak each node handles around 6,000 ops/sec. The latencies are fine: at peak, around 5% of operations exceed 1 ms.

Now we planned to migrate to AWS, so we booted two i3.xlarge machines. We ran the benchmark with the 1.5 KB object size at 3x the load; results were satisfactory, around 4-5% of operations over 1 ms. But when we started actual processing, latencies at peak jumped to 25-30% over 1 ms, and the maximum each node could accommodate was about 5K ops/sec. So we added one more node and ran the benchmark again (4.5 KB object size, 3x load); the results were 2-4% over 1 ms. After adding it to the cluster, the peak came down to 16-22%. We added one more node, and the peak is now at 10-15%.
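For reference, a benchmark run approximating this workload can be sketched with the Aerospike Java client's benchmark tool. This is an assumption about how the load was generated (the post doesn't say which tool was used); the host, port, namespace, and key count below are placeholders:

```shell
# Hypothetical invocation of the Aerospike Java client benchmark script.
# -o S:1500 requests ~1.5 KB string objects; -w RU,50 is a 50/50 read/update mix.
./run_benchmarks -h h1 -p 13000 -n XXXX \
  -k 10000000 \
  -o S:1500 \
  -w RU,50 \
  -z 64 \
  -latency 7,1
```

Sharing the exact invocation used would make the benchmark results easier to compare with the production numbers.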

The version on AWS is aerospike-server-community-; the version on SoftLayer is Aerospike Enterprise Edition 3.6.3.

Our config is as follows:

# Aerospike database configuration file.

service {
  user xxxxx
  group xxxxx
  paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
  pidfile /var/run/aerospike/
  service-threads 8
  transaction-queues 8
  transaction-threads-per-queue 8
  proto-fd-max 15000
}

logging {
  # Log file must be an absolute path.
  file /var/log/aerospike/aerospike.log {
    context any info
  }
}

network {
  service {
    port 13000
    address h1
  }

  heartbeat {
    mode mesh
    port 13001
    address h1
    mesh-seed-address-port h1 13001
    mesh-seed-address-port h2 13001
    mesh-seed-address-port h3 13001
    mesh-seed-address-port h4 13001
    interval 150
    timeout 10
  }

  fabric {
    port 13002
    address h1
  }

  info {
    port 13003
    address h1

namespace XXXX {
  replication-factor 2
  memory-size 27G
  default-ttl 10d
  high-water-memory-pct 70
  high-water-disk-pct 60
  stop-writes-pct 90
  storage-engine device {
    device /dev/nvme0n1
    scheduler-mode noop
    write-block-size 128K
  }
}

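One variable worth isolating is the raw NVMe device itself, since the i3 instance storage may behave differently from the bare-metal drives. A destructive raw-device check with fio (run only on a device with no data you need; the job parameters below are illustrative, chosen to mirror the 128K write-block-size) might look like:

```shell
# WARNING: writes directly to the device and destroys its contents.
fio --name=aerospike-sim --filename=/dev/nvme0n1 --direct=1 \
    --rw=randrw --rwmixread=50 --bs=128k --iodepth=32 \
    --runtime=60 --time_based --group_reporting
```

If the device-level latency distribution already looks worse than on bare metal, the problem is below Aerospike; if it looks clean, the focus shifts to the server and network configuration.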
What should be done to bring down latencies in AWS?

I think the histograms would be a good place to start, but from what you've described I'm not quite sure. One thing I did notice is that you have these thread settings defined; any reason you specifically wanted 8 transaction-threads-per-queue? How are you benchmarking, and can you share more of those results? How does the benchmark test differ from the other latency you reported? Is the app's distance from the cluster being considered?
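To pull those histograms, the Aerospike tools package is the usual route. Exact command syntax and output format vary by server/tools version, so treat these as a sketch:

```shell
# Server-side latency buckets per node (read/write, >1ms / >8ms / >64ms columns).
asadm -e "show latency"

# Replay a named histogram from the server log; here the 'reads' histogram
# over the most recent window (flags may differ between tools versions).
asloglatency -h reads
```

Posting the output of these for a peak-traffic window, alongside the benchmark-run numbers, would make it much easier to see whether the extra latency is in the storage layer, the service threads, or the network.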