We are using Aerospike to store a single bin that holds a hash map. Our Java client (3.1.5) queries the Aerospike server (3.5.9) with a batch of 20 keys per request.
We have 100 Java client machines and 20 Aerospike server machines, and the data set size is 20 million, but we are still unable to scale beyond 120K requests/sec. We need to find the bottleneck. We are not hitting the network limit (checked via iperf), our clients are able to generate more load, and server-side CPU and memory utilisation is under 10%. Even so, we cannot get more QPS out of this setup.
Could you help us figure out the root cause?
Server: 20 cores, 53 GB (in memory), 458 GB (SSD), 2000 Mbps (network bandwidth)
Java client: 14 cores, 24 GB (in memory), 1400 Mbps (network bandwidth)
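As a rough sanity check on the numbers above, the per-machine load can be worked out as follows. This is only a sketch: it assumes the 120K figure counts batch requests (not individual key reads) and that load spreads evenly across machines. The class and method names are made up for illustration.

```java
// Back-of-the-envelope load figures for the setup described above.
final class LoadEstimate {
    // Individual key reads per second across the whole cluster.
    static long keyReadsPerSecond(long batchRequestsPerSecond, int keysPerBatch) {
        return batchRequestsPerSecond * keysPerBatch;
    }

    // Even share of a cluster-wide rate across a number of machines.
    static long perMachine(long totalPerSecond, int machines) {
        return totalPerSecond / machines;
    }

    public static void main(String[] args) {
        long batchRps = 120_000; // observed ceiling; ASSUMED to count batch requests
        int keysPerBatch = 20;
        int servers = 20;
        int clients = 100;

        long keyReads = keyReadsPerSecond(batchRps, keysPerBatch);
        System.out.println("key reads/sec (cluster):       " + keyReads);
        System.out.println("key reads/sec per server:      " + perMachine(keyReads, servers));
        System.out.println("batch requests/sec per client: " + perMachine(batchRps, clients));
    }
}
```

Under those assumptions, each server handles about 120K key reads/sec and each client issues about 1,200 batch requests/sec. If the 120K figure already counts individual keys, both numbers are 20 times smaller.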
Aerospike.conf is provided below.
service {
    user root
    service-threads 20
    transaction-queues 20
    transaction-threads-per-queue 3
    proto-fd-max 50000
    nsup-period 43200
    migrate-threads 4
}
logging {
    file /var/log/aerospike/aerospike.log {
        context any info
    }
}
network {
    service {
        address any
        port 3000
    }
    heartbeat {
        mode mesh
        port 3002
        // All 20 machines are mentioned here
        interval 150
        timeout 10
    }
    fabric {
        port 3001
    }
    info {
        port 3003
    }
}
namespace mapping {
    replication-factor 3
    memory-size 40G
    single-bin true
    storage-engine device {
        file /storage/aerospike/mapper.dat
        filesize 250G
        data-in-memory true
    }
    stop-writes-pct 70
    high-water-memory-pct 80
    high-water-disk-pct 80
}
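For reference, the limits implied by the configuration above can be derived as follows. This is a sketch based on our understanding of how Aerospike 3.x interprets these settings: the total number of transaction threads is transaction-queues times transaction-threads-per-queue, and the percentage thresholds apply to memory-size and filesize respectively. The class and method names are hypothetical.

```java
// Derived limits for the aerospike.conf shown above (values copied from it).
final class ConfigThresholds {
    // Total transaction threads = queues x threads per queue (assumed 3.x semantics).
    static int transactionThreads(int queues, int threadsPerQueue) {
        return queues * threadsPerQueue;
    }

    // Integer percentage of a capacity expressed in whole GB.
    static long pctOf(long totalGb, int pct) {
        return totalGb * pct / 100;
    }

    public static void main(String[] args) {
        long memorySizeGb = 40; // memory-size 40G
        long fileSizeGb = 250;  // filesize 250G

        System.out.println("transaction threads:      " + transactionThreads(20, 3));
        System.out.println("stop-writes at (GB):      " + pctOf(memorySizeGb, 70));
        System.out.println("memory high-water (GB):   " + pctOf(memorySizeGb, 80));
        System.out.println("disk high-water (GB):     " + pctOf(fileSizeGb, 80));
    }
}
```

If we read the settings correctly, this gives 60 transaction threads per node, a stop-writes threshold of 28 GB, a memory high-water mark of 32 GB, and a disk high-water mark of 200 GB.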