We are trying to use Aerospike in our production environment. We set up 18 Aerospike nodes, turned on all the writes we usually have, and enabled part of our read requests.
Here are some graphs related to this experiment:
The first graph shows where we increased the read load; we kept it at that level for more than 3 hours. During the test the cluster (18 nodes) was under both write load (~34,000 ops/sec; all values on the graphs are per minute) and read load (~13,500 ops/sec, though our final goal is 416,000 req/sec).
What we see:
Three times the Aerospike cluster stopped responding, with 100% timeouts.
Two nodes have constantly high CPU load (even at a low read rate). We tried re-creating them from scratch, just copying the disk data over, and the behavior is the same. We see that GC is always running on these nodes and eating 90% of CPU (most of it in iowait).
While all nodes were under the same load, we see a significant difference in CPU usage, from 25% to 95% depending on the node.
Our setup runs on AWS. We use m3.2xlarge instances with the shadow device functionality.
service {
    user aerospike
    group disk
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    pidfile /var/run/aerospike/asd.pid
    service-threads 8
    transaction-queues 8
    transaction-threads-per-queue 4
}
logging {
    # Log file must be an absolute path.
    file /var/log/aerospike/aerospike.log {
        context any info
    }
}
network {
    service {
        address any
        port 3000
    }
    heartbeat {
        mode mesh
        port 3002
        mesh-seed-address-port 10.2.8.174 3002 # aerospike-aws-ca-1
        mesh-seed-address-port 10.2.8.91 3002 # aerospike-aws-ca-2
        mesh-seed-address-port 10.2.8.153 3002 # aerospike-aws-ca-3
        mesh-seed-address-port 10.2.8.134 3002 # aerospike-aws-ca-4
        mesh-seed-address-port 10.2.8.244 3002 # aerospike-aws-ca-5
        mesh-seed-address-port 10.2.8.18 3002 # aerospike-aws-ca-6
        mesh-seed-address-port 10.2.8.8 3002 # aerospike-aws-ca-7
        mesh-seed-address-port 10.2.8.5 3002 # aerospike-aws-ca-8
        mesh-seed-address-port 10.2.8.9 3002 # aerospike-aws-ca-9
        mesh-seed-address-port 10.2.8.248 3002 # aerospike-aws-ca-10
        mesh-seed-address-port 10.2.8.109 3002 # aerospike-aws-ca-11
        mesh-seed-address-port 10.2.8.162 3002 # aerospike-aws-ca-12
        mesh-seed-address-port 10.2.8.52 3002 # aerospike-aws-ca-13
        mesh-seed-address-port 10.2.8.25 3002 # aerospike-aws-ca-14
        mesh-seed-address-port 10.2.8.69 3002 # aerospike-aws-ca-15
        mesh-seed-address-port 10.2.8.27 3002 # aerospike-aws-ca-16
        mesh-seed-address-port 10.2.8.200 3002 # aerospike-aws-ca-17
        mesh-seed-address-port 10.2.8.210 3002 # aerospike-aws-ca-18
        interval 150
        timeout 10
    }
    fabric {
        port 3001
    }
    info {
        port 3003
    }
}
namespace !hidden1! {
    replication-factor 1
    memory-size 24G
    default-ttl 0 # never expire/evict.
    high-water-memory-pct 85
    high-water-disk-pct 85
    stop-writes-pct 85
    storage-engine device {
        device /dev/sdb /dev/xvdf
        device /dev/sdc /dev/xvdg
        write-block-size 1M
        defrag-lwm-pct 85
    }
    set !hidden2! {
    }
    set !hidden3! {
    }
    set !hidden4! {
    }
    set !hidden5! {
    }
    set !hidden6! {
    }
    set !hidden7! {
    }
    set !hidden8! {
    }
    set !hidden9! {
    }
    set !hidden10! {
    }
    set !hidden11! {
    }
    set !hidden12! {
    }
}
We only do simple read and write (create/update) operations. The number of queries can be found on the graphs or in the first message.
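Roughly, our operations look like this (a minimal sketch using the Aerospike Python client; the host, set, key, and bin names here are placeholders rather than our real ones, which are hidden in the config above):

```python
import aerospike

# Placeholder host and namespace/set names -- the real ones are hidden above.
config = {'hosts': [('10.2.8.174', 3000)]}
client = aerospike.client(config).connect()

key = ('!hidden1!', 'some_set', 'user:12345')

# Simple create/update: write a small (~300 byte) record.
client.put(key, {'payload': 'x' * 300})

# Simple read of the same record.
(key, meta, record) = client.get(key)

client.close()
```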
The main set contains 1.8 billion objects, 300 bytes in size on average, and most read/write operations go to it.
Not sure what exact data you want from the logs, but it may be helpful to know that we start getting 100% timeouts each time the aerospike process is killed by the OOM killer. We have 30 GB of memory on the boxes, and as you can see from the config we set memory-size to 24G.
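For reference, a quick back-of-envelope on what just the primary index should take per node, assuming the documented ~64 bytes of index metadata per record and our 18-node, replication-factor 1 setup; the process heap, network buffers, and write buffers come on top of the namespace memory-size, which may be what the OOM killer is reacting to:

```python
# Back-of-envelope: primary index memory per node for the main set,
# assuming ~64 bytes of primary-index metadata per record and the
# 18-node, replication-factor 1 setup described above.
records = 1.8e9
index_bytes_per_record = 64
nodes = 18

per_node_gib = records * index_bytes_per_record / nodes / 2**30
print(f"~{per_node_gib:.1f} GiB of primary index per node")  # ~6.0 GiB
```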
If you can list the exact metrics you want, I can prepare graphs for you.
We are now thinking of trying c3.* instance types. A first attempt shows that they work better than the m3.* instances.
This is too high; we typically recommend 50%. This setting has a non-linear write amplification, which can be plotted as 1/(1 - n/100) for n = 0 to 100.
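To make the curve concrete, here is what 1/(1 - n/100) works out to at a few defrag-lwm-pct values (the 85 from the posted config versus the recommended 50):

```python
# Write amplification implied by the defrag low-water mark: 1 / (1 - n/100).
def write_amplification(defrag_lwm_pct):
    return 1.0 / (1.0 - defrag_lwm_pct / 100.0)

for pct in (50, 75, 80, 85):
    print(f"defrag-lwm-pct {pct}: ~{write_amplification(pct):.1f}x write amplification")
# 50 -> 2.0x, 75 -> 4.0x, 80 -> 5.0x, 85 -> 6.7x
```

By this formula, the 85 in the posted config implies roughly 3x the device writes of the recommended 50.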
Just to mention how defrag-lwm-pct affects write performance: on a 7-node cluster at ~50K+ w/s, changing defrag-lwm-pct from 75 to 80 gave a 3x increase in write latency.