I believe there's something wrong with my Aerospike cluster.
I set up Aerospike servers on 2 VPSes to form a cluster (hardware: 8 x Intel Xeon E5-2630 v2 @ 2.60GHz, 16 GB RAM, SSD); setup was easy and everything looked good.
Then I installed the PHP client on 2 other VPSes (same hardware as the servers), inserted 2 million records of realistic data, modified the client benchmark script, and ran rw-concurrent.sh to test the cluster.
In the script rw-worker.php I disabled all write operations, so the workload is now 100% reads.
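For reference, the worker's read loop now boils down to something like this (a simplified sketch, not the actual rw-worker.php; the set name "bench", the key range, and the host address are placeholders):

<?php
// Simplified read-only worker, using the official Aerospike PHP extension.
$client = new Aerospike(["hosts" => [["addr" => "10.0.0.1", "port" => 3000]]]);
if (!$client->isConnected()) {
    exit("Could not connect to the cluster\n");
}
$failed = 0;
for ($i = 0; $i < 50; $i++) { // -n 50: queries per worker
    $key = $client->initKey("test", "bench", mt_rand(1, 2000000)); // 2M records
    $status = $client->get($key, $record); // reads only; all writes disabled
    if ($status !== Aerospike::OK) {
        $failed++; // shows up as a failed read in the summary
    }
}
$client->close();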
First I ran the test against a single server:
sh rw-concurrent.sh -c 2000 -n 50 run-rw-concurrent.log
This launches 2000 concurrent worker processes, each issuing 50 queries.
The result is:
Writes: 0 with 0 failed (0% success)
Reads: 100000 with 2 failed (99.998% success)
Max runtime: 1.052120923996s, average runtime: 0.050083642959595 total runtime: 100.16728591919s
102000 transactions executed at average 1018.2965332842tps per-process, combined to 96947.031157407tps
Actual runtime: 5.6628000736237s, actual tps: 18012.290505381, actual rps: 353.35875785555rps
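If I read the script's output correctly, "actual tps" is just total transactions divided by wall-clock time (102000 / 5.6628 s ≈ 18012), while "combined" divides by the longest single-process runtime (102000 / 1.0521 s ≈ 96947), so "actual tps" is the number that matters end to end.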
So a single server can deliver about 18k TPS of pure-read capacity. I would expect 2x, or close to 2x, that capacity after adding a second server. So I ran the test against the 2-server cluster with the same script:
sh rw-concurrent.sh -c 2000 -n 50 run-rw-concurrent.log
The result disappointed me:
Writes: 0 with 0 failed (0% success)
Reads: 100000 with 0 failed (100% success)
Max runtime: 1.4297289848328s, average runtime: 0.061245198488236 total runtime: 122.49039697647s
102000 transactions executed at average 832.71833970456tps per-process, combined to 71342.19217912tps
Actual runtime: 5.8575999736786s, actual tps: 17413.275139706, actual rps: 341.60748582894rps
It's almost the same as a single node, actually slightly worse.
I understand that a cluster slows down writes because of data replication, but why does it also slow down reads? Is this an issue with the Aerospike server, or with my configuration?
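In case it's relevant for diagnosing this: each node should serve reads for its own master partitions, so the per-node read counters can be compared with something like the following (assuming the stock asinfo tool; exact statistic names vary by server version):

asinfo -h <node-ip> -v 'statistics' | tr ';' '\n' | grep -E 'stat_read_reqs|client_connections'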
Here’s the configuration of my server:
service {
    user root
    group root
    paxos-single-replica-limit 1
    pidfile /var/run/aerospike/asd.pid
    service-threads 8
    transaction-queues 8
    transaction-threads-per-queue 4
    proto-fd-max 25000
    fabric-workers 8
}

logging {
    # Log file must be an absolute path.
    file /var/log/aerospike/aerospike.log {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        mode multicast
        address 239.254.169.78
        port 9918
        interval 150
        timeout 10
    }

    fabric {
        port 3001
    }

    info {
        port 3003
    }
}

namespace test {
    replication-factor 2
    memory-size 4G
    default-ttl 0 # use 0 to never expire/evict
    storage-engine device {
        file /data/vdc/aerospike/data/test.dat # SSD mounted on /data/vdc
        filesize 16G
    }
}
I also enabled the server microbenchmarks; here's the log from one node of the cluster:
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads (243623 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000243158) (01: 0000000252) (02: 0000000124) (03: 0000000068)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (04: 0000000008) (05: 0000000006) (06: 0000000007)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (4978 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000004625) (01: 0000000197) (02: 0000000068) (03: 0000000012)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (04: 0000000024) (05: 0000000039) (06: 0000000006) (07: 0000000007)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: proxy (7 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000000003) (01: 0000000003) (02: 0000000001)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (4978 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000004625) (01: 0000000197) (02: 0000000068) (03: 0000000012)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (04: 0000000024) (05: 0000000039) (06: 0000000006) (07: 0000000007)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_cleanup (146245 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000146245)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_net (146245 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000146241) (01: 0000000003) (02: 0000000001)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_net (3051 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000003050) (01: 0000000001)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_storage_read (146245 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000146199) (01: 0000000020) (02: 0000000011) (03: 0000000009)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (04: 0000000005) (06: 0000000001)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_storage_open (146245 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000146245)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_tree (146245 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000146244) (03: 0000000001)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_internal (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_internal (3051 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000003049) (01: 0000000002)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_start (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_start (3051 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000003051)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_q_process (146245 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000146245)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_q_process (3051 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000003051)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: q_wait (149303 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000149201) (01: 0000000049) (02: 0000000030) (03: 0000000018)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (05: 0000000002) (06: 0000000003)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: demarshal_hist (176595 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000176300) (01: 0000000169) (02: 0000000083) (03: 0000000035)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (04: 0000000001) (05: 0000000004) (06: 0000000003)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: wt_master_wait_prole (3051 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000002807) (01: 0000000136) (02: 0000000046) (03: 0000000005)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (04: 0000000019) (05: 0000000025) (06: 0000000006) (07: 0000000007)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_prole (2931 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000002930) (03: 0000000001)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_resolve (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_resolve (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: reads_resolve_wait (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: writes_resolve_wait (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: error (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: batch_q_process (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: info_tr_q_process (27310 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000027309) (02: 0000000001)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: info_q_wait (27310 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000027047) (01: 0000000131) (02: 0000000108) (03: 0000000024)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: info_post_lock (27310 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::154) (00: 0000021301) (01: 0000003286) (02: 0000002218) (03: 0000000444)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (04: 0000000052) (05: 0000000009)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: info_fulfill (27311 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000027302) (01: 0000000009)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: write_storage_close (3051 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000003051)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: write_sindex (0 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: defrag_storage_close (3084 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000003084)
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::137) histogram dump: prole_fabric_send (2931 total) msec
Apr 08 2015 12:20:14 GMT: INFO (info): (hist.c::163) (00: 0000002930) (03: 0000000001)
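As far as I can tell from the histograms, 243158 of 243623 reads complete in under 1 ms on the server side, and reads_storage_read is almost entirely in the sub-millisecond bucket as well, so server-side latency doesn't look like the problem.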
I have no idea where the bottleneck is.