I am currently doing performance benchmarking of Aerospike database in VM environment.
Following is the setup:
- Storage-engine - In-memory
- Cluster size - 2 nodes
- Heartbeat type - Multicast (tried Mesh as well)
- # of network interfaces - 2 * 1 Gbps (one for client-server traffic, another for the cluster)
- Other config details - the service-threads and transaction-queues parameters match the number of vCPUs available.
- Client - Aerospike benchmark utility, with thread counts varying from 20 up to 700. Data size - strings of length 100 and 500. (Tried up to 3 clients with 4 vCPUs each)
- Operation - 100 % Write
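For reference, the relevant part of my aerospike.conf looks roughly like the sketch below (values shown for an 8-vCPU node; the numbers are illustrative, and transaction-threads-per-queue is an additional related parameter available in server versions of this era):

```
service {
    # One service thread per vCPU, as described above
    service-threads 8

    # One transaction queue per vCPU, as described above
    transaction-queues 8

    # Additional knob: worker threads per transaction queue
    # (value here is illustrative, not from my actual setup)
    transaction-threads-per-queue 4
}
```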
I am running these tests on an OpenStack public cloud. Following are our observations:
No of vCPUs / TPS / Server Side CPU utilization (%)
- 2 / 22-23k / 180
- 4 / 28-29k / 245
- 8 / 60-62k / 520
Neither the client vCPUs nor the network was the bottleneck.
I also tried a similar setup in my local lab, where the results were similar. There, I even tried separating the CPUs used for network I/O from those used by Aerospike via the smp_affinity setting and taskset, but the observations did not change.
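For concreteness, the kind of CPU separation I tried looked like the sketch below (the IRQ number and CPU ranges are illustrative, not the actual values from my setup):

```
# Route the NIC's interrupts to CPUs 0-1 (bitmask 0x3); IRQ 24 is illustrative
echo 3 | sudo tee /proc/irq/24/smp_affinity

# Pin the Aerospike daemon (asd) to the remaining CPUs 2-7
sudo taskset -cp 2-7 "$(pgrep -x asd)"
```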
Questions -
- Apart from "service-threads" and "transaction-queues", is there any other config parameter that needs to be tweaked?
- What could be the other reasons for not being able to max out the CPU at higher vCPU counts?