Unable to max out CPU in KVM environment

jin_dev · September 9, 2015, 6:52pm

I am currently doing performance benchmarking of Aerospike database in VM environment.

The setup is as follows:

Storage engine - In-memory
Cluster size - 2 nodes
Heartbeat type - Multicast (tried Mesh as well)
# of network interfaces - 2 * 1 Gbps (1 for client-server and another for cluster)
Other config details - the service-threads and transaction-queues parameters match the # of vCPUs available.
Client - Aerospike benchmark utility. Tried with varying numbers of threads, from 20 up to 700. (Tried up to 3 clients with 4 vCPUs each)
Data size - Strings of length 100 and 500
Operation - 100% write

I am running performance tests on an OpenStack public cloud. These are our observations:

No. of vCPUs / TPS / Server-side CPU utilization (%)

2 / 22-23k / 180
4 / 28-29k / 245
8 / 60-62k / 520
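To make the three rows easier to compare, here is a rough sketch that normalizes them. It assumes the CPU figures are top-style percentages where each vCPU contributes 100% (so 8 vCPUs would max out at 800%), and it uses the midpoint of each reported TPS range. The function name `normalize` is only illustrative.

```python
# Figures copied from the table above: (vCPUs, TPS low, TPS high, server CPU %).
observations = [
    (2, 22_000, 23_000, 180),
    (4, 28_000, 29_000, 245),
    (8, 60_000, 62_000, 520),
]


def normalize(vcpus, tps_low, tps_high, cpu_pct):
    """Return (fraction of total CPU capacity used, mid-range TPS per vCPU)."""
    capacity_pct = vcpus * 100  # assumption: 100% per vCPU, as top reports it
    used_fraction = cpu_pct / capacity_pct
    tps_per_vcpu = (tps_low + tps_high) / 2 / vcpus
    return used_fraction, tps_per_vcpu


for row in observations:
    used, per_vcpu = normalize(*row)
    print(f"{row[0]} vCPUs: {used:.0%} of capacity, ~{per_vcpu:,.0f} TPS per vCPU")
```

Under that assumption, the 2-vCPU node uses about 90% of its capacity, while the 4- and 8-vCPU nodes use roughly 61% and 65%.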

Neither the client vCPUs nor the network was the bottleneck.

I also tried a similar setup in my local lab, and the results were similar. In my local setup, I even tried dedicating separate CPUs to network I/O and to Aerospike using the smp_affinity setting and taskset, but the observations did not change.
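For readers unfamiliar with this kind of CPU separation, the sketch below shows the idea. The 2/6 split of an 8-vCPU node is a hypothetical example, not the split used in the test above, and the helper names are made up. The mask helper assumes at most 32 CPUs, since larger systems write smp_affinity as comma-separated 32-bit groups. `os.sched_setaffinity` is Linux-only and has the same effect as `taskset -cp`.

```python
import os


def smp_affinity_mask(cpus):
    """Build the hex bitmask string written to /proc/irq/<N>/smp_affinity."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit i set means CPU i may service the interrupt
    return format(mask, "x")


def split_cpus(ncpus, n_network):
    """Reserve the first n_network CPUs for NIC interrupts, the rest for the server."""
    network = list(range(n_network))
    server = list(range(n_network, ncpus))
    return network, server


def pin_process(pid, cpus):
    """Linux only: restrict a process to the given CPUs, like `taskset -cp`."""
    os.sched_setaffinity(pid, set(cpus))


# Hypothetical split: 2 CPUs for network interrupts, 6 for the database process.
network_cpus, server_cpus = split_cpus(8, 2)
print("IRQ mask:", smp_affinity_mask(network_cpus))
print("server CPUs:", server_cpus)
```

Writing the mask to the IRQ files and pinning the server process both require root, so `pin_process` is defined here but not called.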

Questions:

Apart from “service-threads” and “transaction-queues”, is there any other config parameter that needs to be tweaked?
What else could explain not being able to max out the CPU with a higher number of vCPUs?

sunil · September 11, 2015, 10:02am

In general, there are more factors that may affect VM environments, but I am surprised that you are observing the same behavior in a local (non-VM) setup too.

When we run our benchmarks in a non-VM environment on an 8-CPU machine, we are able to push CPU usage up to 700%. One important question here is what exactly you are accounting for. Do you also include system time? Or are the numbers you quoted (100 - idle%)? If you can share the output of the top command from both the VM and non-VM environments, it will help.
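To illustrate why the accounting matters, here is a minimal sketch that derives both figures from two samples of the aggregate `cpu` line in Linux's /proc/stat. The sample lines are made up for illustration and are not measurements from this thread, and the function names are hypothetical. Note that (100 - idle%) also counts iowait, interrupt, and steal time, and steal can be significant on a VM.

```python
# Column order of the aggregate "cpu" line in /proc/stat (values are in jiffies).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    values += [0] * (len(FIELDS) - len(values))  # older kernels omit some columns
    return dict(zip(FIELDS, values))


def utilization(before, after, ncpus):
    """Return top-style percentages (100% per CPU) under two accounting schemes."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    scale = 100.0 * ncpus / total
    return {
        "user_plus_system": (delta["user"] + delta["nice"] + delta["system"]) * scale,
        "hundred_minus_idle": (total - delta["idle"]) * scale,
        "steal": delta["steal"] * scale,
    }


# Made-up samples taken some interval apart on a hypothetical 8-vCPU node.
t0 = parse_cpu_line("cpu 1000 0 500 8000 100 0 50 0 0 0")
t1 = parse_cpu_line("cpu 3000 0 1500 9000 200 0 250 200 0 0")
for name, pct in utilization(t0, t1, ncpus=8).items():
    print(f"{name}: {pct:.1f}%")
```

The two schemes differ whenever softirq, iowait, or steal time is non-zero, which is why it helps to say which one a quoted number uses.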

Three 4-vCPU clients may not be able to saturate a 2-node cluster of 8-CPU servers. You can try adding more clients and see whether you can push more throughput. Note that on the 2-vCPU server nodes the CPU utilization % is high, and it drops as the number of cores on the server increases. That could also be due to thread context switching.

Can you try a 100% read load too? I see that you are running a 100% write load. Aerospike does synchronous replication over the network, so maybe the replication component is not keeping up. We will know if you run a 100% read load. I expect to see higher CPU utilization with that workload.
