Aggregation queries are slow under load




We run certain aggregations (which also involve some mathematical calculations in the filtering phase) against records with an average size of 40 KB. The total record count in Aerospike is close to 850,000. As load increases, the response time degrades severely. A single query takes around 750 ms (total payload close to 750 KB), but under load (15-20 concurrent connections) the response time rises to around 8-10 seconds.

The test rig (the Aerospike installation) is a dual-core machine with 30 GB of RAM. The target namespace has "data-in-memory" set to true. Are there additional performance tuning configurations we should apply, or is this the expected behavior? It really shouldn't be this bad.

Also, do the benchmarks published so far only cover queries that are not CPU-intensive? Even for primary-key get operations, performance degrades heavily under load, with response times increasing from about 100 ms up to 2 seconds.


There is the following query tuning guide that may help:

It's possible that under load your server is hitting a network bottleneck.

Is your network 1G or 10G? What is the network usage during these high-load periods?

You may be able to use a tool like iperf to test your network during load.
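
Before measuring with iperf, a quick back-of-the-envelope estimate can show whether the link is even a plausible suspect. The sketch below is only an approximation: it assumes the "750kb" payload from the question means roughly 750,000 bytes, that all 20 concurrent responses share a single link, and it ignores protocol overhead. The function name `transfer_seconds` is made up for this illustration.

```python
def transfer_seconds(payload_bytes: int, concurrent: int, link_bits_per_sec: float) -> float:
    """Time to push `concurrent` responses of `payload_bytes` each through one link.

    Ignores TCP/IP overhead and any other traffic on the link, so the real
    figure will be somewhat higher.
    """
    total_bits = payload_bytes * 8 * concurrent
    return total_bits / link_bits_per_sec


PAYLOAD_BYTES = 750_000  # assumption: ~750 KB per response, taken from the question
CONCURRENT = 20          # upper end of the 15-20 concurrent connections reported

for name, rate in (("1G", 1e9), ("10G", 10e9)):
    seconds = transfer_seconds(PAYLOAD_BYTES, CONCURRENT, rate)
    print(f"{name}: {seconds:.3f} s")
# prints:
# 1G: 0.120 s
# 10G: 0.012 s
```

If these assumptions hold, pure transfer time is a small fraction of the 8-10 seconds observed, so an actual measurement during load is still needed to confirm or rule out the network.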

How many CPU cores do these nodes have?
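
The core count matters because a CPU-bound query cannot run faster than the cores allow once there are more concurrent queries than cores. As a rough sketch only, assume the whole 750 ms of a single query is CPU time on one core and that the cores are shared evenly among all in-flight queries (neither assumption is confirmed by the numbers in this thread). The helper name `cpu_bound_latency` is hypothetical.

```python
def cpu_bound_latency(service_seconds: float, concurrent: int, cores: int) -> float:
    """Estimated per-query latency when `concurrent` CPU-bound queries share `cores`.

    With no more queries than cores, each query gets a core to itself.
    Beyond that, the cores are assumed to be shared evenly, so latency
    grows in proportion to concurrent / cores.
    """
    if concurrent <= cores:
        return service_seconds
    return service_seconds * concurrent / cores


SERVICE_SECONDS = 0.75  # single-query response time from the question
CORES = 2               # dual-core test machine from the question

for concurrent in (1, 15, 20):
    latency = cpu_bound_latency(SERVICE_SECONDS, concurrent, CORES)
    print(f"{concurrent} concurrent: ~{latency:.2f} s")
```

Under these assumptions, 15-20 concurrent queries on two cores already land in the range of several seconds, the same order of magnitude as the 8-10 seconds reported, which would point at CPU rather than the network. Checking CPU utilization on the node during the load test would confirm or refute this.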