High latency and cpu is not getting utilised completely

Tamilarasu · June 27, 2024, 6:29am

Hi Team, I am trying out Aerospike for our production use case. When I am hitting the server, the CPU usage does not go above 5%, and because of this latency is very high. Aerospike version - 7.0.0.10. Cluster - 3-node cluster (each node 2 cores / 8 GB RAM). Client - aerospike-client-jdk21:8.1.1

Client

@Configuration
public class Config {

    private static final Host[] hosts = new Host[] {
            new Host("host1", 3000),
            new Host("host2", 3000),
            new Host("host3", 3000)
    };

    @Bean
    public AerospikeClient aerospikeClient() {
        ClientPolicy clientPolicy = new ClientPolicy();
        clientPolicy.connPoolsPerNode = 10; // tried increasing this, no effect
        clientPolicy.maxConnsPerNode = 300; // same here
        return new AerospikeClient(clientPolicy, hosts);
    }
}

@RestController
@RequestMapping("/aerospike")
@Slf4j
public class Controller {

    @Autowired
    private AerospikeClient aerospikeClient;

    private List<String> list = new ArrayList<>();
    private List<Long> readLatencies = new ArrayList<>();

    @PostMapping
    public void write(@RequestBody Payload payload) {
        String primaryKey = String.valueOf(UUID.randomUUID());
        try {
            // Copy the default policy so the shared default is not mutated on every request
            WritePolicy writePolicy = new WritePolicy(aerospikeClient.getWritePolicyDefault());
            writePolicy.socketTimeout = 30000;
            writePolicy.totalTimeout = 30000;
            Key key = new Key("perf", "", primaryKey);
            Bin bin = new Bin("data", payload.getPayload());
            aerospikeClient.put(writePolicy, key, bin);
        } catch (Exception e) {
            log.error("Error while writing to aerospike: {}", primaryKey, e);
        }
    }

    @GetMapping
    public String read(@RequestParam String primaryKey) {
        String data = null;
        try {
            Key key = new Key("perf", "", primaryKey);
            Policy policy = new Policy(aerospikeClient.getReadPolicyDefault());
            policy.socketTimeout = 30000;
            policy.totalTimeout = 30000;
            data = aerospikeClient.get(policy, key).bins.get("data").toString();
        } catch (Exception e) {
            log.error("Error while reading from aerospike: {}", primaryKey, e);
        }
        return data;
    }
}

I tried increasing the request rate to check whether there is any issue with the client config, but the result is still the same. It looks like I need to change configuration on both the client and the server.

Currently getting ~500 TPS to ~800 TPS from the client.

server configs

Tamilarasu · June 27, 2024, 9:59am

Also, in the histogram everything seems to be less than 1 ms, but from JMeter the 95th percentile is 120 ms at ~500 TPS.

pgupta · June 27, 2024, 2:54pm

Histogram latency is at the server.

In your case, it looks like 1 ms is the “service time” (time spent inside the server) and 120 ms is the “response time” (what the client observes end to end). So, the Aerospike server is doing its part. You may want to investigate the rest of the path, getting to Aerospike and back.
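To put rough numbers on that gap, here is a small sketch using Little's Law (average requests in flight = throughput × time in system). It plugs in the figures quoted in this thread and makes two simplifying assumptions: the 120 ms 95th percentile is treated as if it were the average, which overstates the result, and 1 ms is taken as the server-side time even though the histogram only says "less than 1 ms". The class and method names are made up for illustration.

```java
/** Back-of-the-envelope numbers for the 1 ms vs 120 ms gap, using Little's Law. */
class LatencyMath {

    /** Little's Law: average requests in flight = throughput (req/s) * time in system (s). */
    static double requestsInFlight(double throughputPerSec, double timeInSystemMs) {
        return throughputPerSec * timeInSystemMs / 1000.0;
    }

    /** Upper bound on requests per second for one caller that waits for each reply. */
    static double maxTpsPerBlockingCaller(double responseTimeMs) {
        return 1000.0 / responseTimeMs;
    }

    public static void main(String[] args) {
        // Figures from this thread: ~500 TPS, ~120 ms at JMeter, under 1 ms at the server.
        System.out.printf("in flight, end to end: %.1f%n", requestsInFlight(500, 120));
        System.out.printf("in flight, at server:  %.1f%n", requestsInFlight(500, 1));
        System.out.printf("one blocking caller:   %.1f TPS%n", maxTpsPerBlockingCaller(120));
    }
}
```

Under those assumptions, about 60 requests are in flight somewhere between JMeter and the response, but on average no more than about half a request is being serviced inside the whole cluster at any instant. That would be consistent with the servers sitting mostly idle while the time is spent elsewhere on the path.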

Tamilarasu · June 27, 2024, 5:30pm

Yeah, got it. I also thought it must be a delay in my network calls or something else that I need to check. But why am I not seeing high CPU usage even though the service-threads parameter is 10 on each node? Currently the CPU usage is not going above 10%. Ideally it should be close to 100% given the service-threads setting, because I am using 2-core machines for all 3 servers. Is this something to do with the client or server configs?

pgupta · June 27, 2024, 6:00pm

Typically folks worry when CPU usage is high. That can happen if you are using things like Transport Layer Security (TLS), i.e. encryption on client-to-server connections, TLS on fabric, TLS on heartbeat, encryption-at-rest on device data, etc., or any similarly computationally intensive work. You might want to search this forum for “high cpu” related discussions.

Tamilarasu · June 27, 2024, 6:45pm

The requests are not using TLS. Even if they were, my doubt is still why the server is not using all of its threads to read / write records. If CPU usage is low, then we will not be able to get the required TPS: all requests will be submitted to the server and will sit waiting, because it uses so few resources to perform the operations…

PS: I have been stuck on this same doubt for the last 2 days and have searched almost everything, but I am unable to find anything that helps.

Can you check whether the configs are on point on the client and server end? CMIIW on anything.

pgupta · June 27, 2024, 8:45pm

I suspect your issue may be how you have written your application. You may be thinking (again, I am guessing) that your read requests are pipelined on the socket, that the responses are pipelined as well, and that server parallelization would improve throughput. In Aerospike, for reads, a single read transaction happens on a dedicated socket, which is blocked until the response is received back at the client.

So if you are doing individual get() calls in a loop in a single-threaded application, you are sending those to the server sequentially, waiting on each response before sending the next one.

To achieve higher throughput, you have to multi-thread your application or, if your application is amenable to batch reads, try those; they are parallelized by the client library. The batch response is pipelined back to the client library.
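The sequential-versus-multi-threaded point can be illustrated with a minimal, self-contained sketch. It does not talk to Aerospike: `fakeGet` is a hypothetical stand-in that just sleeps for a fixed round-trip time, playing the role of a blocking `aerospikeClient.get(policy, key)` call, and the request counts, delays, and pool size are arbitrary illustration values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BlockingCallDemo {

    /** Hypothetical stand-in for a blocking get(): waits one round trip, then returns. */
    static String fakeGet(int id, long roundTripMs) {
        try {
            Thread.sleep(roundTripMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return "record-" + id;
    }

    /** One caller: each request waits for the previous response. Returns elapsed ms. */
    static long runSequential(int requests, long roundTripMs) {
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            fakeGet(i, roundTripMs);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** The same requests spread over a fixed thread pool. Returns elapsed ms. */
    static long runParallel(int requests, long roundTripMs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                tasks.add(() -> fakeGet(id, roundTripMs));
            }
            long start = System.nanoTime();
            for (Future<String> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any failure from the task
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long seq = runSequential(20, 20);
        long par = runParallel(20, 20, 10);
        System.out.println("sequential ms: " + seq + ", parallel ms: " + par);
    }
}
```

With a single caller the total time grows as requests × round trip, while with a pool the waits overlap, so the parallel run should finish in a fraction of the sequential time. The same reasoning applies to the load generator: throughput is capped by how many requests are actually in flight at once, not by how many service threads the server has.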
