Aerospike 4.5 to latest version upgrade suggestion


We are using Aerospike server version (some servers on & also) on an 8-node bare-metal cluster. We primarily use batch queries, plus a few other operations periodically (write, scan, secondary index query, operate, etc.). The current batch read throughput is approximately 1M TPS.

We often face timeout issues: writes and reads start to fail from the Java and Go clients, and the connection count on the servers shoots up (30-70k). We have a strict latency requirement of < 30 ms for batch queries (usually 5-6 keys per query, spread across different sets).

What options do I have to improve performance? Can I expect a major performance improvement from upgrading to a more recent server version (say 6+, or the latest 7+) if there is no other option on the current version? If so, what steps and risks are involved?

Java client (v4.4.18)

ClientPolicy clientPolicy = new ClientPolicy();
clientPolicy.eventLoops = new NioEventLoops(eventPolicy, eventLoopGroup);
clientPolicy.maxConnsPerNode = 300;

BatchPolicy batchPolicy = new BatchPolicy();
batchPolicy.socketTimeout = 30; //30ms
batchPolicy.totalTimeout = 60; //60ms
batchPolicy.maxRetries = 0; // No retry
batchPolicy.timeoutDelay = 60; // after a socket read timeout, try to recover the socket in the background instead of closing it
batchPolicy.replica = Replica.MASTER_PROLES; // Spread load between master and replica

// large set with 1-2M records and a secondary index
QueryPolicy queryPolicy = new QueryPolicy();
queryPolicy.maxConcurrentNodes = 1;
queryPolicy.recordQueueSize = 10000;
queryPolicy.socketTimeout = 370000; //370s

clientPolicy.queryPolicyDefault = queryPolicy;
clientPolicy.batchPolicyDefault = batchPolicy;
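
As a rough sanity check of the client settings above against the numbers in this post, here is a minimal sketch in plain Java (no Aerospike dependency). The class and method names are hypothetical. It assumes that totalTimeout bounds how long the client waits for a batch call and that 0 means no limit. It also assumes, purely for illustration, that every connection comes from a Java client capped at maxConnsPerNode = 300, which is not quite true here because Go clients connect as well.

```java
public class ClientSettingsCheck {

    /**
     * Returns true if a call may run longer than the latency budget before
     * the client gives up. A totalTimeout of 0 is treated as "no time limit".
     */
    public static boolean canExceedBudget(int totalTimeoutMs, int budgetMs) {
        return totalTimeoutMs == 0 || totalTimeoutMs > budgetMs;
    }

    /**
     * Lower bound on the number of client instances whose per-node pools
     * must be full to account for the connections observed on one server.
     */
    public static long minClientsAtCap(long observedConnsOnNode, int maxConnsPerNode) {
        // Ceiling division: a partially filled pool still needs one more client.
        return (observedConnsOnNode + maxConnsPerNode - 1) / maxConnsPerNode;
    }

    public static void main(String[] args) {
        int totalTimeoutMs = 60;   // batchPolicy.totalTimeout from the post
        int budgetMs = 30;         // stated latency requirement
        int maxConnsPerNode = 300; // clientPolicy.maxConnsPerNode from the post

        System.out.println("Can exceed 30 ms budget: "
                + canExceedBudget(totalTimeoutMs, budgetMs));
        System.out.println("Min clients at cap for 30k conns: "
                + minClientsAtCap(30_000, maxConnsPerNode));
        System.out.println("Min clients at cap for 70k conns: "
                + minClientsAtCap(70_000, maxConnsPerNode));
    }
}
```

Under these assumptions, a totalTimeout of 60 ms lets a batch call outlive the 30 ms requirement, and 30-70k connections on one server would correspond to at least 100-234 client instances with full pools. If the real number of client instances is much smaller, the connections are probably coming from somewhere other than full Java pools.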

Server config:

service {
  user root
  group root
  paxos-single-replica-limit 1
  pidfile /var/run/aerospike/
  transaction-threads-per-queue 4
  proto-fd-max 100000
  auto-pin cpu
}

network {
  service {
    address bond0
#    address enp94s0f1
    port 3000
    access-address bond0
#    alternate-access-address enp94s0f1
  }

  heartbeat {
    mode mesh
    port 3002

    mesh-seed-address-port 10.x.x.x9 3002
    mesh-seed-address-port 10.x.x.x0 3002
    mesh-seed-address-port 10.x.x.x1 3002
    mesh-seed-address-port 10.x.x.x2 3002

    interval 150
    timeout 10
  }
}

namespace store {
  replication-factor 2
  memory-size 576G
  default-ttl 0
  high-water-disk-pct 70 
  high-water-memory-pct 95
  stop-writes-pct 98
  partition-tree-sprigs 8192

  storage-engine device {
    file /opt/aerospike/data/
    filesize 1000G
    data-in-memory true