Aerospike is not scaling horizontally

We are using Aerospike to store a single bin, which is a hash map. Our Java client (3.1.5) hits the Aerospike server (3.5.9) with a batch of 20 keys per request.

We set up 100 Java client machines and 20 Aerospike server machines. The data set is of size 20 million, but we are still not able to scale beyond 120K requests/sec. We need to find the bottleneck. We are not reaching the network limit (checked via iperf), our clients are able to generate more load, and server-side CPU and memory utilisation is under 10%. Still, we are not able to get more QPS from this setup.

Could you help us figure out the root cause?

Server: 20 cores, 53 GB (in memory), 458 GB (SSD), 2000 Mbps (network bandwidth)

Java client: 14 cores, 24 GB (in memory), 1400 Mbps (network bandwidth)
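For a rough sense of what these numbers imply, here is a back-of-envelope sketch. It rests on assumptions that are not confirmed anywhere in this post: that the 120K figure counts batch requests (not individual keys), that keys hash uniformly across the 20 nodes, and that each batch is split into one sub-request per node that owns at least one of its keys. The class and method names are made up for illustration.

```java
public class BatchFanOut {

    // Expected number of distinct nodes that one batch touches,
    // ASSUMING keys are spread uniformly at random across the nodes.
    static double expectedNodesPerBatch(int nodes, int keysPerBatch) {
        double missOneNode = Math.pow(1.0 - 1.0 / nodes, keysPerBatch);
        return nodes * (1.0 - missOneNode);
    }

    public static void main(String[] args) {
        int clients = 100;          // from the post
        int servers = 20;           // from the post
        int keysPerBatch = 20;      // from the post
        double batchesPerSec = 120_000; // ASSUMPTION: the 120K figure counts batch requests

        double perClient = batchesPerSec / clients;
        double fanOut = expectedNodesPerBatch(servers, keysPerBatch);
        double subRequestsPerServer = batchesPerSec * fanOut / servers;
        double keysPerSubRequest = keysPerBatch / fanOut;

        System.out.printf("batches/sec per client:        %.0f%n", perClient);
        System.out.printf("nodes touched per batch:       %.2f%n", fanOut);
        System.out.printf("sub-requests/sec per server:   %.0f%n", subRequestsPerServer);
        System.out.printf("keys per sub-request (approx): %.2f%n", keysPerSubRequest);
    }
}
```

If those assumptions hold, each 20-key batch fans out to most of the cluster and each node handles only one or two keys per sub-request, so adding nodes would add per-request overhead rather than spread a fixed amount of work. Whether this is what actually limits the setup is something the numbers alone cannot confirm.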

Aerospike.conf is provided below.
service {
	user root
	service-threads 20
	transaction-queues 20
	transaction-threads-per-queue 3
	proto-fd-max 50000
	nsup-period 43200
	migrate-threads 4
}
logging {
	file /var/log/aerospike/aerospike.log {
		context any info
	}
}
network {
	service {
		address any
		port 3000
	}
	heartbeat {
		mode mesh
		port 3002
		# All 20 machines are mentioned here
		interval 150
		timeout 10
	}
	fabric {
		port 3001
	}
	info {
		port 3003
	}
}
namespace mapping {
	replication-factor 3
	memory-size 40G
	single-bin true
	storage-engine device {
		file /storage/aerospike/mapper.dat
		filesize 250G
		data-in-memory true
	}
	stop-writes-pct 70	
	high-water-memory-pct 80
	high-water-disk-pct 80	
}

I am guessing at a lot of variables in your setup, but here are a few probable reasons:

-samir