java.net.SocketException: Connection reset, Error -8 on large set query

Hi,

I sometimes get a Connection reset error while querying a large set. The query filters on a secondary index whose value is always 1, so it should return approximately 1.5M records (every record in the set). The cluster runs server version 4.5.0.9 (8 nodes) and the Java client is version 4.4.9 (about 30 clients run this query every hour).

2020-05-20 12:42:50.739 [ERROR] [main] : Exception in reloading value cache
com.aerospike.client.AerospikeException$Connection: Error -8,1,210000,0,5,BB912D1B96B1FAC 10.15.20.75 3000: java.net.SocketException: Connection reset
  at com.aerospike.client.command.SyncCommand.executeCommand(SyncCommand.java:168) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.SyncCommand.execute(SyncCommand.java:75) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.MultiCommand.executeAndValidate(MultiCommand.java:98) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.query.QueryExecutor$QueryThread.run(QueryExecutor.java:134) ~[aerospike-client-4.4.9.jar:?]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
  at java.lang.Thread.run(Thread.java:834) ~[?:?]
Caused by: java.net.SocketException: Connection reset
  at java.net.SocketInputStream.read(SocketInputStream.java:186) ~[?:?]
  at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
  at com.aerospike.client.cluster.Connection.readFully(Connection.java:273) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.MultiCommand.parseResult(MultiCommand.java:153) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.SyncCommand.executeCommand(SyncCommand.java:114) ~[aerospike-client-4.4.9.jar:?]
  ... 6 more
2020-05-20 12:42:50.740 [ERROR] [main] : Exception:
com.aerospike.client.AerospikeException$Connection: Error -8,1,210000,0,5,BB912D1B96B1FAC 10.15.20.75 3000: java.net.SocketException: Connection reset
  at com.aerospike.client.command.SyncCommand.executeCommand(SyncCommand.java:168) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.SyncCommand.execute(SyncCommand.java:75) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.MultiCommand.executeAndValidate(MultiCommand.java:98) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.query.QueryExecutor$QueryThread.run(QueryExecutor.java:134) ~[aerospike-client-4.4.9.jar:?]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
  at java.lang.Thread.run(Thread.java:834) ~[?:?]
Caused by: java.net.SocketException: Connection reset
  at java.net.SocketInputStream.read(SocketInputStream.java:186) ~[?:?]
  at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
  at com.aerospike.client.cluster.Connection.readFully(Connection.java:273) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.MultiCommand.parseResult(MultiCommand.java:153) ~[aerospike-client-4.4.9.jar:?]
  at com.aerospike.client.command.SyncCommand.executeCommand(SyncCommand.java:114) ~[aerospike-client-4.4.9.jar:?]
  ... 6 more
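
To compare occurrences of this error more easily, the header of the exception message can be split into fields. The sketch below is only an illustration: the class name `ErrorHeader` is made up, and the field order (result code, iteration, socketTimeout, totalTimeout, maxRetries, node) is an assumption about how the client formats the message, not something confirmed in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Aerospike client.
final class ErrorHeader {
    // ASSUMED layout (unconfirmed):
    // Error <resultCode>,<iteration>,<socketTimeout>,<totalTimeout>,<maxRetries>,<node>: <message>
    private static final Pattern HEADER = Pattern.compile(
        "Error (-?\\d+),(\\d+),(\\d+),(\\d+),(\\d+),([^:]+): (.*)");

    final int resultCode;
    final int iteration;
    final int socketTimeoutMs;
    final int totalTimeoutMs;
    final int maxRetries;
    final String node;
    final String message;

    private ErrorHeader(Matcher m) {
        resultCode = Integer.parseInt(m.group(1));
        iteration = Integer.parseInt(m.group(2));
        socketTimeoutMs = Integer.parseInt(m.group(3));
        totalTimeoutMs = Integer.parseInt(m.group(4));
        maxRetries = Integer.parseInt(m.group(5));
        node = m.group(6).trim();
        message = m.group(7);
    }

    // Finds the "Error ..." header anywhere in the log line and splits it into fields.
    static ErrorHeader parse(String logLine) {
        Matcher m = HEADER.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("No error header found in: " + logLine);
        }
        return new ErrorHeader(m);
    }

    public static void main(String[] args) {
        ErrorHeader h = parse("com.aerospike.client.AerospikeException$Connection: "
            + "Error -8,1,210000,0,5,BB912D1B96B1FAC 10.15.20.75 3000: "
            + "java.net.SocketException: Connection reset");
        System.out.println("resultCode=" + h.resultCode
            + " iteration=" + h.iteration
            + " socketTimeoutMs=" + h.socketTimeoutMs
            + " totalTimeoutMs=" + h.totalTimeoutMs
            + " maxRetries=" + h.maxRetries
            + " node=" + h.node);
    }
}
```

If that assumed layout is right, the message above would read as result code -8, iteration 1, socketTimeout 210000 ms, totalTimeout 0 and maxRetries 5, reported for node BB912D1B96B1FAC at 10.15.20.75 port 3000.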

Client policy:

  public static ClientPolicy getClientPolicy(int eventLoopCount) {
    EventPolicy eventPolicy = new EventPolicy();
    eventPolicy.minTimeout = 5;
    EventLoopGroup group = new EpollEventLoopGroup(eventLoopCount);

    EventLoops eventLoops = new NettyEventLoops(eventPolicy, group);
    ClientPolicy clientPolicy = new ClientPolicy();
    clientPolicy.eventLoops = eventLoops;
    clientPolicy.maxConnsPerNode = 9000;
    
    BatchPolicy batchPolicy = new BatchPolicy();
    batchPolicy.socketTimeout = 10; // 10 ms
    batchPolicy.totalTimeout = 30; // 30 ms
    batchPolicy.maxRetries = 1; // at most 1 retry, so 2 attempts in total

    // maxRetries is not set for queries; default for write/query/scan: 0 (no retries)
    QueryPolicy queryPolicy = new QueryPolicy();
    queryPolicy.compress = true;
    queryPolicy.recordQueueSize = 10000;
    queryPolicy.socketTimeout = 210000; // 210 s

    clientPolicy.queryPolicyDefault = queryPolicy;
    clientPolicy.batchPolicyDefault = batchPolicy;
    
    return clientPolicy;
  }
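
A possible stopgap, until the cause of the reset is known, is to retry the hourly reload when the failure traces back to a `java.net.SocketException`. This is a minimal sketch in plain Java and does not address the root cause. `RetryOnReset` and `callWithRetry` are made-up names, and it assumes the whole reload (running the query and reading all of its records) can be wrapped in a single task that is safe to run again from the start.

```java
import java.net.SocketException;
import java.util.function.Supplier;

// Hypothetical helper, not part of the Aerospike client.
final class RetryOnReset {
    // Returns true if the exception or any of its causes is a SocketException.
    static boolean causedBySocketException(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketException) {
                return true;
            }
        }
        return false;
    }

    // Runs the task up to maxAttempts times, sleeping backoffMillis * attempt
    // between attempts. Only failures caused by a SocketException are retried;
    // any other exception is rethrown immediately.
    static <T> T callWithRetry(Supplier<T> task, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                if (!causedBySocketException(e)) {
                    throw e;
                }
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag set
                    throw last;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for the real reload: fails twice with a reset, then succeeds.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException(new SocketException("Connection reset"));
            }
            return "loaded";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Records read before a failed attempt would have to be discarded, since each retry starts the query over; with about 1.5M records per run that makes every retry expensive, so this only hides the problem rather than explaining the reset.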

How can I solve it?
