Slow queryAggreate with multithreading

aggregation
udf
java
query

#1

We’re creating a Java client to write data directly into memory in Aerospike, and another Java client to read data from memory. Both clients are multi-threaded.

There are several queryAggregate operations, which was implemented in UDF, inside our read client.

We’re facing one issue as below:

If we allocate 1 thread only for write operation, and 2 threads for read operation, then we have ~25K TPS for reading.

If we allocate 2 threads for write operation, keeping the same number of threads for read operation, then we have only less than 10K TPS for reading.

The Aerospike server is running in a machine which has 24 physical CPU cores. Both writing and reading clients are running at the same time on this machine. The server is almost running Aerospike server only. CPU resource is totally free.

Below is our current Aerospike server configuration:

paxos-single-replica-limit=1;pidfile=null;proto-fd-max=15000;advertise-ipv6=false;auto-pin=none;batch-threads=4;batch-max-buffers-per-queue=255;batch-max-requests=5000;batch-max-unused-buffers=256;batch-priority=200;batch-index-threads=24;clock-skew-max-ms=1000;cluster-name=null;enable-benchmarks-fabric=false;enable-benchmarks-svc=false;enable-hist-info=false;hist-track-back=300;hist-track-slice=10;hist-track-thresholds=null;info-threads=16;log-local-time=false;migrate-max-num-incoming=4;migrate-threads=1;min-cluster-size=1;node-id-interface=null;nsup-delete-sleep=100;nsup-period=120;nsup-startup-evict=true;proto-fd-idle-ms=60000;proto-slow-netio-sleep-ms=1;query-batch-size=100;query-buf-size=2097152;query-bufpool-size=256;query-in-transaction-thread=false;query-long-q-max-size=500;query-microbenchmark=false;query-pre-reserve-partitions=false;query-priority=10;query-priority-sleep-us=1;query-rec-count-bound=18446744073709551615;query-req-in-query-thread=false;query-req-max-inflight=100;query-short-q-max-size=500;query-threads=6;query-threshold=10;query-untracked-time-ms=1000;query-worker-threads=15;run-as-daemon=true;scan-max-active=100;scan-max-done=100;scan-max-udf-transactions=32;scan-threads=4;service-threads=24;sindex-builder-threads=4;sindex-gc-max-rate=50000;sindex-gc-period=10;ticker-interval=10;transaction-max-ms=1000;transaction-pending-limit=20;transaction-queues=4;transaction-retry-ms=1002;transaction-threads-per-queue=4;work-directory=/opt/aerospike;debug-allocations=none;fabric-dump-msgs=false;max-msgs-per-type=-1;prole-extra-ttl=0;service.port=3000;service.address=any;service.access-port=0;service.alternate-access-port=0;service.tls-port=0;service.tls-access-port=0;service.tls-alternate-access-port=0;service.tls-name=null;heartbeat.mode=multicast;heartbeat.multicast-group=239.1.99.222;heartbeat.port=9918;heartbeat.interval=150;heartbeat.timeout=10;heartbeat.mtu=1500;heartbeat.protocol=v3;fabric.port=3001;fabric.tls-port=0;fabric.tls-name=null;fabric.channel-bulk-fds=2;fabric.channel-bulk-recv-threads=4;fabric.channel-ctrl-fds=1;fabric.channel-ctrl-recv-threads=4;fabric.channel-meta-fds=1;fabric.channel-meta-recv-threads=4;fabric.channel-rw-fds=8;fabric.channel-rw-recv-threads=16;fabric.keepalive-enabled=true;fabric.keepalive-intvl=1;fabric.keepalive-probes=10;fabric.keepalive-time=1;fabric.latency-max-ms=5;fabric.recv-rearm-threshold=1024;fabric.send-threads=8;info.port=3003;enable-security=false;privilege-refresh-period=300;report-authentication-sinks=0;report-data-op-sinks=0;report-sys-admin-sinks=0;report-user-admin-sinks=0;report-violation-sinks=0;syslog-local=-1

Below is aerospike.conf file:

# Aerospike database configuration file for use with systemd.

service {
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    proto-fd-max 15000
}

logging {
    console {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        mode multicast
        multicast-group 239.1.99.222
        port 9918

        # To use unicast-mesh heartbeats, remove the 3 lines above, and see
        # aerospike_mesh.conf for alternative.

        interval 150
        timeout 10
    }

    fabric {
        port 3001
    }

    info {
        port 3003
    }
}

namespace test {
    replication-factor 2
    memory-size 4G
    default-ttl 30d # 30 days, use 0 to never expire/evict.

    storage-engine memory
}

namespace bar {
    replication-factor 2
    memory-size 4G
    default-ttl 30d # 30 days, use 0 to never expire/evict.

    storage-engine memory

    # To use file storage backing, comment out the line above and use the
    # following lines instead.
#   storage-engine device {
#       file /opt/aerospike/data/bar.dat
#       filesize 16G
#       data-in-memory true # Store data in memory in addition to file.
#   }
}

Could someone please let us know where our current bottleneck is? How we can increase the reading speed when increasing the number of writing threads?

The above configuration is default, we didn’t change anything yet.


#2

What are the corresponding TPS for writes for the two cases?


#3

Hi pgupta,

Case 1, write speed is ~ 25K

Case 2, write speed is ~ 50K

Please let me know if you need more information on this. Tks.


#4

Finally we found the answer here

Thanks pgupta for your time.