Profiling / Optimizing Aerospike Batch Reads

I have some questions about profiling and tuning Aerospike.

Right now I have the following data model, which emulates an inverted index as used in information retrieval.

Cluster ID (Key, Integer) | Document IDs (Map of String : Double)
1 | abc : 1.0, aec : 12.4, yufss : 14.09
2 | efd : 22.9, erf : 13.6, abc : 87.9
...
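To make the layout concrete, here is a minimal Python sketch of the same structure as plain dictionaries. It is only an illustration of the shape of the data, not Aerospike client code; the names `index` and `clusters_containing` are made up for this example.

```python
from typing import Dict, List

# Illustrative stand-in for the set: cluster ID -> {document ID: value}.
# The real data lives in Aerospike; this only mirrors the shape of a row.
InvertedIndex = Dict[int, Dict[str, float]]

index: InvertedIndex = {
    1: {"abc": 1.0, "aec": 12.4, "yufss": 14.09},
    2: {"efd": 22.9, "erf": 13.6, "abc": 87.9},
}


def clusters_containing(idx: InvertedIndex, doc_id: str) -> List[int]:
    """Return the cluster IDs whose map contains doc_id, in ascending order."""
    return sorted(cid for cid, docs in idx.items() if doc_id in docs)


print(clusters_containing(index, "abc"))
```

A batch read in this picture corresponds to fetching the inner maps for about 1000 of the outer keys at once.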

The total number of rows is fixed at 1,048,576.
Each row can potentially grow quite large, to around 10k entries.

We are currently suffering from a database performance issue when performing batch reads against this set. Typically we perform a batch read of around 1000 rows and then do further processing on the doc IDs contained in those rows.

A batch read of that size takes up to 2000 ms to complete; we want to keep the latency consistently below 300 ms. In addition, the reads show very large jitter, which is not acceptable for our use case.
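For a sense of scale, here is a rough back-of-envelope estimate of how much data one such batch could move in the worst case. The per-entry sizes and the link speed are assumptions made for illustration (short doc IDs of about 5 bytes, 8-byte doubles, no serialization overhead, a 10 Gbps link); none of them are measured values from our cluster.

```python
def batch_payload_bytes(rows: int, entries_per_row: int,
                        avg_key_bytes: int, value_bytes: int = 8,
                        per_entry_overhead: int = 0) -> int:
    """Approximate raw payload of a batch read, ignoring protocol framing."""
    return rows * entries_per_row * (avg_key_bytes + value_bytes + per_entry_overhead)


def transfer_ms(payload_bytes: int, link_gbps: float) -> float:
    """Time in milliseconds to push payload_bytes over a link of link_gbps."""
    return payload_bytes * 8 / (link_gbps * 1e9) * 1000.0


# Assumed worst case: 1000 rows, each at the 10k-entry limit.
payload = batch_payload_bytes(rows=1000, entries_per_row=10_000, avg_key_bytes=5)
print(f"payload: {payload / 1e6:.0f} MB")
print(f"transfer at 10 Gbps (assumed): {transfer_ms(payload, 10):.0f} ms")
print(f"transfer at 1 Gbps (assumed): {transfer_ms(payload, 1):.0f} ms")
```

Under these assumptions a worst-case batch is on the order of 130 MB, so wire time alone could take a large share of the 300 ms budget. Typical rows may be much smaller than the 10k limit.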

Configuration I have so far:

service {
    user root
    group root
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    pidfile /var/run/aerospike/asd.pid
    service-threads 16
    transaction-queues 16
    transaction-threads-per-queue 3
    proto-fd-max 15000
    batch-threads 12
    batch-max-requests 10000
}

namespace iq {
    ldt-enabled true
    replication-factor 2
    memory-size 80G
    default-ttl 30d # 30 days, use 0 to never expire/evict.

    storage-engine device {
            device /dev/xvdb
            write-block-size 1024K
            data-in-memory true
    }
}

The cluster has 3 nodes with replication factor 2; each node is a 16-CPU machine with 120 GB RAM. Thread affinity is configured using taskset, and IRQ settings are used to spread the NIC load.

Questions:

  1. Is it achievable with Aerospike to keep the batch read latency below 300 ms, and if so, what steps are needed to get there?

  2. How does a batch read work internally in Aerospike, and what factors affect its performance?

  3. How can the latency of a batch read be broken down for monitoring, for example into read processing time on the server and data transfer time to the client?
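As a client-side starting point for question 3, the sketch below times the whole batch call repeatedly and reports percentiles, which at least quantifies the jitter. It cannot split server processing from network transfer. `read_batch` is a placeholder for whatever function issues the real batch read, and `fake_read_batch` is a stand-in so the example runs without a cluster.

```python
import math
import time
from typing import Callable, Dict, List, Sequence


def percentile(sorted_samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_samples)))
    return sorted_samples[rank - 1]


def time_batch_reads(read_batch: Callable[[List[int]], object],
                     keys: List[int], repeats: int = 20) -> Dict[str, float]:
    """Call read_batch(keys) `repeats` times and summarize latency in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        read_batch(keys)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "max": samples[-1],
    }


def fake_read_batch(keys: List[int]) -> Dict[int, dict]:
    """Stand-in for a real batch read; returns an empty map per key."""
    return {k: {} for k in keys}


stats = time_batch_reads(fake_read_batch, list(range(1000)))
print({name: round(ms, 3) for name, ms in stats.items()})
```

Comparing these client-side numbers against whatever the server itself reports for batch reads would show roughly how much of the total is spent outside the server.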