Profiling / Optimizing Aerospike Batch Reads

lzuwei · October 2, 2015, 9:07am

I have some questions about profiling and tuning Aerospike.

I currently have the following data model, which emulates an inverted index as used in information retrieval.

Cluster ID (key, integer) | Document IDs (map of string : double)
1 | abc : 1.0, aec : 12.4, yufss : 14.09
2 | efd : 22.9, erf : 13.6, abc : 87.9
...

The total number of rows is fixed at 1048576.
Each row can potentially grow quite large, to around 10k entries.
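
For a rough sense of scale, the Python sketch below estimates how much map data a single batch read could return under this model. The row and entry counts come from this post; the average document ID length (5 bytes) and the 1 Gbit/s link speed are assumptions for illustration only, and serialization overhead is ignored.

```python
# Back-of-the-envelope payload estimate for one batch read.
# Only ROWS_PER_BATCH and ENTRIES_PER_ROW come from the post;
# the other constants are assumptions.

ROWS_PER_BATCH = 1000             # typical batch size from the post
ENTRIES_PER_ROW = 10_000          # worst-case map size from the post
DOC_ID_BYTES = 5                  # assumed average document ID length
SCORE_BYTES = 8                   # a double is 8 bytes
LINK_BYTES_PER_SEC = 125_000_000  # assumed 1 Gbit/s link


def batch_payload_bytes(rows, entries_per_row, key_bytes, value_bytes):
    """Lower bound on the raw map data returned by one batch read."""
    return rows * entries_per_row * (key_bytes + value_bytes)


def transfer_ms(payload_bytes, link_bytes_per_sec):
    """Time to move the payload over the link, ignoring protocol overhead."""
    return payload_bytes / link_bytes_per_sec * 1000.0


payload = batch_payload_bytes(
    ROWS_PER_BATCH, ENTRIES_PER_ROW, DOC_ID_BYTES, SCORE_BYTES
)
print(f"payload: {payload / 1e6:.0f} MB")
print(f"transfer time: {transfer_ms(payload, LINK_BYTES_PER_SEC):.0f} ms")
```

Under these assumptions a worst-case batch carries about 130 MB, which would take roughly one second to move over a 1 Gbit/s link before any server-side work is counted.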

We are currently suffering from database performance issues when performing batch reads against this set. Typically we perform a batch read of around 1000 rows and then do further processing on the document IDs contained in those rows.

A batch read of this size takes up to 2000 ms to complete; we want to keep the latency consistently below 300 ms. In addition, the read latency has very large jitter, which is not acceptable for our use case.
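
To put numbers on the latency and jitter described above, a client-side timing harness along the following lines could be used. This is a minimal sketch using only the Python standard library: `fake_batch_read` is a hypothetical stand-in for the real batch call, and the 99th percentile is computed with the simple nearest-rank method.

```python
import math
import random
import statistics
import time


def time_calls(fn, runs=50):
    """Call fn() `runs` times and return each wall-clock latency in ms."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Return the median, nearest-rank 99th percentile, and max latency."""
    ordered = sorted(latencies_ms)
    p99_index = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return {
        "p50": statistics.median(ordered),
        "p99": ordered[p99_index],
        "max": ordered[-1],
    }


def fake_batch_read():
    """Hypothetical stand-in; replace with the real client batch call."""
    time.sleep(random.uniform(0.001, 0.003))


stats = summarize(time_calls(fake_batch_read, runs=20))
print({name: round(value, 2) for name, value in stats.items()})
```

Comparing the p50 and p99 values across runs gives a concrete measure of the jitter, and the gap between them can be checked against the 300 ms target.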

Configuration I have so far:

service {
    user root
    group root
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    pidfile /var/run/aerospike/asd.pid
    service-threads 16
    transaction-queues 16
    transaction-threads-per-queue 3
    proto-fd-max 15000
    batch-threads 12
    batch-max-requests 10000
}

namespace iq {
    ldt-enabled true
    replication-factor 2
    memory-size 80G
    default-ttl 30d # 30 days, use 0 to never expire/evict.

    storage-engine device {
        device /dev/xvdb
        write-block-size 1024K
        data-in-memory true
    }
}

The cluster has 3 nodes with replication factor 2; each node is a 16-CPU machine with 120 GB RAM. Thread affinity is configured using taskset, and the IRQ configuration spreads the NIC load.
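
As a quick sanity check on the service settings above, this Python sketch adds up the configured thread counts and compares them with the 16 CPUs per node. It assumes that the number of transaction threads is transaction-queues multiplied by transaction-threads-per-queue, and it does not count any other threads the server starts.

```python
# Thread-related values copied from the service block above.
service_config = {
    "service-threads": 16,
    "transaction-queues": 16,
    "transaction-threads-per-queue": 3,
    "batch-threads": 12,
}
CPUS_PER_NODE = 16  # from the hardware description in the post


def configured_threads(cfg):
    """Break down configured worker threads by pool."""
    return {
        "service": cfg["service-threads"],
        # Assumption: one pool of threads per transaction queue.
        "transaction": cfg["transaction-queues"]
        * cfg["transaction-threads-per-queue"],
        "batch": cfg["batch-threads"],
    }


pools = configured_threads(service_config)
total = sum(pools.values())
print(pools)
print(f"{total} threads on {CPUS_PER_NODE} CPUs "
      f"({total / CPUS_PER_NODE:.2f} per CPU)")
```

With these values the total comes to 76 configured threads on 16 CPUs. Whether that level of oversubscription contributes to the jitter is something to measure rather than assume.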

Questions:

Is it achievable with Aerospike to keep the batch read latency below 300 ms, and if so, what steps are needed to get there?
How does a batch read work internally in Aerospike, and what considerations affect its performance?
How can the latency of a batch read be broken down for monitoring, for example into server-side read processing time and data transfer time to the client?
