AeroSpark read performance

Vadim_Dobroskok · August 10, 2016, 7:38am

I am using the latest AeroSpark connector to work with Spark ML. After inserting around 60M records into Aerospike, read operations became very slow. For example, fetching around 500K records from a set that contains 60M records takes AeroSpark ~30 minutes. In the htop output, Aerospike uses only 7% of CPU.
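
To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes exactly 500,000 records and exactly 30 minutes, while both numbers above are approximate, and the variable names are just for illustration.

```python
# Rough read-throughput estimate from the approximate figures in this post.
records_fetched = 500_000      # assumed: ~500K records read
elapsed_seconds = 30 * 60      # assumed: ~30 minutes of wall-clock time

# Average throughput and average time spent per record.
records_per_second = records_fetched / elapsed_seconds
ms_per_record = 1000 * elapsed_seconds / records_fetched

print(f"{records_per_second:.0f} records/s, {ms_per_record:.1f} ms per record")
```

That works out to roughly 278 records per second, or about 3.6 ms per record, which is why I suspect the reads are being issued serially rather than in parallel.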

How can I speed up read operations? AeroSpark seems to be working on only one thread; how can I parallelize this job? Any suggestions?

Aerospike conf:

memory-size 8G                      # Maximum memory allocation for data and
default-ttl 30d

storage-engine device {             # Configure the storage-engine to use
    file /vol/rmla.data             # Location of data file on server.
    filesize 900G                   # Max size of each file in GiB.
}
