UDF performance

jiten · June 19, 2015, 8:40pm

Hi,

We're evaluating Aerospike as one of our candidate in-memory NoSQL solutions. The primary goals are very low latency and a very high ingestion rate.

Environment
   . Aerospike 3.5.x server installed on a single machine
   . CentOS, 20 cores with hyper-threading enabled, so 40 virtual cores
   . 125 GB memory
   . This is a dedicated "bare metal" machine for Aerospike.

We ingested 2 million records on this instance (each record has about 7 bins, mostly integers plus a couple of strings).
We're using the count UDF described on your website to get a row count. Executing a query like "aggregate count() on <ns>.<set>" takes 1.4 seconds as measured in aql. While this query is executing, top shows that only 3 or 4 of the 40 cores are in use, with utilization around 45%. The rest are idle, and most of the memory on this box is also free.
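For readers unfamiliar with the shape of that aggregation: a count over a stream is a map step that emits 1 per record followed by a reduce step that sums the results. The following is a minimal Python model of that shape only. It is not the Lua UDF from the website and does not call any Aerospike API; the names `one`, `add`, `count` and the sample records are illustrative assumptions.

```python
from functools import reduce


def one(record):
    # Map step: every record contributes exactly 1 to the count.
    return 1


def add(a, b):
    # Reduce step: combine two partial counts.
    return a + b


def count(stream):
    # Hypothetical stand-in for a stream aggregation: map, then reduce.
    # Every record in the stream is visited once, so the work grows
    # linearly with the number of records.
    return reduce(add, map(one, stream), 0)


# Made-up sample records; the real set holds about 2 million.
records = [{"id": i, "name": "r%d" % i} for i in range(2000)]
total = count(records)
print(total)
```

The point of the model is that the result depends on touching every record, so the elapsed time depends on how the per-record work is spread across threads rather than on total memory.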

The question is: what is preventing us from getting a millisecond response time for this minuscule amount of data when CPU and memory are obviously not limiting factors? Do we need to tweak any other configuration parameters?

These results were captured after executing the "afterburner.sh" script.

ManuelSchmidt · June 20, 2015, 9:47am

Not sure if the following applies, but I remember seeing a Lua instance pool size set to about 5 instances. I'm not sure whether it was a client-side or server-side setting. Anyway, if you only want counters I would not use UDFs. I had a similar use case where I needed a count of something and ended up implementing it in application logic using only Aerospike's key-value functionality (incrementing a counter record on inserts, decrementing on deletes), which gives me a sub-millisecond response no matter how many records there are (O(1)). Aggregation counts get slower with every new entry (O(n)), and I think you shouldn't consider any stream UDF 'realtime', as they may take a few seconds or even longer.
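The counter-record idea can be sketched as follows. This is a minimal sketch using an in-memory Python dictionary as a stand-in for the store, not the Aerospike client API; the class `CountedStore` and its method names are hypothetical. In a real deployment the record write and the counter update would be two separate operations, so the count could drift if one of them fails.

```python
class CountedStore:
    """In-memory stand-in for a key-value store plus a counter record."""

    def __init__(self):
        self._records = {}
        self._count = 0  # plays the role of the dedicated counter record

    def insert(self, key, bins):
        # Increment only for new keys, so overwrites do not inflate the count.
        if key not in self._records:
            self._count += 1
        self._records[key] = bins

    def delete(self, key):
        # Decrement only when a record was actually removed.
        if key in self._records:
            del self._records[key]
            self._count -= 1

    def count(self):
        # O(1): a single read of the counter, independent of set size.
        return self._count

    def scan_count(self):
        # O(n): visits every record, like an aggregation over the whole set.
        return sum(1 for _ in self._records)


store = CountedStore()
store.insert("a", {"v": 1})
store.insert("b", {"v": 2})
store.insert("a", {"v": 3})  # overwrite, count unchanged
store.delete("b")
print(store.count(), store.scan_count())
```

Both methods return the same number; the difference is that `count` does a constant amount of work while `scan_count` does work proportional to the number of records.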

rbotzer · June 20, 2015, 2:24pm

Unrelated to UDFs, make sure your transaction-queues and service-threads are configured to the number of cores as described in the configuration reference.
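As a rough illustration of that advice, the sketch below derives both values from the machine's logical core count and prints them as a configuration stanza. The helper name `render_service_stanza` is hypothetical, and placing both parameters in the `service` context is an assumption about the 3.5.x layout; check it against the configuration reference before using it.

```python
import os


def render_service_stanza(cores):
    # Assumption: both parameters live in the "service" context and are
    # set to the number of logical cores, as suggested above.
    lines = [
        "service {",
        "    service-threads %d" % cores,
        "    transaction-queues %d" % cores,
        "}",
    ]
    return "\n".join(lines)


# os.cpu_count() reports logical cores (40 on the machine described above);
# it can return None, so fall back to 1.
cores = os.cpu_count() or 1
print(render_service_stanza(cores))
```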

As Manuel described, UDFs add functionality on top of the key-value operations, using Lua scripts. The server includes LuaJIT, which should provide better performance for repeated calls, but still not on the same order of speed as the core key-value operations. If you run the same command through AQL several times in a row, does the execution time improve?

jiten · June 20, 2015, 6:50pm

Running the same command multiple times in AQL doesn’t help. We’ll configure transaction-queues and service-threads as suggested and keep you posted.

Thanks for your suggestion and comments.

jiten · June 22, 2015, 8:54pm

Hi,

The transaction-queues and service-threads parameters were already configured to the number of cores. It still uses only 4 cores for some reason.

Thanks
