We're evaluating Aerospike as one of the in-memory NoSQL solutions. The primary goal is to have very low latency and very high ingestion rate.
Environment
. Aerospike 3.5.x server installed on a single machine
. CentOS, 20 cores with hyper-threading enabled, so 40 virtual cores
. 125 GB of memory
. This is a dedicated "bare metal" machine for Aerospike.
We ingested 2 million records on this instance. Each record has about 7 bins, mostly integers plus a couple of strings.
We're using the count UDF described on your website to get a row count. Executing a query like "aggregate count() on <ns>.<set>" takes 1.4 seconds, as measured in aql. While this query is executing, top shows only 3 or 4 of the 40 cores in use, at around 45% utilization. The rest are idle. Most of the memory on this box is also free.
The question is: what is preventing us from getting a millisecond response time on this minuscule data set when CPU and memory are obviously not limiting factors? Do we need to tweak any other configuration parameter?
These results were captured after running the "afterburner.sh" script.
Not sure if the following applies, but I remember seeing a Lua instance pool size set to about 5 instances. I'm not sure whether it was a client-side or server-side configuration. Anyway, if you only want counters, I would not use UDFs. I had a similar use case where I needed a count of something and ended up implementing it in application logic using only Aerospike's key-value functionality (incrementing a counter record on inserts, decrementing on deletes), which gives me sub-millisecond responses no matter how many records there are (O(1)). Aggregation counts will become slower with every new entry, because they have to touch every record (O(n)), and I don't think you should consider any stream UDF 'realtime', as they may take a few seconds or even longer.
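To make the counter-record idea concrete, here is a minimal Python sketch. It does not use the Aerospike client: InMemoryKV is a hypothetical in-memory stand-in for a key-value store, and COUNTER_KEY, insert_record, delete_record, count_via_counter and count_via_scan are names made up for illustration. With a real client you would presumably map increment() onto the client's atomic increment operation. Note that the existence check and the counter update here are separate steps, so against a real cluster the counter could drift under concurrent writers or partial failures.

```python
from functools import reduce


class InMemoryKV:
    """Hypothetical stand-in for a key-value store (not the Aerospike API)."""

    def __init__(self):
        self._records = {}

    def put(self, key, bins):
        self._records[key] = dict(bins)

    def get(self, key):
        return self._records.get(key)

    def exists(self, key):
        return key in self._records

    def remove(self, key):
        # Returns True only if the record actually existed.
        return self._records.pop(key, None) is not None

    def increment(self, key, bin_name, delta):
        # Creates the record/bin on first use, then adds delta to it.
        record = self._records.setdefault(key, {})
        record[bin_name] = record.get(bin_name, 0) + delta
        return record[bin_name]

    def scan(self):
        return list(self._records.items())


COUNTER_KEY = "__row_count__"  # hypothetical key for the counter record


def insert_record(store, key, bins):
    """Write a record and bump the counter only if the key is new."""
    is_new = not store.exists(key)
    store.put(key, bins)
    if is_new:
        store.increment(COUNTER_KEY, "count", 1)


def delete_record(store, key):
    """Delete a record and decrement the counter only if it existed."""
    if store.remove(key):
        store.increment(COUNTER_KEY, "count", -1)


def count_via_counter(store):
    """O(1): a single key lookup, independent of the number of records."""
    record = store.get(COUNTER_KEY)
    return record["count"] if record else 0


def count_via_scan(store):
    """O(n): map every record to 1 and reduce by addition, like a count aggregation."""
    ones = (1 for key, _ in store.scan() if key != COUNTER_KEY)
    return reduce(lambda a, b: a + b, ones, 0)
```

The two count functions return the same number, but count_via_counter reads one record while count_via_scan visits all of them, which is the difference between the two approaches described above.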
Unrelated to UDFs, make sure your transaction-queues and service-threads are configured to the number of cores as described in the configuration reference.
As Manuel described, UDFs are functionality added on top of the key-value operations, using Lua scripts. The server includes LuaJIT, which should provide better performance for repeated calls, but still not on the same order of speed as the core key-value operations. If you run the same command through AQL several times in a row, does the execution time improve?