I think there is some other bottleneck …
– R
Hello, I have the same problem with similar Lua scripts. In addition to the problem described above, the detailed UDF log sometimes contains lines like:
DETAIL (query): (aggr.c: as_aggr_istream_read: 400) No More Nodes for this Lua Call
Earlier in this thread I found similar messages in the log:
DETAIL (query): (aggr.c: as_aggr_istream_read: 406) No More Nodes for this Lua Call
Does anyone know what this is? Is it because the Lua execution runs for a long time while loading only some of the CPUs? Where can I find information about this message?
JekaMas,
I would imagine you are running something like
aql> Aggregate something.something() on test
which is a scan aggregation. We have just identified a regression and I am working on fixing it. Please confirm.
nizsheanez is actually running the aggregation over a secondary index query:
aql> Aggregate something.something() on test where bin = 10
So what we are seeing there is something different.
– R
Thanks for the answer, raj! I am trying queries with and without a secondary index:
aggregate fullsearch.filter_records('active','one') on test.products
aggregate udfActive.filter_records() on test.products WHERE status='active'
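(For context, the secondary index behind the second query was created along these lines; the index name here is just an example:)
aql> CREATE INDEX idx_status ON test.products (status) STRING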
Lua scripts: fullsearch.lua:

function filter_records(stream, status, testbin)
    -- project just the bins we need from each record
    local function map_record(record)
        return map {id = record.id, testBin = record.testBin}
    end
    -- keep only records whose bins match both arguments
    local function filter_name(record)
        return record.testBin == testbin and record.status == status
    end
    return stream : filter(filter_name) : map(map_record)
end
udfActive.lua:

function filter_records(stream)
    -- project just the bins we need from each record
    local function map_record(record)
        return map {id = record.id, testBin = record.testBin}
    end
    -- the status check is done by the secondary index query,
    -- so only the testBin comparison is left in Lua
    local function filter_name(record)
        return record.testBin == "one"
    end
    return stream : filter(filter_name) : map(map_record)
end
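(Both modules were registered before running the queries, roughly like this; the paths are just examples:)
aql> REGISTER MODULE './fullsearch.lua'
aql> REGISTER MODULE './udfActive.lua'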
The behavior of both queries is very similar. Each of them (with and without the secondary index) loads only 1-4 of the 8 processors; the other processors stay idle. The detailed log contains messages like:
DETAIL (query): (aggr.c: as_aggr_istream_read: 400) No More Nodes for this Lua Call
Increasing the number of processors does not decrease the execution time of the requests. I get the same results for 2-8 processors and 1,000,000 records: 0.4s for the query with the secondary index and 1.4s for the query without it.
JekaMas,
Scan aggregation and query aggregation are two different engines on the server side.
What query configuration are you running? You can check with:
asinfo -v 'get-config:' | grep query
You may want to bump up query-threads. Check this out:
http://www.aerospike.com/docs/operations/manage/queries/
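For example, something along these lines should raise it at runtime; the exact context and value here are assumptions, so check them against the doc above:
asinfo -v 'set-config:context=service;query-threads=8'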
– R
raj,
my running configuration:

query-threads=6
query-worker-threads=15
query-priority=10
query-in-transaction-thread=0
query-req-in-query-thread=0
query-req-max-inflight=100
query-bufpool-size=256
query-batch-size=100
query-sleep=1
query-job-tracking=false
query-short-q-max-size=500
query-long-q-max-size=500
query-rec-count-bound=4294967295
query-threshold=10
query-untracked-time=1000000
query-hist-track-back=1800
query-hist-track-slice=10
query-hist-track-thresholds=1,8,64
query_rec_count-hist-track-back=1800
query_rec_count-hist-track-slice=10
query_rec_count-hist-track-thresholds=1,8,64

Pretty standard, I think.
raj, can you describe what this message in the log means:
DETAIL (query): (aggr.c: as_aggr_istream_read: 400) No More Nodes for this Lua Call
JekaMas,
DETAIL (query): (aggr.c: as_aggr_istream_read: 400) No More Nodes for this Lua Call
This simply means that a given batch has been fully consumed, that is it. When an aggregation is run, the system chunks the data up into batches and runs the Lua aggregation code over each batch. The message shows up because you have DETAIL logging enabled.
Btw, why are you running with detail enabled? The default is info. It will spew a lot of data into the log, fill up the log, and also slow down the server …
You have enough query threads. If you disable all the detailed logging, what performance do you see?
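For example, something like this should set the query context back to info; the sink id 0 is an assumption, check the output of asinfo -v 'logs' for your sinks:
asinfo -v 'set-log:id=0;query=info'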
– R
I suggested a similar thing before as well. Can you make the client simply dump the data into /dev/null without processing it … This is to make sure the slowness is not due to the client processing the data slowly. If you are fetching that much data, a client slowdown would eventually throttle back the server …
Btw, what client are you using?
– R
Or aql from the console?
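For example, with aql something like this would discard the output (the query is the one from above):
aql -c "aggregate udfActive.filter_records() on test.products where status='active'" > /dev/null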