How to search for bin values that contain or start with a given value


#1

How do I search for bin values that contain or start with a given search value?


#2

The way to do this (AFAIK) is to use a UDF and the Lua string functions. So: a) set up some sort of primary key to get a subset of your data, and b) write a UDF filter that operates on the stream using Lua’s string.find or similar.
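To make the two kinds of match concrete, here is a minimal sketch in plain JavaScript, run against hypothetical in-memory records rather than a real Aerospike stream. The names `records`, `startsWith`, `contains` and `matchingUsernames` are made up for illustration; inside an actual UDF the same checks would be written in Lua.

```javascript
// Hypothetical records standing in for a subset already narrowed down by key or index.
var records = [
  { username: 'usr22_alice' },
  { username: 'bob_usr22' },
  { username: 'carol' },
  { email: 'no-username@example.com' } // record without a username bin
];

// True when the bin value begins with the search term.
function startsWith(value, term) {
  return typeof value === 'string' && value.indexOf(term) === 0;
}

// True when the search term appears anywhere in the bin value.
function contains(value, term) {
  return typeof value === 'string' && value.indexOf(term) !== -1;
}

// Collect the usernames of all records whose username satisfies the predicate.
function matchingUsernames(recs, term, predicate) {
  return recs
    .filter(function (rec) { return predicate(rec.username, term); })
    .map(function (rec) { return rec.username; });
}

var prefixMatches = matchingUsernames(records, 'usr22', startsWith);
var containsMatches = matchingUsernames(records, 'usr22', contains);
console.log('Starts with: ', prefixMatches);
console.log('Contains: ', containsMatches);
```

One thing to keep in mind when porting this to Lua: string.find treats the search term as a pattern unless its fourth argument (plain) is true, so terms containing characters such as `.` or `%` need either plain matching or escaping.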


#3

Hi amitvengs,

Depending on your application/use case, Andrew’s suggestion is one way of going about it.

If for some reason you cannot first limit the number of records by creating a secondary index on a bin and applying an equality or range filter on it, as of Aerospike Server release 3.4.1 you have the option of running Scan Aggregations coupled with Streaming UDFs. They work very much like Query Aggregations, except that you don’t apply any filter. In other words, a Scan Aggregation scans the entire set, which may be OK for some use cases and not for others. Please use your best judgement.

Here’s some sample code.

Assumptions: There’s a namespace called ‘test’ and within it a set ‘users’ that is populated with user records. Each record has at least one bin, ‘username’, and in some records those values start with ‘usr22’.

Goal: You’d like to retrieve usernames that start with ‘usr22’.

Code using Aerospike Node.js Client v1.0.31 (this can also be done in the latest releases of Java and C# clients):

// Note: UDF registration is done here for convenience. In production env, it should be done via AQL.
client.udfRegister('startsWith.lua', function(err) {
        if ( err.code === aerospike.status.AEROSPIKE_OK ) {
          console.log('startsWith registration complete!');
          var statement = {aggregationUDF: {module: 'startsWith', funcname: 'find', args: ['usr22']}};
          var query = client.query('test', 'users', statement);
          var stream = query.execute();
          stream.on('data', function(result)  {
            var users = result.users.split(",").filter(function(e){return e});
            console.log('Users: ', users);
            console.log('Count: ', users.length);
          });
          stream.on('error', function(err)  {
            console.log('scanAggregate Error: ',err);
          });
          stream.on('end', function()  {
            // console.log('scanAggregate: ', '!done!');
          });
        } else {
          // An error occurred
          console.error('startsWith registration error: ', err);
        }
});

Streaming UDF startsWith.lua:

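local function starts_with(map,rec)
  local name = rec.username
  local prefix = map['chars']
  -- Plain prefix comparison: skips records without a string username bin and
  -- avoids treating characters in the prefix as Lua pattern magic
  if type(name) == 'string' and name:sub(1, #prefix) == prefix then
    map['users'] = map['users'] .. ',' .. name
  end
  -- Return accumulated map
  return map
end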

local function reduce_stats(a,b)
  -- Merge values from map b into a
  a.users = a.users .. b.users
  -- Return updated map 
  return a
end

function find(stream,chars)
  -- Process the incoming record stream with aggregate(), then merge the partial results with reduce()
  --   NOTE: aggregate() accepts two parameters:
  --    1) The initial map, holding the accumulated usernames and the initial chars to match
  --    2) The function starts_with, which is called for each record as it flows in
  -- Return the map produced by the reduce function reduce_stats
  return stream : aggregate(map{users='',chars=chars},starts_with) : reduce(reduce_stats)
end
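To show how the aggregate and reduce steps fit together, here is a rough local simulation in plain JavaScript. It is only a sketch under stated assumptions: `runAggregation` and the `partitions` array are hypothetical stand-ins for the server splitting the scan across nodes, and the match is a plain prefix comparison rather than a Lua pattern.

```javascript
// Counterpart of starts_with: append a matching username to the accumulator.
// Records without a string username are skipped.
function startsWithAggregate(acc, rec) {
  var name = rec.username;
  if (typeof name === 'string' && name.indexOf(acc.chars) === 0) {
    acc.users = acc.users + ',' + name;
  }
  return acc;
}

// Counterpart of reduce_stats: merge two partial accumulators into one.
function reduceStats(a, b) {
  a.users = a.users + b.users;
  return a;
}

// Hypothetical stand-in for the cluster: each inner array holds the records
// one node would scan. Each node aggregates on its own, then results are reduced.
function runAggregation(partitions, chars) {
  return partitions
    .map(function (recs) {
      return recs.reduce(startsWithAggregate, { users: '', chars: chars });
    })
    .reduce(reduceStats);
}

var partitions = [
  [{ username: 'usr221' }, { username: 'alice' }],
  [{ username: 'usr22x' }, { age: 30 }],
  [{ username: 'bob' }]
];

var result = runAggregation(partitions, 'usr22');
// Same post-processing as the client code: split on commas, drop empty entries.
var users = result.users.split(',').filter(function (e) { return e; });
console.log('Raw: ', result.users);
console.log('Users: ', users);
```

Because every match is appended with a leading comma, partial results can simply be concatenated in the reduce step, and the empty first entry is what the filter on the client side removes.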

Let me know if you need clarification on any of the details.

For more on Scan Aggregations, click here

For more on Streaming UDFs, click here

For more on Aggregations, click here

I hope this helps.

