Search Keys by regex match

index
secondary
query

#1

Hello all, I’m new to Aerospike. We originally used Redis as our distributed cache, but its cluster solution is poor because of its single-threaded architecture, so our director wanted to try Aerospike. Our application has a DeleteByRegex method that deletes cache items whose keys match a given regex, and I must implement this behaviour for Aerospike. I have read about the “scan method” and about “creating a second bin with the key data, then creating a secondary index on it”, but I’m not sure which to use. How can I find the keys that match a given regex and then use those keys to delete the items?

Additional data: We run Aerospike in RAM and don’t need to persist data to a file. We must use a cluster architecture (which is cool in Aerospike). DeleteByRegex is one of our core methods, so I need to implement it.

Thanks for your time & help

EDIT: Here is an example. I have the following keys and values:

  • p_1516_price_usd -> 15
  • p_1516_price_try -> 26
  • p_1516_shortDescription_en -> bla bla in English
  • p_1516_shortDescription_es -> bla bla in Spanish

Sometimes I want to delete all info about a product, so I match on “p_1516*”.

Sometimes deleting the prices is enough, so I match on “p_1516_price*”.

That’s my case.


#2

Here is code that scans and deletes keys. The put must set the “sendKey” policy to true so that the user keys come back on the scan; otherwise only the key digest is stored and key.userKey is null. The deletes are performed on a thread pool, which improves performance.

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using Aerospike.Client;

namespace Test
{
    public class KeyScanner
    {
        AerospikeClient client;
        string ns;
        string set;

        public KeyScanner(AerospikeClient client, string ns, string set)
        {
            this.client = client;
            this.ns = ns;
            this.set = set;
        }

        public void WriteKeys()
        {
            WritePolicy policy = new WritePolicy();
            policy.sendKey = true;

            for (int i = 0; i < 100; i++)
            {
                client.Put(policy, new Key(ns, set, "Key" + i), new Bin("x", 0));
            }
        }

        public void ScanKeys()
        {
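            ScanPolicy policy = new ScanPolicy();
            // Only keys are needed to decide what to delete, so skip bin data.
            policy.includeBinData = false;
            // Scan one node at a time, so the callback is not invoked in parallel.
            policy.concurrentNodes = false;
            // Keep the scan from competing with regular traffic.
            policy.priority = Priority.LOW;

            // ScanCallback is invoked once per record in the set.
            client.ScanAll(policy, ns, set, ScanCallback);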
        }

        public void ScanCallback(Key key, Record record)
        {
            if (key.userKey != null)
            {
                string keyString = key.userKey.ToString();

                // Determine if key should be deleted here.
                bool delete = true;

                if (delete)
                {
                    ThreadPool.QueueUserWorkItem(Delete, key);
                }
            }
        }

        private void Delete(object obj)
        {
            client.Delete(null, (Key)obj);
        }
    }
}

#3

Thank you Brian, you saved my day.

With a minor change to your reply, I was able to delete by regex. I’m sharing my changes in case somebody needs them:

EDIT: I changed the regex part as manigandham suggested; thanks to him as well.

First, define the regex in the calling method.

var regex = new Regex(pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);

Then, instead of

client.ScanAll(policy, ns, set, ScanCallback);

in the ScanKeys method, I used a lambda to pass an extra argument to the callback, which I needed for the regex:

client.ScanAll(policy, ns, set, (key,record)=>DeleteCallback(key,record,regex));

Then, in the ScanCallback method (renamed DeleteCallback), I match the key against the regex:

    private void DeleteCallback(Key key, Record record, Regex regex)
    {
        if (key.userKey != null)
        {   
            string keyString = key.userKey.ToString();

            // Determine if key should be deleted here.
            if (regex.IsMatch(keyString))
            {
                ThreadPool.QueueUserWorkItem(Delete, key);
            }
        }
    }
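
One caveat on the pattern itself, in case it helps someone: “p_1516*” is a wildcard, not a regex. Read as a regex it means “p_151” followed by zero or more “6”, and since IsMatch is unanchored it would also match a key such as p_1517_price_usd. Below is a minimal sketch of one way to turn a literal prefix into an anchored regex. The class and method names (KeyPatternDemo, PrefixRegex, Matching) are made up for illustration and are not part of the Aerospike client; the sample keys come from the first post.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeyPatternDemo
{
    // Hypothetical helper: builds a regex that matches keys starting
    // with the given literal prefix. Regex.Escape neutralises any
    // characters in the prefix that have a special meaning in a regex.
    public static Regex PrefixRegex(string prefix)
    {
        return new Regex("^" + Regex.Escape(prefix), RegexOptions.CultureInvariant);
    }

    // Returns the keys the regex matches, using the same IsMatch test
    // as DeleteCallback does for each scanned key.
    public static List<string> Matching(IEnumerable<string> keys, Regex regex)
    {
        var result = new List<string>();
        foreach (string key in keys)
        {
            if (regex.IsMatch(key))
            {
                result.Add(key);
            }
        }
        return result;
    }
}
```

With the four p_1516 keys above plus an unrelated p_1517_price_usd, PrefixRegex("p_1516_") selects the four p_1516 keys and PrefixRegex("p_1516_price_") selects the two price keys, while new Regex("p_1516*") selects all five. Keeping the trailing underscore in the prefix also avoids matching a different product id such as p_15160.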

Thank you.


#4

You should just pass in the Regex object rather than a string pattern.

The DeleteCallback method is called for every key the scan returns, and you’re currently creating a new Regex object on each call. Construction is also far slower if you pass the RegexOptions.Compiled flag; that flag should only be used when you keep the object around for reuse, as recommended.
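
A rough sketch of what I mean, assuming you wrap the regex in a small helper that lives for the whole scan. KeyFilter and ShouldDelete are names I made up for illustration; nothing here comes from the Aerospike client, and the key is taken as a plain object so the snippet compiles on its own.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper: holds one Regex for the lifetime of a scan.
public sealed class KeyFilter
{
    private readonly Regex regex;

    public KeyFilter(string pattern)
    {
        // Constructed once. RegexOptions.Compiled costs more up front
        // and only pays off when the same object is reused for many
        // IsMatch calls, which is the case for a full scan.
        regex = new Regex(
            pattern,
            RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // Called once per scanned key. A null user key (record written
    // without sendKey) is never selected for deletion.
    public bool ShouldDelete(object userKey)
    {
        return userKey != null && regex.IsMatch(userKey.ToString());
    }
}
```

You would create the KeyFilter before calling ScanAll and call ShouldDelete(key.userKey) inside the callback. Regex instances are immutable and safe to share between threads, so the same filter can be used even when the callback runs concurrently.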