How to delete a record based on secondary index?

record
java
deletion

#1

Hi,

I’m trying to delete a record based on a secondary index value. I’m using java API and the only thing it supports right now seemed to be based on PK. Here’s an example on what I’m trying to do :smile:

Following are sample bins in the dataset.

content, node_id .

I’ve a secondary index on node_id. All I want is to delete a record based on node_id. Do I need to first retrieve the record and then delete based on the PK ?

Any pointers will be appreciated.

Thanks, Shamik


#2

You’re right, since Aerospike is a key/value store the main record operations require the primary key of the record. To delete a bunch of keys, you’ll need to get the keys first in 2 main ways:

If you don’t have lots of records, don’t need super-high throughput or want to have this delete logic in your application:

You can run a query using the secondary index and just have it bring back the keys (without the bins/data), then immediately call a delete on each key. Using async methods you can probably get somewhere between 20-60k transactions per second per core with this, depending on your bandwidth, latency, namespace config, etc.

If you want max throughput and just 1 call:

Create a query UDF that filters on the bin value then calls the delete method on each key. This is basically doing the above but in the Lua UDF code and running it on the server via a single call from your app. It’ll run as fast as possible.


#3

Thanks for the pointers. Just a followup question on your first solution. You mentioned that we can bring back key based on secondary index. Is it possible to construct a single query with multiple secondary index ? For e.g. if I’ve a list of node_id, can I retrieve the list of keys with something like where node_id IN ( 123, 456, 789) ? That’ll allow me to get list of keys in a single call where I need to delete a bunch of records.


#4

Aerospike supports searching for exact value for string bins and exact values or range (min,max) for numeric bins. However it doesn’t support using multiple filters in a single query through the client drivers right now.

For that you’ll have to write your own Lua UDF that checks for multiple separate values, but then you can’t use a secondary index because the UDF will have to go through all the records in the set. Also if you’re using an UDF to filter then you can just use the UDF to do the deletion too (which is the second method described above).

I think the best option for you is to just combine the two methods a big and create a basic delete Lua UDF that can be called via the client in a query. This way it’s just 3 separate calls (for 123, 456, 789) and it’ll use the secondary index to only pass those records to the delete function.


Here’s a quick walkthrough

Save this as delete.lua file:

function deleteRecord(r)
    aerospike:remove(r)
end

Then register it in your cluster:

RegisterTask task = client.register(null, "C:/dev/udf/delete.lua", "delete.lua", Language.LUA);
task.waitTillComplete();

Then run the UDF through a query - this will use the index on the bin and send all matching records to that UDF which just immediately deletes the record:

Statement stmt = new Statement();
stmt.setNamespace("yourNamespace");
stmt.setSetName("yourSet");
stmt.setFilters(Filter.equal("node_id", 123));

ExecuteTask task = client.execute(defaultPolicy, stmt, "delete", "deleteRecord");