You can use a record UDF and move your logic to the server node. In the UDF, call aerospike:exists() to check whether the record exists; if it doesn't, call aerospike:create(), otherwise call aerospike:update(). (Pass k as an argument to the record UDF.) This lets you create or update in a single round trip from the client to the server.
function create_or_update_record(rec, bin1, bin2)
  if not aerospike:exists(rec) then
    rec['bin1'] = bin1
    rec['bin2'] = bin2
    return aerospike:create(rec)
  else
    rec['bin1'] = bin1
    rec['bin2'] = bin2
    record.set_ttl(rec, -2)
    return aerospike:update(rec)
  end
end
@Albot In your function, you could move the bin assignments out of the if/else and place them before it. But for this topic, we can do:
function create_or_update_record(rec, k)
  if not aerospike:exists(rec) then
    rec['bin1'] = 1
    return aerospike:create(rec)
  else
    rec['bin1'] = k
    record.set_ttl(rec, -2)
    return aerospike:update(rec)
  end
end
Note: If I invoke this UDF as a background scan on all the records in a namespace and set, I don't think create() will ever be called. There is no key to call it on in that case, so it will only do the update() where applicable.
aql> EXECUTE c_or_u.create_or_update_record(2,4) ON test.demo
Thanks a lot, team, for the support. A UDF will serve this purpose. In my code, you can see that I am doing an increment and a get with the operate function. Is this still possible with a UDF? If I read and return the value immediately after the update, will it work, assuming the UDF is atomic for that record?
You cannot return a "record" type from a UDF. However, you can return a bin value, a list of values from multiple bins, or a map of key-value pairs.
All operations inside the UDF run under the same record lock, so they are atomic. This is the benefit of using UDFs. However, if you need high throughput, you must test and make sure you are getting the desired performance.
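As a rough illustration of the increment-and-get case, here is a minimal sketch of a record UDF that bumps a counter bin and returns the new value. The function name increment_and_get and the parameters bin_name and delta are made up for this example; it assumes the bin holds an integer and reuses the same exists/create/update and set_ttl(-2) pattern as the UDF earlier in this thread.

```lua
-- Hypothetical record UDF: add `delta` to an integer bin and return the new value.
-- Function and parameter names are illustrative, not taken from the client code in this thread.
function increment_and_get(rec, bin_name, delta)
  local rc
  if not aerospike:exists(rec) then
    -- First write: create the record with the counter set to delta.
    rec[bin_name] = delta
    rc = aerospike:create(rec)
  else
    -- Later writes: bump the counter and leave the existing TTL untouched.
    rec[bin_name] = (rec[bin_name] or 0) + delta
    record.set_ttl(rec, -2)
    rc = aerospike:update(rec)
  end
  -- create() and update() return 0 on success.
  if rc ~= nil and rc ~= 0 then
    error("write failed with status " .. tostring(rc))
  end
  -- Return a plain bin value; a UDF cannot return the record itself.
  return rec[bin_name]
end
```

If this function were added to the c_or_u module, it could be tried from aql on a single key with something like: EXECUTE c_or_u.increment_and_get('count', 1) ON test.demo WHERE PK = 'user1' (the bin name and key here are placeholders). Since the read and the write happen under one record lock, the returned value is the one this call wrote.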
Your implementation is unusual: even though it takes multiple I/O operations, I could not find a scenario where it becomes inconsistent. Even if the record expired between the CREATE_ONLY and UPDATE_ONLY calls and the UPDATE_ONLY failed as well, it would not matter, because the record would have been close to expiration anyway had the UPDATE_ONLY succeeded. In other cases, you can use a generation-equal policy to do a read-modify-write (check-and-set) with optimistic locking.
So, in your use case, both your code and the UDF method are viable. Test and see which gives you better performance.
BTW, care to share your use case, i.e. what are you trying to achieve with this kind of code?
Hi Gupta,
Thanks for your detailed explanation. I have not benchmarked the UDF implementation yet. I will do it shortly.
My primary question about using a UDF is: is it safe to use in production with 20 million hits an hour?
Apart from this, below is the use case for my code snippet.
I will be using this for frequency capping. Within a given time frame, events can be pushed only up to a predefined number of times. Events sent more than once will be less than 10% of the total hits, hence the preference for insert over update.
Whenever it goes down the update path, I see the latency double, because it tries the put first and only does the operate on the exception. It would be good to have a policy that leaves the TTL untouched if the record already exists, which would save one round trip.
If you are using this for frequency capping, don't need the data to persist, and the data type is integer, use a single-bin namespace with data-in-memory and data-in-index. You can store just these frequency-capping records in that namespace and put the rest of your persistent data in a second namespace.
Your write rate of 20M per hour is about 5.5K per second (20,000,000 / 3,600), which should be very doable, assuming you have adequate end-to-end hardware and network.
This will set the default TTL on the record during the first write and will not disturb the TTL on later updates.
Aerospike Version: 4.2.0.5
Java Client Version: 4.1.1