Hi, I am going to use a UDF to perform some atomic read/write operations, because doing this in the data layer looks more robust than doing it in the application layer with optimistic locking and CAS. What should I know before accepting this design? Are there any limitations on using UDFs, in terms of memory used or anything else? I have seen a topic, “How my UDF crashed Aerospike”, in which a UDF crashed an entire cluster in Google Cloud. Could you please share any real-world experience or recommendations on using UDFs with respect to reliability and performance?
Did you consider using the operate command to execute multiple operations on the same record? From what you’re describing that’s what you’re trying to do. A record UDF gets a lock and all operations described in the function occur on that same record. Similarly, a ‘multi op’ gets a lock on the record, and executes multiple operations in sequence on it, with the entire set of operations rolling back on failure. It’s similar to a transaction in an RDBMS, just on a single record.
I’m using the Python client’s method as an example, but the same method exists in the other clients as well.
The operate command would actually be faster and more scalable than the equivalent record UDF.
Operate() lets you apply a list of operations (modifications) to a record and then read the record back in its final form, all in one round trip from the client to the server.
However, you cannot do “if then else” type of logic on the record operations based on the data in the record in Operate(). If you want to read, then based on what is read, apply some logic and take action 1 or action 2 or action 3 on the record, then you can either use CAS or UDF.
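To make the CAS option concrete, here is a minimal in-memory sketch of a read, decide, write loop guarded by a generation check. It is a toy model, not the Aerospike client API: `ToyStore`, `put_if_gen`, `GenerationError` and `reserve_stock` are hypothetical names invented for illustration, and the "stock" bin is an assumed example. With a real client, the generation check is requested through the write policy instead.

```python
import threading


class GenerationError(Exception):
    """Raised when the stored generation differs from the expected one."""


class ToyStore:
    """Hypothetical in-memory store that keeps a generation per record."""

    def __init__(self):
        self._records = {}              # key -> (bins dict, generation)
        self._lock = threading.Lock()   # protects the dict itself

    def get(self, key):
        with self._lock:
            bins, gen = self._records.get(key, ({}, 0))
            return dict(bins), gen

    def put_if_gen(self, key, bins, expected_gen):
        """Write only if nobody else has written since we read."""
        with self._lock:
            _, gen = self._records.get(key, ({}, 0))
            if gen != expected_gen:
                raise GenerationError(key)
            self._records[key] = (dict(bins), gen + 1)


def reserve_stock(store, key, qty, max_retries=10):
    """Read, apply if/else logic on the client, write back with a CAS check."""
    for _ in range(max_retries):
        bins, gen = store.get(key)
        stock = bins.get("stock", 0)
        if stock < qty:
            return False                # branch 1: not enough, write nothing
        bins["stock"] = stock - qty     # branch 2: reserve
        try:
            store.put_if_gen(key, bins, gen)
            return True
        except GenerationError:
            continue                    # someone else wrote first: re-read
    raise RuntimeError("too much contention on " + repr(key))
```

The cost of this approach is one extra round trip per attempt plus retries under contention, which is the trade-off against running the same logic in a record UDF.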
With Operate() the read happens under the same lock as the writes, because Operate() has a return value in which you can return the record.
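The multi-op behaviour described in this thread (one lock, operations applied in order, nothing kept on failure, reads returned to the caller) can be sketched with a toy in-memory model. This is not the Aerospike client or server code: `ToyRecordStore`, the tuple-based operation format and the op names "incr", "append" and "read" are all assumptions made up for the example.

```python
import copy
import threading


class ToyRecordStore:
    """Hypothetical model of a single-record multi-op."""

    def __init__(self):
        self._bins = {}                  # key -> dict of bins
        self._locks = {}                 # key -> per-record lock
        self._guard = threading.Lock()   # protects the lock table

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def operate(self, key, ops):
        """Apply ops in order under one record lock; commit all or nothing."""
        with self._lock_for(key):
            # Work on a copy so a failing op leaves the stored record untouched.
            working = copy.deepcopy(self._bins.get(key, {}))
            result = {}
            for op, bin_name, value in ops:
                if op == "incr":
                    working[bin_name] = working.get(bin_name, 0) + value
                elif op == "append":
                    working[bin_name] = working.get(bin_name, "") + value
                elif op == "read":
                    result[bin_name] = working.get(bin_name)
                else:
                    raise ValueError("unknown op: " + repr(op))
            self._bins[key] = working    # commit only after every op succeeded
            return result
```

Note that the list of operations is fixed before the call: no step can branch on a value read by an earlier step, which is the limitation mentioned above.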
Hi, thanks for the reply. I hadn’t considered Aerospike operations because they offer only primitive functions such as add, append, etc. I am going to use CDT and perform atomic read-modify-write operations. A UDF fits my requirements much better, though it is not the only option. That’s why I would like some feedback about its reliability.
Also, the crash link you mention is from July 2015, and it concerns a Stream UDF used for aggregation. What you are trying to use is a Record UDF.
Stream UDFs operate on a set of records in read-only mode and can be used to extract and aggregate data from the records (sum, average, etc.). In my recent testing of Stream UDFs, I have not seen a server crash or anything like that. It is hard to comment on what went wrong in that test.
Record UDFs operate on a single record and can modify, delete, update that record.
What if my UDF takes a long time to execute? Could it, in theory, cause problems such as tying up the thread pool? Roughly speaking, how far can a problem in a Record UDF propagate? Are UDFs isolated within a namespace? I am curious because the ability to execute custom code inside the cluster could be quite dangerous.
For a record UDF, the lock is taken on the record being operated on. Its scope is limited to the node that holds the record, the namespace the record is in, and the partition the record is in. There are 4K (4096) partitions.
Version 3.11 introduced a further refinement of the partition into sprigs (http://www.aerospike.com/docs/reference/configuration#partition-tree-sprigs), which should improve performance significantly.
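As a rough illustration of how a key lands in one of the 4K partitions, the sketch below hashes a set name and key and reduces the digest to a partition ID. It is only a model: SHA-1 and the byte selection are stand-ins chosen for the example, not the digest or bit layout the server actually uses, and `toy_partition_id` is a hypothetical helper.

```python
import hashlib

N_PARTITIONS = 4096  # the "4K partitions" mentioned above


def toy_partition_id(set_name, user_key):
    """Map (set, key) to a partition ID in [0, 4095].

    Assumption: SHA-1 over "set:key" is a stand-in hash for illustration;
    the real server computes its own record digest.
    """
    digest = hashlib.sha1((set_name + ":" + str(user_key)).encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


if __name__ == "__main__":
    for k in ("user1", "user2", "user3"):
        print(k, toy_partition_id("users", k))
```

The point is that each record maps deterministically to exactly one partition, so a lock held by a record UDF does not extend to records elsewhere.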
This page has some discussion on assessing UDF performance: http://www.aerospike.com/docs/operations/manage/udfs