Read/write UDF

Hi folks,

I’m trying to write a UDF that both modifies data (e.g., deletes a record) and returns results (e.g., the sum of selected columns of that record). Since I need to do this over a range of rows and return the sum accumulated over all selected rows, I believe I need a stream UDF.

For example, say my set has two bins: metaSize and callSize. I want to run the UDF over a range of rows, where we sum metaSize and callSize of each row, delete the row, and eventually return the sum of metaSize and callSize from all rows.

Is this possible? The manual says “The stream-UDF is read-only”, so it seems I cannot issue deletes from one. Can I achieve my goal without stream UDFs?

Thanks.

You can use a stream UDF as you describe for the read-only part, with a query or scan (but not, I believe, with a batch), and then delete the rows in a separate operation. Alternatively, for a query, scan, or batch, you can use “operate” to read the two columns and delete the record, for each record in the specified set. Add the two returned columns for each record, then add those per-record sums together in the client (application) to get the aggregate sum. If you are using 5.6 or later, you can also use “operation expressions” to have the server return the sum of the two columns for each record, and then compute the aggregate sum in the client.
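To make the “operate” approach concrete, here is a rough sketch in Python. It assumes the Aerospike Python client, that the keys of the rows are already known to the application, and that the bins are called metaSize and callSize as in the question. The function names make_read_and_delete_ops and sum_and_delete are made up for illustration, and I have not run this against a live cluster, so treat it as a starting point rather than a reference implementation.

```python
from typing import Any, Iterable

# Bin names taken from the question.
BINS = ("metaSize", "callSize")


def make_read_and_delete_ops() -> list:
    """Build the operation list: read both bins, then delete the record.

    Assumption: the Aerospike Python client is installed. The import is
    done lazily so the aggregation logic below can be exercised without it.
    """
    from aerospike_helpers.operations import operations as op

    # The reads come before the delete so the bin values are returned.
    return [op.read(name) for name in BINS] + [op.delete()]


def sum_and_delete(client: Any, keys: Iterable[tuple], ops: list) -> int:
    """Apply `ops` to every key and sum the returned bins in the client.

    `client` is anything with an operate(key, ops) method that returns a
    (key, meta, bins) tuple, which is the shape the Python client returns.
    """
    total = 0
    for key in keys:
        _key, _meta, bins = client.operate(key, ops)
        # Treat a missing or null bin as 0 (an assumption of this sketch).
        total += sum((bins.get(name) or 0) for name in BINS)
    return total
```

With a real connection you would call something like sum_and_delete(client, keys, make_read_and_delete_ops()), where each key is a (namespace, set, user_key) tuple. This sketch issues one operate call per key and does not handle records that are missing, so you would want to catch the client's not-found error if the keys might be stale.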