Is the execute(policy, statement, ...) Java API call really async?


#1

Hi,

according to the Java API documentation, the following method is asynchronous:

public final ExecuteTask execute(WritePolicy policy, Statement statement, String packageName, String functionName, Value... functionArgs) throws AerospikeException

The documentation says: “Records are not returned to the client. This asynchronous server call…”

I created an application that uses it and added some simple method call timers, so I can now see how long each call takes. Judging from the timer logs and from the Java API source, this call is not async but a plain synchronous call. The Java API uses a ServerCommand class to execute the command, and this class’s second parent class is SyncCommand, not AsyncCommand.

Please let me know if my findings are correct and if an asynchronous version of this method is planned to be developed.

Thanks, Peter


#2

Hi Peter

What you are using is a User Defined Function (UDF) that is applied to all records that match the criteria specified by the Statement. Essentially this UDF will be invoked on each record matching the criteria. This is often used when data in a Set or Namespace needs to be transformed to another form.

You are correct: invoking execute() is a synchronous call that is sent to each node in the cluster in parallel. But the processing of records will continue until the criteria are exhausted, well after execute() has completed in your client.

Consider the case where you have 2,000,000,000 records that are evenly distributed across 4 nodes in an Aerospike cluster. The execute() method will send a message to each node with the criteria and the UDF module and name, and then return. Each node, however, will continue to traverse the 500,000,000 records (2,000,000,000 / 4) assigned to it, invoking the UDF on each one. This job is running asynchronously from the client’s perspective.
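To make the "dispatch and return" behaviour concrete, here is a toy model that uses only the JDK. It is not Aerospike client code: threads stand in for the nodes, a counter increment stands in for applying the UDF to a record, and `BackgroundJobSketch`, `JobHandle`, `startJob` and `waitUntilComplete` are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: plain JDK threads stand in for cluster nodes.
// None of these names belong to the Aerospike client.
public class BackgroundJobSketch {

    /** Hypothetical handle the caller keeps after the job has been dispatched. */
    public static final class JobHandle {
        private final CompletableFuture<Void> all;
        private final AtomicLong processed;
        private final ExecutorService nodes;

        JobHandle(CompletableFuture<Void> all, AtomicLong processed, ExecutorService nodes) {
            this.all = all;
            this.processed = processed;
            this.nodes = nodes;
        }

        /** True once every simulated node has finished its share. */
        public boolean isDone() {
            return all.isDone();
        }

        /** Blocks until all simulated nodes finish, then returns the record count. */
        public long waitUntilComplete() {
            all.join();
            nodes.shutdown();
            return processed.get();
        }
    }

    /** Hands each simulated node its share of the records and returns without waiting. */
    public static JobHandle startJob(long totalRecords, int nodeCount) {
        ExecutorService nodes = Executors.newFixedThreadPool(nodeCount);
        AtomicLong processed = new AtomicLong();
        long perNode = totalRecords / nodeCount; // assumes an even distribution
        List<CompletableFuture<Void>> parts = new ArrayList<>();
        for (int n = 0; n < nodeCount; n++) {
            parts.add(CompletableFuture.runAsync(() -> {
                for (long i = 0; i < perNode; i++) {
                    processed.incrementAndGet(); // stand-in for applying the UDF to one record
                }
            }, nodes));
        }
        CompletableFuture<Void> all =
                CompletableFuture.allOf(parts.toArray(new CompletableFuture[0]));
        return new JobHandle(all, processed, nodes);
    }

    public static void main(String[] args) {
        JobHandle job = startJob(2_000_000L, 4);
        // The caller gets control back here while the "nodes" may still be working.
        System.out.println("done right after dispatch? " + job.isDone());
        System.out.println("records processed: " + job.waitUntilComplete());
    }
}
```

The point of the model is only that the call which starts the job and the work done by the job are separate: the caller can poll the handle or wait on it, or ignore it entirely.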

I am interested to know what you are trying to achieve. Are you trying to find a way to invoke a Record UDF asynchronously and then return a result to your application? If so, we don’t have an execute() method in the AsyncClient today, but we would be interested to know your needs so we can add one in the future.

I hope this helps

Regards

Peter


#3

Our context is:

  • we have an application that uses the put() method to write into the database, and we use a UDF to delete some of the data (it needs some transactional behaviour, which is why we use a UDF)
  • this application is based on actors, so it is essential that the actors finish their tasks as soon as possible
  • in the first version of the app we used the synchronous put and execute (the other one, which takes a key) methods
  • we observed generally good response times (mean around 20-50ms), but there were also slow calls: the maximum response time measured over the last 4-5 minutes was always about 200-300ms, and it sometimes peaked at about 1400ms
  • because we used the synchronous calls, the application was blocked for too long, which we want to prevent

These are the reasons why I tried to use the async versions of these calls.

The async put is very easy to use, and after reading the javadoc I thought that this execute method would be similarly asynchronous. But after implementing these two, I measure 0ms method call time for the async put, while the execute still produces the 200ms maximums (the largest of these time-framed maximums was 900ms so far), so it did not solve my problem.

If I understand correctly, the execute(policy, statement, ...) method is asynchronous in a different way than the put method. I suppose the put() method returns before communicating with the server, and the execute() method returns somewhat later, maybe after the filtering based on the statement is done, or something similar. Anyway, my goal is to fire a method that returns as soon as possible, and it is fine for me if the Java client completes the command on another thread, as long as it does not block my application.
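One client-independent way to get that behaviour would be to run the blocking call on a separate thread pool and hand the caller a future. This is only a minimal sketch using the JDK: `OffloadSketch`, `offload` and `slowBlockingCall` are hypothetical names, the pool size of 4 is arbitrary, and `slowBlockingCall` merely sleeps to stand in for a synchronous database call. It does not make the call itself any faster; it only moves the waiting off the caller's thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Sketch only: moves a blocking call off the caller's thread.
// Nothing here is part of the Aerospike client.
public class OffloadSketch {

    // Dedicated pool so blocking calls never run on an actor's thread.
    // Daemon threads so the pool does not keep the JVM alive on exit.
    private static final ExecutorService DB_POOL = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r, "db-offload");
        t.setDaemon(true);
        return t;
    });

    /** Submits the blocking call to the pool and returns immediately with a future. */
    public static <T> CompletableFuture<T> offload(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, DB_POOL);
    }

    /** Hypothetical stand-in for a synchronous database call that takes about 200 ms. */
    public static String slowBlockingCall() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "ok";
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        CompletableFuture<String> result = offload(OffloadSketch::slowBlockingCall);
        long dispatchMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("dispatch took ms: " + dispatchMs);

        // The callback runs when the call finishes; join() is here only so the
        // demo does not exit before the result arrives.
        result.thenAccept(v -> System.out.println("result: " + v)).join();
    }
}
```

In an actor setting the callback would typically send a message back to the actor instead of printing, so the actor never waits on the future itself.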

Regards, Peter


#4

Hi Peter

Your response times are about 100 times worse than most primary key operations. Are your operations mostly primary key operations or mostly secondary index queries?


#5

Hi,

we have two operations:

  1. Write one bin in a record
  2. Run a simple UDF on a record to delete one of its bins or delete the complete record (based only on the record’s data)

The first is a put operation and the second is an execute that uses one key to select the record the UDF should work on. (Previously I wrote that we used the execute method with the statement to select the records. That is true; we tried both, without much difference in the results.)

Both the synchronous put and the execute methods show about the same behaviour as described above.

About half of the database’s workload is the put method and the other half is the UDF. So if I use only the execute with the key parameter, we do not even use secondary indexes (although one may still be defined in the system).

Regards, Peter


#6

Hi Peter,

I think you have it correct. 1) should be a standard put() operation and 2) can be handled via a UDF using execute().

Regards Peter