Understanding Timeout and Retry policies


#1

Summary

This article covers the timeout and retry APIs available in Aerospike clients.

It also covers the fields reported when the Java client sees a timeout on a transaction.

Configurations available from Client side

The client’s policy configuration options related to timeouts and retries were updated in version 4.0.0 (Java), with further iterations in subsequent versions. This document describes the state as of version 4.1.1 of the Java client, which should be consistent with the C (4.3.1) and C# (3.5.2) clients released in late 2017.

Note: For TLS-enabled clusters, Java sync client versions 4.1.7 and above honor the socket timeout. Older versions do not support specifying a socket timeout and will hang upon failing to establish a handshake with a cluster node.

For the latest Java client versions:

socketTimeout

  • This is the socket idle timeout. It controls how long a gap can occur between activity on the socket.
  • This allows a transaction to stay active on a socket for a very long time, as long as each gap stays just below this value.
  • This configuration applies to both the Sync and Async Java clients.
  • Retries will be applied as long as totalTimeout or maxRetries have not been reached.

totalTimeout

  • Absolute fixed upper limit time given for the transaction to complete before an exception is sent back.
  • If reached, there would be no retries.

timeoutDelay

  • In the Java sync client, this is available only as of version 4.2.2 (it was already available in the Java async client).
  • In situations where a transaction reaches totalTimeout, an error will be returned but the socket will not be closed until this delay is reached.
  • This makes it possible to reuse the socket (by putting the connection back in the pool) if a response comes back within this extra delay, even though a timeout was already returned to the client application.
  • Note that the response from the server will be thrown away anyway, since the client application has already been given a timeout.

maxRetries

  • In situations where retries are configured, this is the absolute max number of retries to attempt.
  • The initial attempt is not counted as a retry.
  • For write transactions, the default is 0. This is because we do not recommend writes to be applied twice.
  • For read transactions, the default is 2 (which means that 3 total attempts would be made at most).
  • For read transactions, this would also depend on the replica mode set. If set to the default (sequence), the first attempt would be against the master copy, and in case of a timeout or network error, the subsequent attempt will be against the next replica copy. Note that the two options available for the replica mode are ‘sequence’ or ‘master’.
  • For C, even if replication factor is 3, reads will stop at first replica and come back to master copy.
  • For Java / C#, reads will go to all the replica copies and then come back to the master copy.
  • For write, in case of a connection error, it will have the same behavior as read. But in case of a socket timeout, it will stay on master as the default is not to retry at all.
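The replica ordering described above can be sketched as a small simulation. This is only an illustration of the text, not client code: the class and method names are hypothetical, and index 0 stands for the master copy, 1 for the first replica, and so on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of which copy each attempt targets in sequence mode.
class ReplicaSequence {
    // Java / C# behavior as described above: walk every replica, then wrap to master.
    public static List<Integer> walkAllReplicas(int replicationFactor, int maxRetries) {
        List<Integer> targets = new ArrayList<>();
        // Attempt 0 is the initial attempt; it is not counted as a retry.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            targets.add(attempt % replicationFactor);
        }
        return targets;
    }

    // C behavior as described above: stop at the first replica, then return to master.
    public static List<Integer> masterAndFirstReplica(int replicationFactor, int maxRetries) {
        int span = Math.min(replicationFactor, 2);
        List<Integer> targets = new ArrayList<>();
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            targets.add(attempt % span);
        }
        return targets;
    }

    public static void main(String[] args) {
        // Replication factor 3, maxRetries 3 (4 attempts in total).
        System.out.println(walkAllReplicas(3, 3));       // prints [0, 1, 2, 0]
        System.out.println(masterAndFirstReplica(3, 3)); // prints [0, 1, 0, 1]
    }
}
```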

sleepBetweenRetries

  • This is available only for the Sync Java client. The Async Java client will never sleep between retries.
  • This configuration ensures that if a transaction is retried, there will be a sleep before it retries.
  • If configured to 0, there will not be any sleep. The default is 0.
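As a configuration sketch, the fields above map onto public fields of the Java client's Policy and WritePolicy classes. The fragment below needs the Aerospike Java client on the classpath, and the values are illustrative only (they are not recommendations); the class and method names are hypothetical.

```java
import com.aerospike.client.policy.Policy;
import com.aerospike.client.policy.Replica;
import com.aerospike.client.policy.WritePolicy;

// Hypothetical helper class; only the policy fields come from the client library.
class TimeoutPolicyExample {
    static Policy readPolicy() {
        Policy policy = new Policy();
        policy.socketTimeout = 50;        // ms of allowed socket idle time per attempt
        policy.totalTimeout = 1000;       // ms upper limit for the whole transaction
        policy.maxRetries = 2;            // the initial attempt is not counted
        policy.sleepBetweenRetries = 20;  // ms, sync client only
        policy.timeoutDelay = 100;        // ms, sync client 4.2.2+ per the text above
        policy.replica = Replica.SEQUENCE;
        return policy;
    }

    static WritePolicy writePolicy() {
        WritePolicy policy = new WritePolicy();
        policy.socketTimeout = 50;
        policy.totalTimeout = 1000;
        policy.maxRetries = 0;            // avoid applying a write twice
        return policy;
    }
}
```

A policy built this way would then be passed to the individual read or write call, or set as a default on the client policy.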

Examples

  1. Assuming socketTimeout = 50ms, totalTimeout = 1s, maxRetries = 3, sleepBetweenRetries = 20ms. Once the transaction is initiated, let’s assume that there is no activity on the socket for the next 50ms. The socketTimeout then triggers, but since the total time taken is still below the configured totalTimeout, the client waits 20ms (sleepBetweenRetries) and then initiates its first retry. At the start of the first retry, the timeline has progressed by 50ms + 20ms = 70ms since the start of the transaction. Similarly, if the socket timeout occurs again, the next retry occurs 50ms + 20ms = 70ms after the previous one, i.e. at 140ms. If the situation continues, the third and last retry is attempted at 210ms, since the total time taken is still below the configured totalTimeout.

  2. Assuming socketTimeout = 100ms, totalTimeout = 300ms, maxRetries = 3, sleepBetweenRetries = 100ms. In this case, once the transaction is initiated, if there is no activity on the socket for 100ms, the first retry is attempted at 200ms (100ms socket timeout + 100ms sleep). The second retry would happen at 400ms, but since totalTimeout is set to 300ms it is not attempted, and the transaction times out after 1 retry (2 total attempts).
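The arithmetic in both examples can be reproduced with a short simulation. It is a simplified model, not the client's implementation: it assumes every attempt ends in a socket timeout, that totalTimeout is greater than zero, and that a retry is only started if its start time is still below totalTimeout. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of when each retry starts, in ms since the transaction began.
class RetryTimeline {
    public static List<Integer> retryStartTimes(int socketTimeout, int totalTimeout,
                                                int maxRetries, int sleepBetweenRetries) {
        List<Integer> starts = new ArrayList<>();
        int clock = 0; // the initial attempt starts at 0 ms
        for (int retry = 1; retry <= maxRetries; retry++) {
            // Each failed attempt costs one socket timeout plus one sleep.
            clock += socketTimeout + sleepBetweenRetries;
            if (clock >= totalTimeout) {
                break; // totalTimeout reached: no further retries
            }
            starts.add(clock);
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(retryStartTimes(50, 1000, 3, 20));  // prints [70, 140, 210]
        System.out.println(retryStartTimes(100, 300, 3, 100)); // prints [200]
    }
}
```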

For Java client versions prior to 4.0.0

To keep old timeout behavior, set socketTimeout equal to totalTimeout.

Description for the timeout error

For client versions 4.0.0 and above

The timeout exception also indicates whether it was a client or a server timeout (the server timeout defaults to 1 second and is configurable with transaction-max-ms).

For Java client versions prior to 4.0.0

Exception in thread "main" com.aerospike.client.AerospikeException$Timeout: Client timeout: timeout=0 iterations=M failedNodes=N failedConns=X
    at com.aerospike.client.command.SyncCommand.execute(SyncCommand.java:131)

Timeouts can occur for the following reasons:

  1. Client can’t connect by specified timeout (timeout=). Timeout of zero means that there is no timeout set.

  2. Client does not receive response by specified timeout (timeout=).

  3. Server times out the transaction during its own processing (default of 1 second if the client doesn’t specify a timeout). To investigate this, confirm that the server transaction latencies are not the bottleneck.

  4. Client times out after M iterations of retries when there was no error due to a failed node or a failed connection.

  5. Client can’t obtain a valid node after N retries (where retries are set from your client).

  6. Client can’t obtain a valid connection after X retries. The retry count is usually the limiting factor, not the timeout value. The reasoning is that if you can’t get a connection after X retries, you never will, so the client times out early.
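When triaging these cases from logs, it can help to pull the counters out of the exception message. The sketch below parses the pre-4.0.0 message layout shown above using only the standard library; the class name and the sample values are hypothetical, and the layout is assumed to match the sample message.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper: extracts the numeric fields of a client timeout message.
class TimeoutMessageFields {
    private static final Pattern FIELD =
        Pattern.compile("\\b(timeout|iterations|failedNodes|failedConns)=(\\d+)");

    public static Map<String, Integer> parse(String message) {
        Map<String, Integer> fields = new LinkedHashMap<>();
        Matcher matcher = FIELD.matcher(message);
        while (matcher.find()) {
            fields.put(matcher.group(1), Integer.parseInt(matcher.group(2)));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        String message = "Client timeout: timeout=50 iterations=4 failedNodes=0 failedConns=1";
        System.out.println(parse(message));
        // prints {timeout=50, iterations=4, failedNodes=0, failedConns=1}
    }
}
```

A non-zero failedNodes or failedConns points at cases 5 and 6 above, while zeros for both with a high iterations count points at case 4.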

Examples

  1. In WritePolicy, if you have maxRetries(3), sleepBetweenRetries(500ms) and timeout(0), you will try the operation 4 times, with a 500ms wait between each try. If you have not been successful after 2 seconds you will get a timeout error back.

  2. Timeout trumps retries. If you set a timeout of 50ms (rather than zero) and the operation has not completed in that time, you will get an exception regardless of the number of retries, unless retryOnTimeout is set to true.

  3. Consider timeout = 1s, sleepBetweenRetries = 300ms, retries = 3, retryOnTimeout = true. In this case, if the first transaction attempt times out, then 3 more attempts are made with 300ms between each retry. If the transaction still fails after all the retries, a timeout is returned and it has to be handled at the application level.

Parameters that can be configured on Server side:

transaction-max-ms

Definition: How long to wait for success, in milliseconds, before timing out a transaction. This parameter takes effect when the client has not specified a transaction timeout or totalTimeout.

The transaction-max-ms (or, if specified, the client set timeout) gets checked in 3 different places:

  1. when a transaction is picked up from the transaction queue.
  2. every 130ms when waiting in the rw-hash (see rw_in_progress).
  3. every 75ms when waiting in the proxy-hash (see proxy_in_progress).

transaction-retry-ms

Definition: How long to wait for success, in milliseconds, before retrying a fabric transaction (typically a write prole or a duplicate resolution).

Example

If a client specifies a totalTimeout of 5 seconds, and network issues prevent a write from being processed on its prole side, the fabric transaction would be retried up to 2 times, with an interval starting at 1 second (the default transaction-retry-ms) and doubling for every subsequent retry, as long as totalTimeout is not reached. If totalTimeout is set to 0 by the client, then transaction-max-ms will be honored in place of totalTimeout in the above example.
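The "up to 2 times" figure follows from the doubling interval: retries land at 1s and 3s, and the next one would land at 7s, past the 5 second limit. The sketch below works through that arithmetic. It is a reading of the paragraph above rather than server code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of fabric retry times, in ms since the transaction started.
class FabricRetrySchedule {
    public static List<Long> retryTimesMs(long deadlineMs, long initialIntervalMs) {
        if (initialIntervalMs <= 0) {
            throw new IllegalArgumentException("initialIntervalMs must be positive");
        }
        List<Long> times = new ArrayList<>();
        long interval = initialIntervalMs;
        long clock = interval; // the first retry fires one interval after the start
        while (clock < deadlineMs) {
            times.add(clock);
            interval *= 2;     // the interval doubles for every subsequent retry
            clock += interval;
        }
        return times;
    }

    public static void main(String[] args) {
        // totalTimeout = 5s, transaction-retry-ms = 1s
        System.out.println(retryTimesMs(5000, 1000)); // prints [1000, 3000]
    }
}
```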

Errors for which the client retries (if maxRetries is configured) and errors for which it does not:

During the send command, the client will retry for any error it receives (if maxRetries is configured). Once it has sent the command to the server and received a response, it retries (if maxRetries is configured) for errors like:

  • socket_timeout
  • AEROSPIKE_ERR_CONNECTION
  • AEROSPIKE_ERR_TIMEOUT
  • AEROSPIKE_ERR_RECORD_BUSY
  • AEROSPIKE_ERR_FAIL_FORBIDDEN
  • etc.

The client will strictly not retry for the following errors:

  • AEROSPIKE_NOT_AUTHENTICATED
  • AEROSPIKE_ERR_TLS_ERROR
  • AEROSPIKE_ERR_QUERY_ABORTED
  • AEROSPIKE_ERR_SCAN_ABORTED
  • AEROSPIKE_ERR_CLIENT_ABORT
  • AEROSPIKE_ERR_CLIENT
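For application-side logging or alerting, the two lists above can be encoded as a lookup. This is a sketch built only from the names listed here, with a hypothetical class name; because the retried list ends in "etc.", any name not listed is reported as UNKNOWN rather than guessed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical lookup that mirrors the two lists above; not part of any client library.
class RetryClassifier {
    enum Decision { RETRY, NO_RETRY, UNKNOWN }

    private static final Set<String> RETRIED = new HashSet<>(Arrays.asList(
        "socket_timeout",
        "AEROSPIKE_ERR_CONNECTION",
        "AEROSPIKE_ERR_TIMEOUT",
        "AEROSPIKE_ERR_RECORD_BUSY",
        "AEROSPIKE_ERR_FAIL_FORBIDDEN"));

    private static final Set<String> NOT_RETRIED = new HashSet<>(Arrays.asList(
        "AEROSPIKE_NOT_AUTHENTICATED",
        "AEROSPIKE_ERR_TLS_ERROR",
        "AEROSPIKE_ERR_QUERY_ABORTED",
        "AEROSPIKE_ERR_SCAN_ABORTED",
        "AEROSPIKE_ERR_CLIENT_ABORT",
        "AEROSPIKE_ERR_CLIENT"));

    public static Decision classify(String errorName) {
        if (RETRIED.contains(errorName)) {
            return Decision.RETRY;
        }
        if (NOT_RETRIED.contains(errorName)) {
            return Decision.NO_RETRY;
        }
        return Decision.UNKNOWN; // the retried list is not exhaustive
    }

    public static void main(String[] args) {
        System.out.println(classify("AEROSPIKE_ERR_TIMEOUT"));   // prints RETRY
        System.out.println(classify("AEROSPIKE_ERR_TLS_ERROR")); // prints NO_RETRY
    }
}
```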

Client’s IN_DOUBT state (for writes):

If the client is in the in-doubt state, a flag is set to indicate that it is possible that the write transaction may have completed even though an exception was generated. This is specific to timeout errors (AEROSPIKE_ERR_TIMEOUT) and client-specific errors (and is not based on the server’s response).

Notes

Keywords

timeout retry socket retries

Timestamp

06/12/2018


#2

Hi,

I have one question.

Of all the timeout scenarios that you mentioned above, in which circumstances can I be absolutely certain that the final result of the transaction is FAILED?

In the worst case, if I couldn’t be certain about the final result, how would I be able to query the final state of the transaction?

I tried searching many places but haven’t found an appropriate answer.

Thanks in advance.


#3

Edit: We came up with a temporary solution:

Keep a map of [generation -> value read] for that record (maybe with a background thread constantly reading the record, etc.). On timeouts, we would periodically check the map (key = the expected generation) to see whether the value actually written is the one put in the map. If they are the same, it means the write succeeded; otherwise it means the write failed.

Do you guys think it’s necessary to do this? Or are there other ways?

Thanks.


#4

Posted on Stack Overflow as well. If you are an Enterprise Licensee, do not hesitate to reach out (or have someone reach out) through support so we can help you with the details for your specific use case.


#5

Thanks a lot. We are currently experimenting with the database and will consider switching to the Enterprise version once we’ve finished our experimentation.


#6

I previously described a solution here:

Note that this solution is specific to a counter. If you need read/modify/write, change step 2 (getHeader) to retrieve the full record.