Batch Operations


#1

Synopsys:

This article explains batch operations for reading multiple records from Aerospike database using a single API call from the client application, also provides details on fine tuning options, known limitations.

This article is for the legacy batch direct protocol. For the documentation, do refer to the batch guide. For the batch index protocol tuning, refer to the following batch index tuning article.

CRUD operations

Create, Read, Update, Delete operations are basic operations for any database and also for a Key Value store. Aerospike supports the following operations

1. Create: create a new record if it does not exist
2. Set: write record, create if it doesn't exist
3. Update: add/update existing records, specified bins
4. Replace: add/update existing records, replacing all bins 
5. Add: Add/Subtract integer value 
6. Append: Append value to string or byte array
7. Get: Read record
8. Delete: remove record

These CRUD operations could be performed through multiple utilities or API sets such as

  1. AQL:Aerospike Query Language, a command line utility providing simple SQL-like interface for database operations
  2. ASCLI: Aerospike command line interface providing basic database operations
  3. CLI: Aerospike command line interface providing basic database operations
  4. Client Libraries: provided in various popular programming languages such as Java, C#, Node.js, Python and many more

Possible ways of performing CRUD operations

  1. Single key: Allows performing CRUD operations on a single record at a time, using record key read, update, delete.
  2. Batch: Allows performing CRUD operations on multiple records with a single call from client application. Supports read, exists and read header operations based on record key based look up.
  3. Scan: Allows retrieving all records within a namespace, set. The operation doesn’t require client application to know the record key for the lookup.

Batch operations

Example

Lets assume the Aerospike database has a namespace user_records and has 10 million user records in this namespace. Each record has a key that maps to user_id. For an application specific use case the application would like to get all the user records who are friends of user A. It is assumed that the application knows the user ids for all the friends or they are stored in a different namespace/set that could be looked up.

The application could assemble all the keys (user_ids) 

    Key[] keys = new Key[size];
    for (int i = 0; i < size; i++) {
        keys[i] = new Key("test", "myset", (i + 1));
    }
The application could read tens of thousands of records in single application call, within one network round-trip
    Record[] records = client.get(policy, keys);

Note: all the records must be in same namespace

Assume the specific deployment of Aerospike is running a cluster of 8 nodes. Client could get all these records within a single call without worrying about data being distributed across multiple nodes of Aerospike. Cool, isn’t it?

How does this work

  1. Client application issues a batch get() call with mutliple keys to fetch these records.
  2. Client transaction thread creates multiple batches of this request based on where the data is available on the Aerospike cluster nodes.
  3. Client transaction thread push these multiple batch requests to the batch queue.
  4. Based on the batchPolicy configuration setting, the client could send these requests to Aerospike server serially or in parallel.
  5. On the server side, on each node, service-threads pick up the batch transactions, split them up into sub-transactions and send them down transaction-queues to be picked up by transaction threads (transaction-threads-per-queue.
  6. The batch-index-threads are responsible for gathering the sub-transactions results in 128KiB buffers (or 1MB depending on the record size).
  7. As Aerospike server returns the batch results, the client assembles these results and they are returned to the client application only when all the batch results have been returned by the Aerospike server (synchronous clients) or as buffers are received (asynchronous clients).

Note: For server versions 4.1.0.1 and above, a slow processing client or a long-running batch will not slow down the batch transactions that would be queued up on the batch-index thread that would be impacted. The statistic batch_index_delay would be incremented everytime such slow batch transaction is encountered, and warning message would be logged when the delay is above the allowed threshold (either twice the client total timeout or 30 seconds if the timeout is not set on the client). For older versions, the batch socket send timeout was hard-coded to 10 seconds, which meant that there could be a slow client or a huge batch bottle-necking an entire batch-index thread.

Types of batch operations

  1. Batch read: Read multiple records in a single call, for supplied multiple keys in batch read operation. The records returned could be complete records or only for the specified bins of the records. Batch read Java API, Batch read for specific bins Java API

  2. Batch exists: Returns array of boolean values showing existance or non-existence of the records. Batch exists Java API

  3. Batch read headers: Similar to batch read, but only returns the record meta data, such as TTL/expiration, generation. Does not return bins for the records. Batch read header Java API

Tuning batch operations

Batch operations could be fine tuned for the best possible throughput based on the application use case using the following parameters:

Server:

  • batch-threads: Number of batch worker threads. Default 4, Max 16.

  • batch-priority: Number of sequential commands executed on a node before yeilding. Higher the value would mean the batch job would finish sooner. Use this parameter carefully as the higher batchPriority could delay execution of other database operations. Default value is 200. The value could be set using runtime configuration parameter.

  • batch-max-requests: Max number of get/exist commands in a batch. Used to prevent unexpectedly large batch requests from causing server instability due to excessive memory consumption. Default value is 5000. The value could be set using runtime configuration parameter.

Client:

  • maxConcurrentThreads: Defines the number of threads used by client for executing the batch operation either serially or parallely. The parameter is set using BatchPolicy

    • maxConcurrentThreads = 1: Requests are executed sequentially from the client batch queue. The next batch request is submitted to server only after the earlier batch request has been completed.

    • maxConcurrentThreads = 0: Requests are executed parallely from the client batch queue, one request per Aerospike node in the cluster is submitted.

    • maxConcurrentThreads > 0: Client application defines how many parallel requests should be executed from the client batch queue. If the number of threads is lower than number of nodes in the cluster, the next set of requests from client batch queue will be processed once the first set has completed its execution. If the number of threads is more than number of nodes in the cluster, a certain number of threads might be unused during the batch operation.

Metrics:

Run the following command

asinfo -h <host ip> -v 'statistics' 
Default host 127.0.0.1 and port 3000 is assumed

Get batch related metrics
asinfo -v 'statistics' -l | grep batch
  • batch_initiate: Shows number of batch requests initiated.
  • batch_queue: Shows number of batch requests currently waiting in the batch queue, yet to be processed.
  • batch_timeout: Shows number of batch requests that have timed out, waiting for the response.
  • batch_errors: Shows number of errors resulted while processing batch requests.

Known limitations

  • Batch Operation during data migration: If batch query is running while the data migration is going on, the batch get, exists, getHeader may return less number of results.
  • Aerospike provides multiple client libraries such as C, Java, C#, Node.js and more. Batch operations in each client library may have slight different implementations. Refer to specific client API reference documentation for each client library.

Keywords

batch statistic

Timestamp

05/10/2018


Prole reads in case of batched requests