How To - Implement Rack Aware reads in practice

Context

In many cloud environments, it is best practice to build a cluster which spans multiple availability zones. Aerospike’s clustering model is a good fit when used in the rack-aware deployment mode as racks can be sited in different availability zones.

Aerospike provides a mechanism for database clients to read preferentially from servers in their own rack or zone. This means that a read may be served by a non-master replica of a partition, so different clients can read different values for the same record while a write transaction is in flight between the master partition and its replicas. For use cases that tolerate this, reading from the rack closest to a given client can significantly reduce traffic charges by limiting cross-AZ traffic, and can also lower latency and improve stability.

The feature is available for both AP and Strong Consistency modes in Java, C, C#, Go and Python.

Please see the Aerospike client matrix for full compatibility details.

Method

Setting up rack aware reads

Clusters need to be set up in logical racks. For AP namespaces, consult Rack Aware for AP. For SC namespaces, see Rack aware for SC.

In this example, Java is used to demonstrate how to enable rack awareness. Operations are similar in other clients.

Two settings, rackId and rackAware, must be set in the ClientPolicy object, which is used when initializing the Aerospike client.

ClientPolicy clientPolicy = new ClientPolicy();
clientPolicy.rackId = <<rack id>>;
clientPolicy.rackAware = true;

The rack id should match the rack id configured on the server nodes in the AZ where the application is running. To avoid hard-coding, it can be obtained dynamically via the cloud provider's API or set in a local property file. Setting rackAware to true tells the client to make use of rack awareness features.
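As an illustration of the property-file approach, the sketch below looks up a rack id by availability zone name using only the Java standard library. The class name RackIdResolver, the "rack.<zone>" key format and the zone names are all hypothetical and not part of the Aerospike client. How the zone name itself is discovered (cloud metadata service, environment variable) is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: none of these names come from the Aerospike client.
class RackIdResolver {

    // Parses properties text such as "rack.us-east-1a=1" (assumed key format).
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Looks up the rack id configured for the given availability zone.
    static int resolveRackId(Properties mapping, String zone) {
        String value = mapping.getProperty("rack." + zone);
        if (value == null) {
            throw new IllegalArgumentException("No rack id configured for zone " + zone);
        }
        return Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        // In a real deployment the text would be read from a local file.
        Properties mapping = parse("rack.us-east-1a=1\nrack.us-east-1b=2\n");
        int rackId = resolveRackId(mapping, "us-east-1b");
        System.out.println("rackId = " + rackId); // prints "rackId = 2"
        // The result would then be assigned to clientPolicy.rackId.
    }
}
```

Failing fast on an unknown zone is a deliberate choice in this sketch: a client that silently fell back to a default rack id would still work, but would read across zones without anyone noticing.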

Once the application has connected, reads that are to be rack-aware need two additional parameters set in the Policy associated with the read.

AP Mode

Policy policy = new Policy();
policy.readModeAP = ReadModeAP.ALL;
policy.replica = Replica.PREFER_RACK;

ReadModeAP.ALL indicates that all replicas can be consulted. policy.replica = Replica.PREFER_RACK indicates that the copy of the record in the client's own rack should be read if possible.

SC Mode

Policy policy = new Policy();
policy.readModeSC = ReadModeSC.ALLOW_REPLICA;
policy.replica = Replica.PREFER_RACK;

  • ReadModeSC.ALLOW_REPLICA indicates that all replicas can be consulted.
  • policy.replica = Replica.PREFER_RACK indicates that the copy of the record in the client's own rack should be read if possible.
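
To make the "same rack if possible" behaviour concrete, here is a small conceptual model written with the Java standard library only. It is not the Aerospike client's internal code: the Node class, the chooseReadNode method and the fallback to the master when no replica shares the client's rack are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Conceptual model only; this is not the Aerospike client's implementation.
class PreferRackSketch {

    // A node holding one copy of a partition.
    static final class Node {
        final String name;
        final int rackId;

        Node(String name, int rackId) {
            this.name = name;
            this.rackId = rackId;
        }
    }

    // Returns the first node in the client's rack. If no node matches,
    // falls back to index 0, which this sketch treats as the master.
    // Assumes the list is non-empty.
    static Node chooseReadNode(List<Node> replicas, int clientRackId) {
        for (Node node : replicas) {
            if (node.rackId == clientRackId) {
                return node;
            }
        }
        return replicas.get(0);
    }

    public static void main(String[] args) {
        // Master "A" is in rack 1, replica "B" is in rack 2.
        List<Node> replicas = Arrays.asList(new Node("A", 1), new Node("B", 2));
        System.out.println(chooseReadNode(replicas, 2).name); // prints "B"
        System.out.println(chooseReadNode(replicas, 3).name); // prints "A"
    }
}
```

A client in rack 2 reads from the replica on node B and never crosses zones, while a client whose rack holds no copy of the partition still gets an answer from the master.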

Notes

  • For policy.readModeSC other options are available; consult ReadModeSC for full details. Similarly, policy.replica accepts options other than PREFER_RACK, and the replica options should be consulted for full details.
  • These parameters are in addition to any other policy settings in use, such as timeouts and retry intervals. For reads, it is often a good idea to have each attempt time out after a reasonable duration (e.g. 50 ms) and have the Aerospike client automatically retry when the timeout occurs. The retry will typically go to a different node from the first attempt. This makes reads more resilient to node failure: if the node the client is reading from fails during the read, the retry goes to a different node (in a different AZ when rack awareness is set up), which will most likely still be available.
  • Whether clients are rack aware or not, writes will always go to the node holding the master partition (except for some short-lived transient situations). This should be taken into account when designing clusters where racks are separated to a significant degree.
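
The retry behaviour described in the notes above can be sketched as a loop that moves to the next candidate node when an attempt fails. This is a standard-library-only model, not the client's retry logic: RetrySketch, readWithRetry and the node names are hypothetical, and a thrown RuntimeException stands in for a timeout. In the real Java client, retries are handled internally and are configured through fields on the Policy object.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Conceptual model only; the real client performs retries internally.
class RetrySketch {

    // Tries candidate nodes in order, moving to the next one when a read fails.
    // Makes one initial attempt plus up to maxRetries retries.
    // Assumes the node list is non-empty.
    static String readWithRetry(List<String> nodes, int maxRetries,
                                Function<String, String> readFromNode) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String node = nodes.get(attempt % nodes.size());
            try {
                return readFromNode.apply(node);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next node
            }
        }
        throw new IllegalStateException("All attempts failed", last);
    }

    public static void main(String[] args) {
        // The rack-local node is listed first, the node in another rack second.
        List<String> nodes = Arrays.asList("local-rack-node", "remote-rack-node");
        String value = readWithRetry(nodes, 2, node -> {
            if (node.equals("local-rack-node")) {
                throw new RuntimeException("simulated timeout");
            }
            return "value from " + node;
        });
        System.out.println(value); // prints "value from remote-rack-node"
    }
}
```

Listing the rack-local node first keeps the common case inside one zone, and cross-zone traffic is only paid for when the local attempt fails.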

Keywords

RACK AWARE CLIENT READ CROSS AZ SC AP

Timestamp

April 2020
