FAQ What is the Quantum Interval and how does it affect cluster reformation time?

Detail

In Aerospike logs there are messages that reference the Quantum Interval such as the one shown below.

Oct 24 2018 17:15:10 GMT: INFO (fabric): (fabric.c:2516) fabric: node a1 departed
Oct 24 2018 17:15:10 GMT: INFO (fabric): (fabric.c:934) fabric_node_disconnect(a1)
Oct 24 2018 17:15:11 GMT: INFO (clustering): (clustering.c:6823) dead nodes at quantum start: a1

What is the Quantum Interval and how does it affect cluster reformation time?

Answer

The original Aerospike clustering algorithm reacted immediately to any change in cluster membership, prioritising fast cluster reformation. Cluster reformation implies that partitions may be rebalanced and that migrations between nodes will take place (see Notes). In larger clusters and in modern cloud environments it became clear that this was not optimal. The initial assumption, that severe network glitches would be isolated and short-lived, proved incorrect in the environments in which Aerospike was deployed. A new clustering protocol was developed to address the reality of the Aerospike operating environment.

A key part of the new protocol is the quantum interval. The Aerospike cluster protocol implements quantum-based event detection, which allows multiple complex network changes to be merged into a single event and so reduces the number of state transitions. Cluster state transitions are expensive in terms of time and resources, so they should be minimised where possible. The purpose of the quantum interval is to avoid reacting too quickly to node arrival and departure events and instead to process a batch of adjacent node events with a single cluster epoch change. This avoids the overhead caused by redundant data redistribution.

The system collects all clustering trigger events, such as node arrivals and departures, and acts on them only once every quantum interval. With the default server settings the quantum interval is approximately 1.8 seconds (1810ms), though it can be adjusted as described below. The quantum interval is computed from the configured expected network latency between cluster nodes, the heartbeat interval and the heartbeat timeout, so that all the effects of a single disruption (node arrival, node departure, network link faults or network partitions) are seen by all nodes in the cluster before a cluster epoch change is applied.
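The batching idea can be illustrated with a deliberately simplified sketch. It only groups event timestamps into fixed windows of one quantum interval; the real protocol is more involved, and the window alignment at time zero, the function name and the node names a2 and a3 are assumptions made for illustration.

```python
from collections import defaultdict

QUANTUM_MS = 1810  # default quantum interval quoted in the text


def batch_by_quantum(events, quantum_ms=QUANTUM_MS):
    """Group (timestamp_ms, description) events by the quantum window they fall in.

    Simplification: windows are assumed to start at t = 0 and to have a fixed length.
    """
    batches = defaultdict(list)
    for ts_ms, what in events:
        batches[ts_ms // quantum_ms].append(what)
    # One list per window, in time order; each list would drive one state transition.
    return [batches[k] for k in sorted(batches)]


# Hypothetical event stream: four membership changes within 2.5 seconds.
events = [
    (100, "a1 departed"),
    (400, "a2 departed"),
    (1500, "a1 arrived"),
    (2500, "a3 departed"),
]

for batch in batch_by_quantum(events):
    print(batch)
```

In this toy run the four events collapse into two batches, so the cluster would go through two state transitions rather than four.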

Important Note: The following examples are indicative only. They cover the theoretical scenario of a single node unexpectedly leaving a cluster with no other disruption across the rest of the cluster. This is not necessarily a common real-world scenario, as disruptions in distributed systems are seldom isolated.

Key Factors:

  • latency-max-ms: static parameter that defines the maximum expected network latency between nodes. Too low a value can cause excessive rebalancing when the cluster is unstable. Default is 5ms.
  • HBT = heartbeat.timeout * heartbeat.interval. With the defaults (timeout 10, interval 150ms) this is 1500ms, or 1.5s.
  • D = 2 * (heartbeat.interval + latency-max-ms). Defaults to 310ms.
  • Q = quantum interval = min(5000, HBT + D). Defaults to 1810ms and is capped at 5000ms.
  • RTT = 2 * latency-max-ms. Represents the maximum round trip latency. Defaults to 10ms.
  • C = 5 * RTT for clustering messages + 2 * RTT for exchange (time to reform the cluster). Defaults to 70ms.
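As a cross-check of the key factors, the following minimal sketch derives them from the three configuration values. The class and function names are hypothetical, and the default arguments (interval 150ms, timeout 10, latency-max-ms 5ms) are the values implied by the defaults quoted in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterTimings:
    hbt_ms: int      # HBT: heartbeat.timeout * heartbeat.interval
    d_ms: int        # D: 2 * (heartbeat.interval + latency-max-ms)
    quantum_ms: int  # Q: min(5000, HBT + D)
    rtt_ms: int      # RTT: 2 * latency-max-ms
    c_ms: int        # C: 5 * RTT + 2 * RTT


def cluster_timings(interval_ms=150, timeout=10, latency_max_ms=5):
    """Derive the key factors from the three configuration values."""
    hbt = timeout * interval_ms
    d = 2 * (interval_ms + latency_max_ms)
    q = min(5000, hbt + d)  # quantum interval is capped at 5000 ms
    rtt = 2 * latency_max_ms
    c = 5 * rtt + 2 * rtt   # clustering messages + exchange
    return ClusterTimings(hbt, d, q, rtt, c)


print(cluster_timings())
# ClusterTimings(hbt_ms=1500, d_ms=310, quantum_ms=1810, rtt_ms=10, c_ms=70)
```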

Cluster with default configuration

Best case clustering reformation after a node dies

HBT + D + (Q/4) + C := 1500 + 310 + (1810/4) + 70 = 2332.5ms

Worst case clustering reformation after a node dies

HBT + D + (Q/4) + C + 1 RTT + (Q/2) := 1500 + 310 + (1810/4) + 70 + 10 + (1810/2) = 3247.5ms
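The best case and worst case formulas above can be evaluated with a short sketch such as the one below. The function name is hypothetical; the formulas are taken directly from this article and the default arguments are the default configuration values.

```python
def reformation_bounds_ms(interval_ms=150, timeout=10, latency_max_ms=5):
    """Return (best, worst) cluster reformation time in ms after a single node dies."""
    hbt = timeout * interval_ms
    d = 2 * (interval_ms + latency_max_ms)
    q = min(5000, hbt + d)
    rtt = 2 * latency_max_ms
    c = 5 * rtt + 2 * rtt
    best = hbt + d + q / 4 + c    # HBT + D + (Q/4) + C
    worst = best + rtt + q / 2    # best case + 1 RTT + (Q/2)
    return best, worst


best, worst = reformation_bounds_ms()
print(best, worst)  # 2332.5 3247.5
```

With the default configuration this agrees with the figures above up to rounding. The gap between the two cases is always one RTT plus half a quantum interval.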

Cluster with typical multi-site configuration

In a cluster with higher inter-node network latency the values change significantly. The following calculations are for a cluster where latency-max-ms has been set to 70ms, which could represent a cluster with nodes on either side of the United States, for example. In that scenario, heartbeat.timeout could be set to 25 and heartbeat.interval to 100ms. Those values could differ based on the nature of the cluster (for example, cloud versus bare metal) and on whether one prefers more frequent heartbeats with a higher tolerance for consecutive missed heartbeats, or the reverse.

Best case clustering reformation after a node dies

HBT + D + (Q/4) + C := 2500 + 340 + (2840/4) + 980 = 4530ms

Worst case clustering reformation after a node dies

HBT + D + (Q/4) + C + 1 RTT + (Q/2) := 2500 + 340 + (2840/4) + 980 + 140 + (2840/2) = 6090ms
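To see where the extra time in the multi-site case comes from, the sketch below prints every intermediate term for both configurations side by side. The function name and the configuration labels are hypothetical; the input values are the ones used in this article.

```python
def breakdown(interval_ms, timeout, latency_max_ms):
    """Return each term of the reformation formulas, in ms, for one configuration."""
    hbt = timeout * interval_ms
    d = 2 * (interval_ms + latency_max_ms)
    q = min(5000, hbt + d)
    rtt = 2 * latency_max_ms
    c = 5 * rtt + 2 * rtt
    best = hbt + d + q / 4 + c
    worst = best + rtt + q / 2
    return {"HBT": hbt, "D": d, "Q": q, "RTT": rtt, "C": c,
            "best": best, "worst": worst}


# (heartbeat.interval in ms, heartbeat.timeout, latency-max-ms)
configs = {
    "default": (150, 10, 5),
    "multi-site": (100, 25, 70),
}

for name, args in configs.items():
    print(name, breakdown(*args))
```

In the multi-site configuration C grows from 70ms to 980ms, because every clustering and exchange message pays the larger round trip, and HBT grows from 1500ms to 2500ms.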

Notes

  • A rebalance always takes place following a cluster event; however, the migrate-fill-delay configuration parameter can be used to delay the subsequent fill migrations where this suits the use case.
  • The quantum interval is derived from latency-max-ms together with the heartbeat interval and heartbeat timeout.
  • latency-max-ms is a static parameter that requires a rolling restart to change.

Keywords

QUANTUM INTERVAL CLUSTER EVENT MIGRATE REBALANCE

Timestamp

June 2020

© 2015 Copyright Aerospike, Inc. | All rights reserved. Creators of the Aerospike Database.