Solution - Client exceptions when removing an Aerospike node from a cluster

Problem Description

When an Aerospike node is taken down for routine maintenance, the following exception appears in the Aerospike client logs (the Java client is used as an example here):

com.aerospike.client.AerospikeException: Error Code -1: java.net.ConnectException: Connection refused

What causes this error and how can it be prevented?

Explanation

So that clients can send each request to the right node for a given partition, every client maintains a partition map that records which node owns which partitions. Every second, the clients tend each node and fetch the up-to-date list of the partitions that node owns.

When a node is taken out of the cluster (i.e. the asd process is stopped), the clients will still try to connect to that node until their partition map is refreshed, and those connection attempts are refused.
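The effect of a stale partition map can be illustrated with a toy model. This is only a sketch: the ToyCluster and ToyClient classes are hypothetical, partitions are assigned round-robin rather than by the real succession list, and replicas are ignored. It assumes nothing about the actual client internals beyond what is described above.

```python
N_PARTITIONS = 4096  # Aerospike uses 4096 partitions per namespace


class ToyCluster:
    """Hypothetical stand-in for the server side: which nodes are up, who owns what."""

    def __init__(self, nodes):
        self.live = set(nodes)
        self.rebalance()

    def rebalance(self):
        # Toy assignment: spread partitions round-robin over the live nodes.
        ordered = sorted(self.live)
        self.owners = {p: ordered[p % len(ordered)] for p in range(N_PARTITIONS)}

    def request(self, node):
        # A stopped node refuses connections.
        if node not in self.live:
            raise ConnectionRefusedError(f"{node}: connection refused")
        return "ok"


class ToyClient:
    """Hypothetical client that routes by its own copy of the partition map."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.tend()

    def tend(self):
        # Copy the cluster's current ownership view into the local map.
        self.partition_map = dict(self.cluster.owners)

    def get(self, partition_id):
        # Route using the local (possibly stale) map.
        return self.cluster.request(self.partition_map[partition_id])


if __name__ == "__main__":
    cluster = ToyCluster(["A", "B", "C"])
    client = ToyClient(cluster)
    cluster.live.discard("B")      # asd stopped on node B
    try:
        client.get(1)              # partition 1 still maps to B in the stale map
    except ConnectionRefusedError as exc:
        print("stale map:", exc)
    cluster.rebalance()            # cluster re-forms without B
    client.tend()                  # client picks up the new map
    print("after tend:", client.get(1))
```

In this model the error window is exactly the gap between the node stopping and the next successful tend after the cluster has re-formed.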

The cluster will re-form once it realizes a node has departed. The cluster determines that a node has left by counting missed heartbeats. Each node sends a heartbeat to the other nodes in the cluster at the interval defined in the Aerospike configuration file; this is measured in milliseconds and defaults to 150.

The cluster tolerates a number of missed heartbeats, defined by the timeout configuration parameter (default 10), before declaring the node dead.

Finally, there is a 20% overhead, called the quantum interval, to optimize for potential multiple rapid cluster changes.

Therefore, in the default configuration, a cluster re-forms about 1.8 seconds after one or more nodes stop being heard from: 150 ms * 10 + (150 ms * 10) * 20% = 1800 ms. At that point, a new partition distribution is set, which the clients gradually discover as they tend each node every second (the default tend interval). With all settings at their defaults, clients typically get their partition map updates within a couple of seconds (at most 2.8 seconds: 1.8 seconds for the cluster to re-form plus up to 1 second until the next tend).
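The arithmetic above can be written out as a small calculation, which is handy when the heartbeat settings differ from the defaults. This is a minimal sketch; the function and parameter names are chosen here for illustration and are not Aerospike identifiers.

```python
def reform_delay_ms(interval_ms=150, timeout=10, quantum_percent=20):
    """Time until the cluster re-forms after a node stops sending heartbeats."""
    missed_heartbeat_window = interval_ms * timeout            # 150 * 10 = 1500 ms
    quantum_overhead = missed_heartbeat_window * quantum_percent // 100
    return missed_heartbeat_window + quantum_overhead


def worst_case_client_delay_ms(tend_interval_ms=1000, **heartbeat_settings):
    """Upper bound until a client sees the new partition map (one full tend later)."""
    return reform_delay_ms(**heartbeat_settings) + tend_interval_ms


if __name__ == "__main__":
    print(reform_delay_ms())             # 1800
    print(worst_case_client_delay_ms())  # 2800
```

Integer arithmetic is used so the defaults reproduce exactly the 1.8 and 2.8 second figures quoted above.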

Solution

The quiesce feature was introduced in Aerospike 4.3.1. When a node is marked for quiescence, nothing happens immediately. However, once a recluster command is issued, the node gives up ownership of its partitions. These are taken over by another node, typically the next node in the succession list (i.e. the node that owned the replica partition prior to the quiesced node leaving). If rack aware or uniform balance is configured, a different node could take over instead.

Until clients tend and build their new partition maps, the quiesced node will proxy any transactions that it receives to the new master node. When the clients have built their new partition map and are no longer directing transactions towards the quiesced node, it can be shut down safely without any disruption to client traffic.

However, a quiesced node remains in the cluster until it is shut down. Clients will still tend the node, and when it is shut down, their tend calls will fail until the cluster re-forms without it. Because the node was already quiesced, partition ownership does not change at that time.
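The quiesce-then-recluster sequence described above can be sketched as a small dry-run helper. The quiesce: and recluster: info commands are the ones named in this article; the host addresses, the function names, and the exact asinfo and asadm invocations are assumptions made for illustration and may need adjusting for the installed tools version (for example, newer asadm releases may require privileged mode).

```python
import shlex
import subprocess


def quiesce_plan(node_host, seed_host):
    """Return (description, argv) steps for quiescing one node before shutdown.

    node_host: the node going down for maintenance (hypothetical address).
    seed_host: any node asadm can use to reach the whole cluster (hypothetical).
    """
    return [
        ("mark the node as quiesced",
         ["asinfo", "-h", node_host, "-v", "quiesce:"]),
        ("recluster so the node gives up partition ownership",
         ["asadm", "-h", seed_host, "-e", "asinfo -v 'recluster:'"]),
    ]


def execute(plan, dry_run=True):
    """Print each step; run the commands only when dry_run is False."""
    for description, argv in plan:
        print(f"# {description}\n{shlex.join(argv)}")
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    execute(quiesce_plan("10.0.0.2", "10.0.0.1"))  # dry run: prints the commands only
```

After these steps, the asd process on the quiesced node should only be stopped once clients are no longer directing transactions to it.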

Keywords

QUIESCE CLIENT EXCEPTION TIMEOUT

Timestamp

25 June 2019

© 2015 Copyright Aerospike, Inc. | All rights reserved. Creators of the Aerospike Database.