Automatic failover to other nodes?

mlove-au · July 27, 2016, 12:35pm

Hi,

Does the C client support automatically switching to other nodes when one node goes offline? We are testing Aerospike with a 3-node Docker cluster (replication factor 2), and when we kill one node our client operations start to fail. For example, an aerospike_key_put fails with AEROSPIKE_ERR_CLIENT (func: as_socket_read_limit, file: "src/main/aerospike/as_socket.c", line: 442).

Brian · July 27, 2016, 5:29pm

The C client does support automatic node switching on node failures, but there is a lag (usually 1-2 seconds) between the node failure and the client dropping that node from its cluster map. During this lag, transactions continue to be sent to the downed node.

Each client instance periodically polls all nodes for cluster status, by default at 1-second intervals. When a node goes down, the next cluster status request should result in the node being dropped from the map. The client strictly follows this map when determining where to send each transaction.
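To make the lag concrete, here is a minimal sketch in plain C of a client that routes strictly by its own map and only refreshes that map when it polls. It is a toy model, not the Aerospike client's actual data structures: every name in it (sim_servers, sim_client, sim_tend, sim_pick_node, sim_put) is hypothetical, and the routing rule is a simple stand-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_COUNT 3

/* Hypothetical model only; these are not Aerospike client types. */
typedef struct {
    bool alive[NODE_COUNT];   /* ground truth: is the node actually up? */
} sim_servers;

typedef struct {
    bool in_map[NODE_COUNT];  /* client's view, refreshed only by polling */
} sim_client;

/* One polling pass: the client learns each node's status and
 * drops nodes that are down from its map. */
static void sim_tend(sim_client *c, const sim_servers *s) {
    assert(c != NULL && s != NULL);
    for (size_t i = 0; i < NODE_COUNT; i++) {
        c->in_map[i] = s->alive[i];
    }
}

/* Pick a destination strictly from the client's map: the first mapped
 * node starting at the key's home slot. Returns -1 if the map is empty. */
static int sim_pick_node(const sim_client *c, unsigned key) {
    for (size_t step = 0; step < NODE_COUNT; step++) {
        size_t n = (key + step) % NODE_COUNT;
        if (c->in_map[n]) {
            return (int)n;
        }
    }
    return -1;
}

/* A transaction succeeds only if the chosen node is really up. */
static bool sim_put(const sim_client *c, const sim_servers *s, unsigned key) {
    int n = sim_pick_node(c, key);
    return n >= 0 && s->alive[n];
}
```

In this model, if node 1 is marked down after a polling pass, sim_put for key 1 fails until sim_tend runs again, after which the same key is routed to node 2. Keys that map to the surviving nodes are unaffected throughout.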

Immediately switching nodes on a transaction timeout is a bad idea for a number of reasons:

- The client wouldn't know which node to send the transaction to, because the new node for that transaction hasn't been decided yet. This would result in lots of proxies in an already stressed system.
- Timeouts can be relatively frequent for applications that must respond by a fixed time.
- The client's view of the cluster map would behave very differently from the server's cluster map.
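Since the client does not switch nodes on its own during the lag, one option on the application side is to retry a failed operation after a short pause, so that a later attempt lands after the map has been refreshed. The following is a minimal sketch of that idea in plain C, not something prescribed in this thread. All names (txn_fn, sleep_fn, retry_with_pause, flaky_state, flaky_txn) are hypothetical, and flaky_txn is only a stand-in for a real call such as a put. Whether retrying is safe depends on the operation, since a write that timed out may still have been applied.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types: the operation returns true on success;
 * the pause callback sleeps for the given number of milliseconds. */
typedef bool (*txn_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned millis);

/* Try the operation up to max_attempts times, pausing between attempts.
 * pause may be NULL, in which case attempts run back to back. */
static bool retry_with_pause(txn_fn txn, void *ctx, sleep_fn pause,
                             unsigned max_attempts, unsigned pause_ms) {
    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        if (txn(ctx)) {
            return true;
        }
        if (attempt < max_attempts && pause != NULL) {
            pause(pause_ms);
        }
    }
    return false;
}

/* Stand-in for a client call: fails failures_left times, then succeeds. */
typedef struct {
    unsigned failures_left;
    unsigned calls;
} flaky_state;

static bool flaky_txn(void *ctx) {
    flaky_state *st = (flaky_state *)ctx;
    st->calls++;
    if (st->failures_left > 0) {
        st->failures_left--;
        return false;
    }
    return true;
}
```

With the polling interval described above, a pause on the order of the interval and a small attempt count would be the natural starting point. Both values are assumptions to tune against the application's own deadline.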
