Rack aware cluster does not form



Problem Description

In an Aerospike cluster using mesh networking, where interface-address and access-address are set correctly, the cluster does not form and reports integrity faults. There are no connectivity issues such as firewalls, and connectivity outside Aerospike is fine. Log messages such as the following are reported in aerospike.log.

Aug 03 2016 16:25:01 GMT: INFO (paxos): (paxos.c:2375) Corrective changes: 0. Integrity fault: true
Aug 03 2016 16:25:01 GMT: INFO (paxos): (paxos.c:2423) Marking node add for paxos recovery: bb980fe200204a0
Aug 03 2016 16:25:01 GMT: INFO (paxos): (paxos.c:2423) Marking node add for paxos recovery: bb980fe200204a0
Aug 03 2016 16:25:01 GMT: INFO (paxos): (paxos.c:2423) Marking node add for paxos recovery: bb980fe200204a0
Aug 03 2016 16:25:01 GMT: INFO (paxos): (paxos.c:2174) Got reset event. Forcing paxos spark.
Aug 03 2016 16:25:01 GMT: INFO (paxos): (paxos.c:1975) as_paxos_spark establishing transaction [1470241501]@bb980fe200204a0 ClSz = 1 ; # change = 3 : ADD bb980fe200204a0; ADD bb980fe200204a0; ADD bb980fe200204a0; 
Aug 03 2016 16:25:01 GMT: INFO (paxos): (paxos.c:2960) {1470241501} principal acking it's prepare bb980fe200204a0
Aug 03 2016 16:25:04 GMT: INFO (paxos): (paxos.c:2742) as_paxos_retransmit_check: principal bb980fe200204a0 retransmitting sync messages to nodes that have not responded yet

Restarting nodes or using dun/undun does not resolve the issue. All nodes show a cluster size of 1.

Explanation

This issue can occur when InfiniBand networking is used and rack aware is either not configured at all or configured without the mode being set. InfiniBand provides a longer MAC address for a given network interface than Ethernet does, and Aerospike parses only the first 48 bits of the MAC address. When the mode is left as none, rack aware is not active and the node ID is derived from that parsed MAC address. In this scenario all nodes end up with the same node ID, so it is not possible for them to join one another in a cluster.
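The collision can be illustrated with a minimal Python sketch. It assumes 20-byte InfiniBand hardware addresses whose leading bytes are identical on every node. The four addresses below are invented for illustration, and `first_48_bits` is a hypothetical helper that only mimics the truncation described above. It is not Aerospike's actual node ID code.

```python
# Hypothetical 20-byte InfiniBand hardware addresses (invented values).
# They differ only in the last byte, which a 48-bit parse never reads.
HW_ADDRESSES = [
    "a0:04:02:20:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:01",
    "a0:04:02:20:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:02",
    "a0:04:02:20:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:03",
    "a0:04:02:20:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:04",
]


def first_48_bits(hw_address: str) -> str:
    """Return the first 6 bytes (48 bits) of a colon-separated address as hex."""
    octets = hw_address.split(":")
    return "".join(octets[:6])


truncated = [first_48_bits(address) for address in HW_ADDRESSES]
unique_full = len(set(HW_ADDRESSES))       # distinct full-length addresses
unique_truncated = len(set(truncated))     # distinct values after truncation

print(f"{unique_full} distinct addresses, "
      f"{unique_truncated} distinct 48-bit prefix(es)")
```

Under these assumptions the four nodes have different hardware addresses, yet a parser that stops after 48 bits sees a single value for all of them.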

The network section of a collectinfo output would look as follows. Every node reports the same node ID and a cluster size of 1:

Wed Aug 3 14:53:22 EDT 2016
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Node                                  Node            Ip              Build  Cluster  Cluster   Cluster   Principal   Client  Uptime
.                                      Id             .                 .     Size      Key    Integrity     .        Conns   .
nd-kvs-1.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.19:3000  E-3.8.4 1 D4F7A44E811853A4 True BB980FE200204A0 14768 120:44:01
nd-kvs-2.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.126:3000 E-3.8.4 1 318F27E5202B1801 True BB980FE200204A0 51     01:41:48
nd-kvs-3.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.125:3000 E-3.8.4 1 2E56D0CD94EE8BC6 True BB980FE200204A0 13     01:42:29
nd-kvs-4.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.124:3000 E-3.8.4 1 E8554C67333E5616 True BB980FE200204A0 3      01:41:12
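A quick way to spot this condition is to look for node IDs that are reported by more than one node. The following Python sketch does that for the four rows above, pasted in as text. The whitespace-split column positions and the helper name `duplicate_node_ids` are assumptions made for this example and are not part of any Aerospike tool.

```python
from collections import defaultdict

# The four data rows from the Network Information table, as plain text.
ROWS = """\
nd-kvs-1.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.19:3000  E-3.8.4 1 D4F7A44E811853A4 True BB980FE200204A0 14768 120:44:01
nd-kvs-2.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.126:3000 E-3.8.4 1 318F27E5202B1801 True BB980FE200204A0 51     01:41:48
nd-kvs-3.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.125:3000 E-3.8.4 1 2E56D0CD94EE8BC6 True BB980FE200204A0 13     01:42:29
nd-kvs-4.ldn.customerclster.com:3000 *BB980FE200204A0 10.0.11.124:3000 E-3.8.4 1 E8554C67333E5616 True BB980FE200204A0 3      01:41:12
"""


def duplicate_node_ids(rows: str) -> dict:
    """Map each node ID seen on more than one row to the hosts reporting it."""
    hosts_by_id = defaultdict(list)
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host = fields[0]
        node_id = fields[1].lstrip("*")  # drop the leading marker
        hosts_by_id[node_id].append(host)
    return {nid: hosts for nid, hosts in hosts_by_id.items() if len(hosts) > 1}


duplicates = duplicate_node_ids(ROWS)
for node_id, hosts in duplicates.items():
    print(f"Node ID {node_id} is reported by {len(hosts)} nodes: {', '.join(hosts)}")
```

Any node ID that shows up here is shared between hosts, which matches the symptom described in this article.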

Solution

To resolve this, switch on rack aware mode in aerospike.conf. The static mode for rack aware can be used. In this mode node IDs are specified explicitly with the self-node-id parameter, using a distinct ID for each node, so parsing of the MAC address is not necessary. The cluster stanza might look as follows in aerospike.conf:

cluster {
    mode static
    self-node-id [32-bit unsigned integer node ID]
    self-group-id [16-bit unsigned integer group ID]
}

It is not necessary to set up multiple groups unless rack aware is actually desired. If rack aware is not needed in the particular use case, a single group can be used.
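One possible way to choose distinct self-node-id values, shown here as a sketch and not as an Aerospike recommendation, is to reuse each node's IPv4 address as a 32-bit unsigned integer. The IP addresses are taken from the collectinfo output above. The group ID of 1 and the helper name `cluster_stanza` are assumptions made for illustration.

```python
import ipaddress

# Node addresses from the collectinfo output in this article.
NODE_IPS = ["10.0.11.19", "10.0.11.126", "10.0.11.125", "10.0.11.124"]
GROUP_ID = 1  # assumed: one group for all nodes, since rack aware is not needed


def cluster_stanza(ip: str, group_id: int = GROUP_ID) -> str:
    """Render a static-mode cluster stanza, using the IPv4 address as node ID."""
    node_id = int(ipaddress.IPv4Address(ip))  # always fits in 32 bits
    if not 0 <= group_id < 2**16:
        raise ValueError("self-group-id must fit in 16 bits")
    return (
        "cluster {\n"
        "    mode static\n"
        f"    self-node-id {node_id}\n"
        f"    self-group-id {group_id}\n"
        "}\n"
    )


node_ids = [int(ipaddress.IPv4Address(ip)) for ip in NODE_IPS]
for ip in NODE_IPS:
    print(f"# stanza for {ip}")
    print(cluster_stanza(ip))
```

Because IPv4 addresses within one cluster are unique, the resulting node IDs are unique as well. Any other scheme that gives every node a different 32-bit value would serve equally well.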

Notes

  • The rack-aware feature is an Aerospike Enterprise Edition Server only feature as of version 4.0.

  • A JIRA, AER-5198, exists to address full parsing of InfiniBand MAC addresses.

  • The mode parameter within the cluster stanza, which is used to switch on rack aware, is static and unanimous, meaning that it must be set identically on every node and that a full cluster restart is required to change it.

http://www.aerospike.com/docs/reference/configuration#mode

  • Information on self-node-id

http://www.aerospike.com/docs/reference/configuration#self-node-id

  • General information on rack aware

http://www.aerospike.com/docs/architecture/rack-aware.html

  • How to configure rack aware

http://www.aerospike.com/docs/operations/configure/network/rack-aware

  • In addition to configuring the cluster stanza in aerospike.conf, the Paxos protocol must be set to V4 to use rack aware.

Keywords

RACK AWARE INFINIBAND NODE ID CLUSTER INTEGRITY FAULT TRUE DYNAMIC

Timestamp

8/4/16