FAQ - What does "node id changed" in the logs mean?

Detail

After the node-id of a node has been changed and Aerospike has been restarted, a message like the following may appear in the logs of one of the nodes:

May 20 2019 14:53:07 GMT: WARNING (hb): (hb.c:7389) node id changed from bb9abcd5f211b00 to 100 for node with endpoints {10.20.30.40:3002}

The node in question may also show a cluster size of 0.

Answer

This message may appear if, after changing the node-id, Aerospike is restarted as follows:

$ service aerospike restart

or

$ service aerospike stop && service aerospike start

In essence, if Aerospike is restarted within interval * timeout of being stopped (by default 150 ms * 10 = 1.5 seconds), the other endpoints (nodes) might not yet have terminated their connections to it. The cluster may then conclude that a node has changed its node-id while running.

When changing the node-id, it is best to stop the node and wait for these connections to time out before starting it again. With the default configuration this takes 1.5 seconds; if the heartbeat parameters have been changed, the waiting time changes accordingly.

If Aerospike has already been restarted and the message shown above is displayed, resolve it by shutting down the node and waiting for a period of at least interval * timeout before starting it again.
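The stop, wait, start sequence described above can be sketched as a small shell helper. This is a minimal sketch, not an official Aerospike script: the function names `hb_wait_seconds` and `safe_restart` are hypothetical, the two variables hold the default values quoted in this article, and the wait is rounded up to whole seconds so that it works with any `sleep` implementation.

```shell
#!/bin/sh
# Heartbeat settings. These are the defaults quoted in this article;
# change them if your configuration differs.
HB_INTERVAL_MS=150   # heartbeat interval, in milliseconds
HB_TIMEOUT=10        # heartbeat timeout, in missed intervals

# Print interval * timeout as whole seconds, rounded up.
# $1 = interval in ms, $2 = timeout (number of intervals)
hb_wait_seconds() {
    echo $(( ($1 * $2 + 999) / 1000 ))
}

# Hypothetical helper: stop the node, wait out the heartbeat
# window, then start the node again.
safe_restart() {
    service aerospike stop || return 1
    sleep "$(hb_wait_seconds "$HB_INTERVAL_MS" "$HB_TIMEOUT")"
    service aerospike start
}

# Usage (with sufficient privileges): safe_restart
```

With the default values the helper sleeps for 2 seconds, slightly longer than the 1.5 second window, which errs on the safe side.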

Note

  • This message is normally unlikely to be displayed. In most circumstances, stopping and starting the asd process, together with the time taken to initiate new connections to the cluster seed nodes, takes longer than the default 1.5 seconds.
  • The asadm tool can be used to check whether the interval and timeout parameters have been changed from their default values. The command to use is: show config like heartbeat.
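The check mentioned in the last note can also be scripted. The sketch below defines a hypothetical helper, `hb_window_ms`, that reads semicolon-separated `key=value` pairs on standard input and prints interval * timeout in milliseconds. It assumes the keys are named `heartbeat.interval` and `heartbeat.timeout`; confirm the key names and output format on your server version before relying on it.

```shell
#!/bin/sh
# Read "key=value;key=value" pairs on stdin and print
# interval * timeout in milliseconds.
# Assumption: the keys are named heartbeat.interval and heartbeat.timeout.
hb_window_ms() {
    tr ';' '\n' | awk -F= '
        $1 == "heartbeat.interval" { i = $2 }
        $1 == "heartbeat.timeout"  { t = $2 }
        END {
            # Fail if either value was not found in the input.
            if (i == "" || t == "") exit 1
            print i * t
        }'
}

# Interactive check, as described in the note above:
#   asadm -e "show config like heartbeat"
# Scripted check (the output format is an assumption):
#   asinfo -v "get-config:context=network" | hb_window_ms
```

The helper exits with a non-zero status when either value is missing, so a calling script can fall back to the documented defaults.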

Keywords

NODEID CHANGE COLDSTART TIMEOUT INTERVAL

Timestamp

24 June 2019
