How to upgrade to 3.15 or above for a cluster with LDT records?


How to clean up LDTs from an active cluster in order to upgrade to 3.15 or above

Context

As of server version 3.15.0.1, Aerospike removed the deprecated LDT (Large Data Type) feature. If a cluster and its applications still use LDT on a version prior to 3.15, follow the steps below to cleanly remove the LDT records before upgrading.

Method

3.14 is the last major Aerospike server release to include LDT code. To remove all LDT records from a cluster, first upgrade to 3.14.1.8. A fix released in that version is required to make sure LDT records are properly removed:

  • [AER-5804] - (LDT) With LDTs disabled, LDT records from other nodes, and during warm restart, are not handled properly.

Releases 3.15.0.1 and above do not understand the LDT data type. They interpret existing LDT records as regular records, which may cause unexpected behavior.
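The version rule above can be sketched as a small planning helper. This is only an illustration in Python: the function names, the has_ldt_records flag, and the idea of computing a path are hypothetical and not part of any Aerospike tool. The only facts taken from this article are the jump version 3.14.1.8 and the removal of LDT in 3.15.

```python
# Hypothetical helper; names are illustrative, not an Aerospike API.
JUMP_VERSION = (3, 14, 1, 8)   # 3.14 build that contains the AER-5804 fix
LDT_REMOVED_IN = (3, 15)       # first major.minor without LDT code


def parse_version(text):
    """Turn a dotted build string such as '3.14.1.8' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def upgrade_path(current, target, has_ldt_records):
    """Return the ordered list of versions to install to reach target."""
    cur, tgt = parse_version(current), parse_version(target)
    path = []
    # The jump through 3.14.1.8 matters only when LDT records may still exist,
    # the node is older than the jump version, and the target has no LDT code.
    if has_ldt_records and cur < JUMP_VERSION and tgt[:2] >= LDT_REMOVED_IN:
        path.append("3.14.1.8")
    path.append(target)
    return path
```

For example, upgrade_path("3.12.1", "3.15.0.1", True) yields the jump version followed by the target, while a cluster without LDT records gets the target alone. A node already on 3.14.1.8 still needs LDT disabled and a restart as described in the steps below, even though no extra version hop is listed.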

Steps to gracefully upgrade without downtime:

1. Ensure that client applications are no longer writing new LDT records to the Aerospike cluster.

In a rolling fashion, proceed with the following steps:

2. Disable LDT on the relevant namespaces in the Aerospike configuration file by adding ldt-enabled false as shown below:

namespace ldt_namespace_device {
  replication-factor 2
  memory-size 20G
  
  storage-engine device {
    device /dev/sdd
    write-block-size 1024K
    ldt-enabled false
  }
}

namespace ldt_namespace_memory {
  replication-factor 2
  memory-size 20G
  
  storage-engine memory {
    ldt-enabled false
  }
}
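Before restarting a node, it may help to confirm that every namespace block actually carries the setting. The following is a minimal sketch in Python, assuming the configuration file uses the brace-delimited layout shown above; the function name is hypothetical and the parser is deliberately simplistic (it only tracks braces and looks for an ldt-enabled false line anywhere inside each namespace block).

```python
def namespaces_missing_ldt_disabled(conf_text):
    """Return names of namespace blocks that lack an 'ldt-enabled false' line."""
    missing = []
    name = None        # namespace currently being scanned, if any
    depth = 0          # brace nesting depth inside that namespace
    disabled = False   # whether 'ldt-enabled false' was seen in this block
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        tokens = line.split()
        if name is None:
            # Look for the start of a block such as: namespace some_name {
            if tokens[0] == "namespace" and len(tokens) >= 2 and line.endswith("{"):
                name, depth, disabled = tokens[1], 1, False
            continue
        if tokens[:2] == ["ldt-enabled", "false"]:
            disabled = True
        depth += line.count("{") - line.count("}")
        if depth == 0:
            # Closing brace of the namespace block reached.
            if not disabled:
                missing.append(name)
            name = None
    return missing
```

Reading the file with open("/etc/aerospike/aerospike.conf").read() (the usual default path, adjust if yours differs) and passing the text to this function should return an empty list once every relevant namespace has been updated. Namespaces that never held LDT data will also be reported, so treat the result as a checklist and not as an error.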

3. Upgrade to version 3.14.1.8.

It is strongly recommended to fully erase the devices of any persisted namespace. This ensures that a cold restart on a subsequent version (3.15 or above) does not cause undesired behavior, since those versions do not know how to interpret LDT-related data.

When ldt-enabled is set to false, version 3.14.1.8 gracefully rejects any LDT record arriving through migrations; that is, the migrations still complete. Version 3.15 or above fails to receive such LDT records, which causes migrations to get stuck. It is therefore necessary to use version 3.14.1.8 as a jump version if LDT records are still in the system.

Restart the node:

sudo service aerospike restart

If devices were fully erased, wait for migrations to complete before proceeding to the next node.
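One way to check for completed migrations is to inspect the node statistics. The sketch below, in Python, parses a semicolon-separated name=value response such as the one returned by asinfo -v statistics. The stat name migrate_partitions_remaining is an assumption here; verify it against the metrics reference for your server version. The function names are hypothetical.

```python
def parse_info(response):
    """Parse a semicolon-separated 'name=value' info response into a dict."""
    stats = {}
    for pair in response.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            stats[key] = value
    return stats


def migrations_complete(response, stat="migrate_partitions_remaining"):
    """True only when the stat is present and reports zero partitions left.

    The default stat name is an assumption; pass a different one if your
    server version reports migration progress under another name.
    """
    value = parse_info(response).get(stat)
    return value is not None and int(value) == 0
```

A missing stat is treated as "not complete" so that a typo in the stat name or an unexpected response does not lead to restarting the next node too early. Run the check against every node in the cluster before moving on.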

4. Once all nodes in the cluster have been restarted, the cluster should no longer contain any LDT data.

To upgrade to 3.15 and beyond, remove the ldt-enabled false configuration from the aerospike.conf file and proceed with the upgrade.

Notes

Keywords

LDT upgrade

Timestamp

02/14/2018