Aerospike upgrade using a parallel upgraded cluster setup

We have an 8-node cluster set up in AWS. Currently our application servers and our Aerospike cluster are in the same VPC but in separate subnets. Our application servers connect to the Aerospike instances using their private IP addresses.

We are planning to upgrade our Aerospike cluster from 3.12.1.1 to 4.5.3.2. Instead of an in-place upgrade, we plan to spin up a parallel cluster on the new version and eventually move traffic to it. Our question is about shifting traffic seamlessly from the old cluster to the new one. With the current setup and upgrade procedure, shifting traffic would, as you might have guessed, involve changing the private IPs in the application server configurations and doing a rolling update/deployment.

For us, a seamless upgrade is one in which no application configuration changes are required. That means the application servers keep pointing to the original endpoint, and we are able to change the servers behind that endpoint. As far as best practices are concerned, what is the ideal way to design the infrastructure with Aerospike for seamless upgrades (major version upgrades as well as patch version upgrades)?

If you can set up a DNS endpoint with a health probe (port 3000 up), you can add your servers to that DNS pool, and clients can query it for their initial connection. You may even be able to get away without a health probe, but I'm not sure whether the client would use only the first record the DNS query returns or try all of them when connecting, so that is worth testing. This would solve the problem of having to update the seed nodes on the client.
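To test the "first record or all records" question, a small script can show what the resolver actually returns for the seed name and which of those addresses accept connections on port 3000. This is only a rough sketch using the Python standard library; it does not reproduce how any particular Aerospike client handles a multi-record DNS name, and the hostname in the usage comment is a made-up placeholder.

```python
import socket


def resolve_all(hostname, port=3000):
    """Return every distinct address the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        address = info[4][0]  # sockaddr tuple starts with the IP string
        if address not in addresses:
            addresses.append(address)
    return addresses


def port_is_up(address, port=3000, timeout=2.0):
    """Crude health probe: True if a TCP connection to the port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_seeds(hostname, port=3000):
    """Resolve the name and keep only addresses that pass the probe."""
    return [(ip, port) for ip in resolve_all(hostname, port)
            if port_is_up(ip, port)]


# Usage (hostname is a hypothetical placeholder, not a real endpoint):
#   print(resolve_all("aerospike-seeds.internal.example"))
#   print(healthy_seeds("aerospike-seeds.internal.example"))
```

If `resolve_all` returns several addresses but the application only ever connects to the first, a dead first record would block startup, which is the case the health probe is meant to cover.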

As for spinning up a new cluster and moving away from the old one, I've done this many times, and the quiesce feature is a lifesaver. We stood up a CloudFormation stack with the new nodes, joined them to the cluster, and then quiesced the old nodes. You could also leave one of the old nodes online in a quiesced state while you work on updating the seed nodes. Another workaround in your situation: you should be able to decommission one of the old nodes and then assign its IP address to one of the new nodes (assuming the same subnet).
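The quiesce step could be scripted roughly as follows. This is a sketch under assumptions, not the exact procedure used above: it assumes the `asinfo` tool is installed, that every node runs a server version that supports the `quiesce:` and `recluster:` info commands, and that the IP addresses in the usage comment are placeholders. It defaults to a dry run that only prints the commands.

```python
import subprocess


def quiesce_plan(old_nodes, all_nodes, port=3000):
    """Build asinfo command lines: quiesce each old node, then recluster.

    The recluster command is sent to every node on the assumption that
    only the principal node acts on it and the others ignore it.
    """
    commands = []
    for node in old_nodes:
        commands.append(
            ["asinfo", "-h", node, "-p", str(port), "-v", "quiesce:"])
    for node in all_nodes:
        commands.append(
            ["asinfo", "-h", node, "-p", str(port), "-v", "recluster:"])
    return commands


def run_plan(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


# Usage (addresses are hypothetical placeholders):
#   old = ["10.0.1.11", "10.0.1.12"]
#   new = ["10.0.2.21", "10.0.2.22"]
#   run_plan(quiesce_plan(old, old + new))  # dry run, prints only
```

Leaving one old address out of `old_nodes` until the last step matches the suggestion of keeping a single old node reachable while the seed list is being updated.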
