Upgrade from 4.3.0.6 to 4.4.0.6, data rollback

arsis · December 14, 2018, 7:13am

I am trying to upgrade from 4.3.x.x to 4.4.x.x, and I tested the upgrade on a test server (CentOS 7).

But some data changed to old data within a short time.

I upgraded using the process below.

1. Stop the Aerospike server: sudo systemctl stop aerospike
2. Download 4.4.0.6
3. Install 4.4.0.6: sudo ./asinstall
4. Start the Aerospike server: sudo systemctl start aerospike

Any ideas, please?

My aerospike.conf is below.

service {
        paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
        proto-fd-max 15000
}

logging {
        console {
                context any info
        }
}

network {
        service {
                address any
                port 3000
        }

        heartbeat {
                mode multicast
                multicast-group 239.1.99.222
                port 9918

                # To use unicast-mesh heartbeats, remove the 3 lines above, and see
                # aerospike_mesh.conf for alternative.

                interval 150
                timeout 10
        }

        fabric {
                port 3001
        }

        info {
                port 3003
        }
}

namespace set {
        replication-factor 2
        memory-size 2G
        default-ttl 0 # 0 means records never expire/evict by default.

        storage-engine device {
                file /opt/aerospike/data/set.dat
                filesize 25G
                data-in-memory false # Store data in memory in addition to file.
                write-block-size 128K
        }
}

namespace play {
        replication-factor 2
        memory-size 2G
        default-ttl 0 # 0 means records never expire/evict by default.

        storage-engine device {
                file /opt/aerospike/data/play.dat
                filesize 25G
                data-in-memory false # Store data in memory in addition to file.
                write-block-size 4M
        }
}

Albot · December 15, 2018, 4:06am

I would think that should work. What problem are you having? Are you trying to get old data back by installing an older version of the daemon?

meher · December 17, 2018, 4:23am

With the Community Edition, any restart is a cold restart, which can potentially resurrect deleted data.
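To illustrate why a cold restart can bring deleted records back, here is a minimal toy model in Python. It is only a sketch under simplifying assumptions: the class `ToyNode`, its methods, and the "last copy on the device wins" rebuild rule are all hypothetical and are not Aerospike internals. The one idea it tries to capture is that a non-durable delete removes the in-memory index entry, while the copy on the device remains until that space is reused.

```python
class ToyNode:
    """Toy model of one node: an in-memory index over an append-only device."""

    def __init__(self):
        self.device = []  # persisted (key, value) copies; these survive a restart
        self.index = {}   # key -> position in device; this is lost on restart

    def write(self, key, value):
        self.device.append((key, value))
        self.index[key] = len(self.device) - 1

    def delete(self, key):
        # Non-durable delete: only the index entry is removed. The copy on
        # the device stays until that space is reclaimed.
        self.index.pop(key, None)

    def read(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.device[pos][1]

    def cold_restart(self):
        # Rebuild the index from scratch by scanning the device.
        # Assumption for this sketch: the last copy found for a key wins.
        self.index = {}
        for pos, (key, _value) in enumerate(self.device):
            self.index[key] = pos


node = ToyNode()
node.write("user:1", "alice")
node.delete("user:1")
before = node.read("user:1")  # the delete is visible while the node is up
node.cold_restart()
after = node.read("user:1")   # the scan finds the old copy again
print(before, after)
```

In this model the record is invisible right after the delete and visible again after `cold_restart()`, which is the resurrection effect described above.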

The other thing I can think of is a generation wrap-around while a node was down for the upgrade, causing the older record (which now has the higher generation) to take over when the node comes back (refer to conflict-resolution-policy).
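The wrap-around case can be sketched in a few lines of Python. This is a toy model, not server code: the 16-bit counter width, the `Version` type, the logical `last_update_time` clock, and the helper names `bump` and `resolve` are assumptions made for illustration. Only the two policy names, `generation` and `last-update-time`, are taken from the conflict-resolution-policy setting.

```python
from dataclasses import dataclass

GEN_MODULUS = 1 << 16  # assumption: the generation is a 16-bit counter


@dataclass
class Version:
    value: str
    generation: int
    last_update_time: int  # hypothetical logical clock, larger is newer


def bump(generation):
    """Return the next generation, wrapping past the maximum back to 1."""
    nxt = (generation + 1) % GEN_MODULUS
    return nxt or 1  # this sketch skips 0 after a wrap


def resolve(a, b, policy):
    """Pick the copy that wins under the given conflict resolution policy."""
    if policy == "generation":
        return a if a.generation >= b.generation else b
    if policy == "last-update-time":
        return a if a.last_update_time >= b.last_update_time else b
    raise ValueError(f"unknown policy: {policy}")


# Copy held by the node that was taken down for the upgrade.
stale = Version("old", generation=65535, last_update_time=100)

# The live replica takes two more updates, so its generation wraps around.
fresh = Version("new", generation=bump(bump(stale.generation)), last_update_time=102)

print(resolve(stale, fresh, "generation").value)
print(resolve(stale, fresh, "last-update-time").value)
```

Under these assumptions the stale copy wins when copies are compared by generation, and the fresh copy wins when they are compared by last update time. Comparing by time has its own requirement: clocks across nodes need to be reasonably in sync.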

There may be other edge cases, but they would be a bit less common. Enterprise licensees can provide logs to Aerospike Support for in-depth analysis.

arsis · December 18, 2018, 10:23am

Thanks for the reply.

How can I upgrade Aerospike while keeping the data safe? What is the best way?

Backup and restore?

meher · December 18, 2018, 9:12pm

Backup and restore is definitely one common way. It is probably the most straightforward, but it would of course require pausing write traffic so that the backup is as consistent as possible.
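As a rough sketch, the backup and restore commands for the two namespaces in the posted configuration could be assembled like this. The host address and the backup directory are hypothetical placeholders, and the script only prints the commands rather than running them. It assumes the `asbackup` and `asrestore` tools with their `--host`, `--namespace`, and `--directory` options, one namespace per backup run; check the options against the tools version you have installed.

```python
def backup_command(host, namespace, directory):
    """Build an asbackup invocation for a single namespace."""
    return ["asbackup", "--host", host,
            "--namespace", namespace,
            "--directory", directory]


def restore_command(host, directory):
    """Build an asrestore invocation that reads one backup directory."""
    return ["asrestore", "--host", host, "--directory", directory]


HOST = "127.0.0.1"                      # hypothetical node address
BACKUP_ROOT = "/tmp/aerospike-backup"   # hypothetical backup location
NAMESPACES = ["set", "play"]            # namespaces from the posted config

backup_cmds = [backup_command(HOST, ns, f"{BACKUP_ROOT}/{ns}") for ns in NAMESPACES]
restore_cmds = [restore_command(HOST, f"{BACKUP_ROOT}/{ns}") for ns in NAMESPACES]

# Run the backups before the upgrade (with writes paused) and the
# restores afterwards, into empty storage.
for cmd in backup_cmds + restore_cmds:
    print(" ".join(cmd))
```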

arsis · December 19, 2018, 4:33am

Thanks, Meher, for the quick reply. Is any other solution possible, besides ‘backup and restore’?

meher · December 20, 2018, 12:04am

Well, we are going by the assumption that it is the cold restart that is resurrecting deleted records, but it could be a number of things. The alternative suggestions from my side all involve the Enterprise Edition (to avoid cold restarts, to potentially use XDR to directly migrate to a different cluster, and maybe consider strong consistency / durable delete to fully close the door on any inconsistencies).

arsis · December 20, 2018, 3:11pm

Thanks. Could you please explain the exact meaning of “maybe consider strong consistency / durable delete to fully close the door on any inconsistencies”?

meher · December 21, 2018, 1:53am

Sure. When you referred to ‘some data changed to old data’, it means that you ended up with inconsistent data. The source of the inconsistency can vary, and we just made a guess that it could have been caused by the cold restart. If that is the cause, you can of course decide to empty the storage on a node before restarting it, and then wait for migrations to complete before moving on to the next node.

There are other situations that could cause inconsistencies (split brains) when Aerospike operates in Available Mode. Aerospike Enterprise Edition can be configured to run in Strong Consistency Mode.

Running in strong consistency mode defaults to using durable deletes which would create tombstones and prevent resurrection of deleted records upon cold restart.

Hope this helps. For your case, if the cause of the inconsistent data is the cold restart in the Community Edition, you could consider deleting the storage when restarting a node and waiting for migrations to refill it before moving on to the next node.
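The "wait for migrations" step can be sketched as a small polling helper. Treat this as a sketch under assumptions: it expects the statistics text in the semicolon-separated `name=value` form, and it assumes a counter named `migrate_partitions_remaining` that reaches 0 when migrations are finished. Verify both against your server version. How the text is fetched from a node is left to the caller through the hypothetical `fetch_stats` callable. Note that emptying a node only makes sense in a cluster of at least two nodes with replication-factor 2, because the emptied node is refilled from the other replica.

```python
import time


def parse_statistics(raw):
    """Parse 'name=value;name=value' text into a dict of strings."""
    stats = {}
    for pair in raw.strip().split(";"):
        if "=" in pair:
            name, value = pair.split("=", 1)
            stats[name] = value
    return stats


def migrations_done(raw, stat_name="migrate_partitions_remaining"):
    """Return True when the remaining-partitions counter is zero."""
    stats = parse_statistics(raw)
    if stat_name not in stats:
        raise KeyError(f"statistic not found: {stat_name}")
    return int(stats[stat_name]) == 0


def wait_for_migrations(fetch_stats, poll_seconds=5.0, sleep=time.sleep):
    """Poll until migrations are done; return the number of polls made."""
    polls = 1
    while not migrations_done(fetch_stats()):
        sleep(poll_seconds)
        polls += 1
    return polls


# Demo with canned statistics text instead of a live node.
samples = iter([
    "cluster_size=2;migrate_partitions_remaining=7",
    "cluster_size=2;migrate_partitions_remaining=0",
])
polls = wait_for_migrations(lambda: next(samples), sleep=lambda seconds: None)
print(polls)
```

In a rolling upgrade, this check would run after each node is restarted with empty storage and before the next node is stopped.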

arsis · January 11, 2019, 5:26am

Thanks Meher.

I am using the Community Edition, so I can't use “Strong Consistency Mode”. I am trying to resolve this problem by adding a “deleted” field to the object.

I think the Community Edition should have “Strong Consistency Mode”, because this is not the behavior one expects from a normal database.
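A minimal sketch of that “deleted” field approach, using a plain dictionary as a stand-in for the namespace (the `store` dictionary and the helper names `put`, `soft_delete`, and `get` are hypothetical, not client API calls). The point is that marking a record as deleted is an ordinary write, so it is persisted like any other update, and every reader has to filter on the flag.

```python
store = {}  # hypothetical stand-in for a namespace: key -> dict of bins


def put(key, bins):
    """Write a record and clear any earlier deletion mark."""
    store[key] = dict(bins, deleted=False)


def soft_delete(key):
    """Mark a record as deleted instead of removing it."""
    if key in store:
        store[key]["deleted"] = True


def get(key):
    """Return the record's bins, or None if it is missing or marked deleted."""
    record = store.get(key)
    if record is None or record["deleted"]:
        return None
    return {name: value for name, value in record.items() if name != "deleted"}


put("user:1", {"name": "alice"})
print(get("user:1"))
soft_delete("user:1")
print(get("user:1"))
```

The trade-off is that marked records keep using storage and index space until they are cleaned up in some other way.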

kporter · January 11, 2019, 5:42am

If you can support your data needs on a single node (like a normal database) then you wouldn’t need to trade off either consistency or availability.

arsis · January 11, 2019, 9:16am

Hi kporter.

I don't agree with you that a normal database only works fine on a single node. Most databases (community editions) provide a cluster mode, and they don't have this issue.
