Hi,
I am planning on upgrading several clusters from 4.5.0.5 to 4.8.0.2. I am taking a look at this document https://www.aerospike.com/docs/operations/upgrade/aerospike/special_upgrades/4.5.1/ and have a couple of quick questions.
- According to the doc:
This could result in an imbalance of master/replica objects in such clusters.
It is not clear to me what impact this has on a cluster if there is an imbalance. I do not intend to have a mixed cluster for long periods of time (let’s say 2 days max). What gets affected due to this imbalance?
- Does the imbalance self-heal over time? If not, how does one rectify this non-ideal state?
- Is there a way to pause expirations/evictions while doing the maintenance? I see there is set-disable-eviction, but not one for expirations.
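For reference, set-disable-eviction is a per-set configuration knob that can be applied dynamically via asinfo. A sketch, assuming placeholder namespace/set names ("myns", "myset") — the exact parameter syntax should be confirmed against the docs for your server version:

```shell
# Dynamically disable eviction for one set (placeholder names: namespace
# "myns", set "myset"); run on each node, and verify the syntax for your
# version before use.
asinfo -v "set-config:context=namespace;id=myns;set=myset;set-disable-eviction=true"
```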
The way we do evictions and expirations changed between these versions. If your cluster doesn’t normally evict or expire records, then you don’t really have anything to worry about.
If your cluster does evictions or expirations regularly, then there will be a period during the upgrade where some of the evictions aren’t taking place, causing an imbalance that will resolve itself as more nodes are upgraded.
While in a mixed state, the new nodes will be able to evict or expire replicas from other new nodes, and old nodes will be able to do the same with other old nodes. The problem is that old and new nodes cannot evict across the version divide. This causes replica objects to accumulate, which could lead to increased master eviction.
Is this an Enterprise or Community cluster? On Enterprise, upgrades go much faster with the warm-start (aka Fast Restart) feature (though if you are using secondary indexes, building secondary indexes currently adds a significant amount of time to startup).
Hi @kporter, thanks for the quick response!
If your cluster doesn’t normally evict or expire records, then you don’t really have anything to worry about.
We have 2 namespaces.
One that regularly expires (usually in bursts).
The other namespace has constant evictions due to the high volume of writes hitting the high water mark. However, this namespace isn’t relied on (things are writing to it, but we’re not quite ready to start reading from it yet).
that will resolve itself as more nodes are upgraded.
Say we have an imbalance. Once the old nodes are upgraded, can I expect the expired replica objects that missed the initial purge due to the version mismatch to be purged?
This causes replica objects to accumulate which could lead to increased master eviction.
Basically if I have enough head room in terms of free memory that we won’t hit the high water mark even if expired objects aren’t purged, I should be good?
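That headroom reasoning can be turned into a rough back-of-the-envelope check. A sketch only — the function and its inputs (namespace memory size, current usage, high-water-mark percentage, and an estimated accumulation rate for un-purged replicas) are illustrative, not Aerospike stat names:

```python
# Rough headroom check: will accumulated (un-purged) expired replicas
# push memory usage past the high-water mark during the mixed-version
# window? All inputs are estimates supplied by the operator.

def headroom_ok(memory_size_bytes, used_bytes, hwm_pct,
                accumulation_bytes_per_hour, window_hours):
    """Return True if projected usage stays below the high-water mark."""
    hwm_bytes = memory_size_bytes * hwm_pct / 100.0
    projected = used_bytes + accumulation_bytes_per_hour * window_hours
    return projected < hwm_bytes

# Example: 64 GiB namespace, 60% HWM (38.4 GiB), 30 GiB currently used,
# ~0.5 GiB/hour of replicas not being purged, 48-hour mixed window.
GiB = 1024 ** 3
print(headroom_ok(64 * GiB, 30 * GiB, 60, 0.5 * GiB, 48))  # → False (54 GiB > 38.4 GiB)
```

If the check comes back False, the mixed-version window needs to be shorter, or usage needs to start further below the high-water mark.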
Is this an Enterprise or Community cluster?
Enterprise
Yes, the new system purges them by sharing a timestamp; the old system did so by issuing delete transactions.