Sorry for “spamming” your forum, but I’m running into a lot of “fun” things at the moment.
Okay, so I have 2 nodes, both upgraded to 3.5.12, and while migrating they just stalled.
Node 1 has reached its high-water mark and the IO is at 100%, but the migration was still going, and because of yesterday’s crash there are migrations going to and from both nodes.
I decided to push it a little more, so I started our collector program, which does both reads and writes.
After half an hour the amount of used memory started to grow while the migration stopped, and the Aerospike client stopped reading and writing.
I stopped our program in case we were stressing it too much, and gave the servers a couple of hours to calm down. But they haven’t!
The Aerospike service still uses 77% memory. The migrations are still not progressing.
I can’t figure out what the service is doing or waiting for. I can’t see any option other than restarting one of the servers to see if they will start processing again.
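In case it helps with diagnosing this, here is a minimal sketch of how the stall could be confirmed from two snapshots of a node’s statistics taken a few minutes apart. It assumes the statistics arrive as a semicolon-separated `key=value` string (the shape I would expect from `asinfo -v statistics`), and that the migration counters are named `migrate_progress_send` and `migrate_progress_recv`; those names are my assumption for 3.5.x, and the sample values are made up for illustration, not taken from our cluster.

```python
# Sketch: detect a stalled migration from two statistics snapshots.
# The counter names below are assumed, and the sample strings are hypothetical.

MIGRATE_KEYS = ("migrate_progress_send", "migrate_progress_recv")


def parse_info_stats(raw):
    """Turn 'k1=v1;k2=v2' into a dict of strings, skipping malformed pairs."""
    stats = {}
    for pair in raw.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            stats[key] = value
    return stats


def migration_stalled(before, after, keys=MIGRATE_KEYS):
    """True if migrations are still pending and no counter moved between snapshots."""
    pending = any(int(after.get(k, 0)) > 0 for k in keys)
    unchanged = all(before.get(k) == after.get(k) for k in keys)
    return pending and unchanged


# Hypothetical snapshots, taken some minutes apart on the same node.
snap_1 = parse_info_stats("migrate_progress_send=812;migrate_progress_recv=640;free-pct-memory=23")
snap_2 = parse_info_stats("migrate_progress_send=812;migrate_progress_recv=640;free-pct-memory=23")

print("stalled" if migration_stalled(snap_1, snap_2) else "progressing or finished")
```

If the counters stay identical and non-zero across several snapshots, that would match what we see on the dashboard: migrations outstanding but not moving.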
Below I’ve added two screenshots, from the Aerospike dashboard and our server monitor (NewRelic), to show a snapshot of what we have been seeing over the last couple of hours.