Strange behavior during migration


#1

Hi again,

Sorry for “spamming” your forum, but I’m experiencing a lot of “fun” things at the moment.

Okay, so I have 2 nodes, both upgraded to 3.5.12, and during migration they just stalled.

Node 1 has reached its high-water mark and its IO is running at 100%, but the migration was still going, and because of yesterday’s crash there are migrations going to and from both nodes.

I decided to push it a little more, so I started our collector program, which does both reads and writes.

After half an hour the amount of used memory started to grow while the migration stopped, and the Aerospike client stopped reading and writing.

I stopped our program to see if we had stressed it too much, and gave the servers a couple of hours to calm down. But it didn’t help!

The Aerospike service still uses 77% memory. The migrations are still not processing.

I can’t seem to figure out what the service is doing or waiting for. I can’t see any option other than restarting one of the servers to see if they will start processing again.

I’ve added below two screendumps of the Aerospike dashboard and our server monitor (New Relic) to show you a snapshot of what we have been seeing for the last couple of hours.


#2

Have you seen any change in the “Migrations Outgoing” in AMC?


#3

At first, yes, there were changes in Migrations Outgoing on both servers.

One of the servers counted up, the other down.

After some time it stopped, and there were no changes after that.
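The counters AMC displays come from the server’s info statistics, so one way to watch them directly is to poll `asinfo -v statistics` and pick out the migrate counters from its semicolon-delimited output. A minimal sketch; the sample string and stat names (e.g. `migrate_progress_send`) are illustrative and vary by server version:

```python
# Sketch: extract migration counters from Aerospike's info protocol output.
# `asinfo -v statistics` returns a single line of "key=value;key=value;..."
# pairs. The sample below is illustrative, not captured from a real node.

def parse_info(line: str) -> dict:
    """Split a semicolon-delimited info line into a key -> value dict."""
    return dict(pair.split("=", 1) for pair in line.split(";") if "=" in pair)

sample = "cluster_size=2;migrate_progress_send=1375;migrate_progress_recv=0;free-pct-memory=23"

stats = parse_info(sample)
migrating = {k: v for k, v in stats.items() if k.startswith("migrate")}
print(migrating)
```

Polling this periodically and comparing snapshots shows whether the send/recv counters are still moving; if they stay constant between polls, migrations have stalled.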


#4

Lars,

The 2 nodes in the Aerospike cluster seem to have very different RAM and storage capacities. Aerospike recommends identical configurations for all nodes in a cluster.

Node 1: IP .88 has ~22GB RAM
Node 2: IP .82 has ~54GB RAM

Node 1 has hit the high-water mark and has been evicting objects, whereas node 2 has a lot of spare capacity remaining (both RAM and storage).

Is it possible to share your configuration files (aerospike.conf) from both nodes?

-samir


#5

Hi Samir,

You are correct that the two servers have different specs. The reason is that the plan was to shut down the old server once all data had been migrated to the new one. According to the documentation, the two nodes should be synced at the max usage of the smallest node (leaving a lot of unused resources on the largest node). So when we shut down the old server, the unused resources on the large node should be freed for use.

Old server config:

# Aerospike database configuration file.

# This stanza must come first.
service {
        user root
        group root
        paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
        pidfile /var/run/aerospike/asd.pid
        service-threads 4
        transaction-queues 4
        transaction-threads-per-queue 4
        proto-fd-max 15000
}

logging {
        # Log file must be an absolute path.
        file /var/log/aerospike/aerospike.log {
                context any info
        }
}

network {
        service {
                address any
                port 3000
        }

        heartbeat {
                mode multicast
                address 239.1.99.222
                port 9918

                # To use unicast-mesh heartbeats, comment out the 3 lines above and
                # use the following 4 lines instead.
#               mode mesh
#               port 3002
#               mesh-address 10.1.1.1
#               mesh-port 3002

                interval 150
                timeout 10
        }

        fabric {
                port 3001
        }

        info {
                port 3003
        }
}

namespace audience {
        replication-factor 2
        memory-size 24G
        default-ttl 0 # forever

        storage-engine device {
                file /opt/aerospike/data/audience.dat
                filesize 164G
#               data-in-memory true
        }
}

New server config:

# Aerospike database configuration file.

service {
        user root
        group root
        paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
        pidfile /var/run/aerospike/asd.pid
        service-threads 4
        transaction-queues 4
        transaction-threads-per-queue 4
        proto-fd-max 15000
}

logging {
        # Log file must be an absolute path.
        file /var/log/aerospike/aerospike.log {
                context any info
        }
}

network {
        service {
                address any
                port 3000
        }

        heartbeat {
                mode multicast
                address 239.1.99.222
                port 9918

                # To use unicast-mesh heartbeats, remove the 3 lines above, and see
                # aerospike_mesh.conf for alternative.

                interval 150
                timeout 10
        }

        fabric {
                port 3001
        }

        info {
                port 3003
        }
}

namespace audience {
        replication-factor 2
        memory-size 56G
        default-ttl 0 # never expire/evict.

        storage-engine device {
                device /dev/sdb
                data-in-memory false # Do not also keep a copy of the data in memory.
        }
}

#6

Hi guys,

Any thoughts on this?


#7

Hi Lars,

Here is how I understand your setup.

Node 1: 192.168.30.188

Disk: 42GB consumed, 120GB free
Memory: 14.4GB consumed, 9.6GB free
Replicated objects: 176M
Master objects: 86.3M
Replica objects: 31K

Node 2: 192.168.30.182

Disk: 11GB consumed, 538GB free
Memory: 3.8GB consumed, 52.2GB free
Replicated objects: 49M
Master objects: 23.1M
Replica objects: 3.8M

Total:

Disk: 54GB used
Memory: 18GB used
Objects: 109.5M master, 3.8M replica

Each node in steady state would have:

~109M objects, roughly 50% master and 50% replica
Memory required: ~14-16GB
Disk required: ~45-50GB
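The per-node estimate can be reproduced with back-of-envelope arithmetic: take the per-copy memory and disk cost observed right now, then scale it up to the number of record copies each node will hold once replication-factor 2 is fully in effect. A sketch using the totals quoted above; it is a crude linear scaling, so it lands slightly above the quoted ranges:

```python
# Back-of-envelope steady-state projection for a 2-node cluster, replication-factor 2.
# All input figures are the observed cluster totals quoted in this post.

GB = 1024 ** 3

master_objects = 109.5e6            # master records across the cluster
record_copies = master_objects * 2  # replication-factor 2 -> two copies of each record
current_copies = 113.2e6            # copies held right now (masters + the few replicas)

memory_used = 18 * GB               # memory currently consumed across both nodes
disk_used = 54 * GB                 # disk currently consumed across both nodes

# Scale the observed per-copy cost up to the copies each node holds in steady state.
per_node_copies = record_copies / 2
per_node_mem_gb = per_node_copies * (memory_used / current_copies) / GB
per_node_disk_gb = per_node_copies * (disk_used / current_copies) / GB

print(f"~{per_node_copies / 1e6:.0f}M record copies per node")
print(f"~{per_node_mem_gb:.0f}GB memory per node")
print(f"~{per_node_disk_gb:.0f}GB disk per node")
```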

Node 1 situation: 24GB RAM with a 60% high-water mark, i.e. the HWM is 14.4GB. It has started evicting objects, migrations incoming/outgoing are on, and reads/writes are on as well. The image doesn’t show the avail-pct numbers.

My gut feeling is that the defrag pace couldn’t keep up and you might have hit min-avail-pct (http://www.aerospike.com/docs/reference/configuration/#min-avail-pct).

Possible solutions are:

  1. If you have additional RAM on node 1, you could raise the namespace memory-size above 24G; this should get you out of the HWM situation.
  2. Bump up the defrag speed to free up available blocks so that the migrations can resume (http://www.aerospike.com/docs/reference/configuration/#defrag-sleep). If avail-pct has dropped close to 0, then you must also do step 1 above in order to get some extra room.
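Both changes can be applied at runtime with `asinfo` set-config commands, without a restart. A sketch assuming the namespace name `audience` from the configs above; the values are illustrative, and the node must actually have the extra RAM before you raise memory-size:

```shell
# Raise the namespace memory limit to get out of the HWM/eviction situation
# (illustrative value; check free RAM on the node first).
asinfo -v 'set-config:context=namespace;id=audience;memory-size=32G'

# Speed up defrag by shortening the sleep between defrag reads (microseconds).
asinfo -v 'set-config:context=namespace;id=audience;defrag-sleep=100'

# Watch the contiguous free device space; migrations can resume once it
# climbs back above min-avail-pct (the exact stat name varies by version).
asinfo -v 'namespace/audience' | tr ';' '\n' | grep -i avail
```

Note that set-config changes are runtime-only; mirror anything you keep in aerospike.conf so it survives a restart.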

Let me know if this helps.

-samir