How long does a migration take?


#1

by young » Sun Nov 11, 2012 11:17 pm

How long does a migration take?

Migrations do not take a set amount of time. You can configure the amount of resources dedicated to the migration. Aerospike has default settings that will ensure that if a migration occurs, it will not impact the performance of the cluster.

You cannot accurately predict when a migration will complete, because it depends on several variables: load on the system, cluster size, migration tables, etc. However, you can measure how much migration occurs over a period of time and extrapolate an estimate from that, as long as the cluster state does not change and no nodes are restarted (either of which would reset everything).

Migrations can be controlled with the following config parameters: “migrate-xmit-hwm” and “migrate-xmit-lwm”. There is also a static parameter, “migrate-threads”, which we recommend setting to 1; changing it requires a node restart.

These configuration settings are in the main Aerospike configuration file “/etc/citrusleaf/citrusleaf.conf” in the “service” area.

service {
    ...
    migrate-xmit-hwm 6
    migrate-xmit-lwm 1
    migrate-threads 1
    ...
}

These parameters interact as follows: whenever the queue grows above migrate-xmit-hwm, the node stops migrating data; once the queue drops below migrate-xmit-lwm, it starts migrating data again.
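As a rough illustration of that stop/start rule, here is a minimal Python sketch. It is not Aerospike's implementation; the function name `migration_gate`, the starting state, and the sample queue depths are all made up for the example. It only models the rule as described: pause above the high-water mark, resume below the low-water mark, and otherwise keep the previous state.

```python
def migration_gate(queue_depths, hwm=6, lwm=1):
    """Return (depth, migrating) pairs for a series of observed queue depths.

    Hypothetical model of the rule described above, not Aerospike code:
    migration pauses once the depth exceeds hwm and resumes only after
    the depth falls below lwm. Between the two marks the previous state
    is kept.
    """
    migrating = True  # assumption: the node starts out migrating
    states = []
    for depth in queue_depths:
        if depth > hwm:
            migrating = False   # queue too deep: stop migrating
        elif depth < lwm:
            migrating = True    # queue drained: start migrating again
        states.append((depth, migrating))
    return states


if __name__ == "__main__":
    # Made-up queue depths, using the hwm/lwm values from the config above.
    for depth, migrating in migration_gate([0, 3, 7, 4, 1, 0, 2]):
        print(depth, "migrating" if migrating else "paused")
```

With the values from the config snippet (hwm 6, lwm 1), a depth of 4 or 1 seen after a pause does not resume migration in this model; only a depth of 0 does.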

Times will vary depending on a few different factors, including the network and amount of data per node. For typical loads, expect a migration to take one to several hours.


#2

kporter,

I have two questions:

  1. “citrusleaf.conf” is the old configuration file. Are these configuration settings suitable for the new configuration file, “aerospike.conf”?
  2. What is a typical load?

Thanks!


#3

Yes, these settings work for Aerospike as well. You can find a full list by entering ‘migrate’ into the search box on our configuration reference page. The parameters listed here are very old recommendations and were too conservative; the new defaults are higher and can be found on the configuration reference page.

I’m not sure how to quantify this. Currently, estimating the amount of time remaining for migrations to complete isn’t an exact science; hopefully we will be able to improve this situation in the near future. Presently, you can collect the number of partition migrations scheduled to be sent from each node with:

asadm -e "show stat like migrate_progress_send"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                 :   debian.local:301   debian.local:302   debian.local:303   
migrate_progress_send:   1919               1875               0         

You can then sum this across the cluster: 1919 + 1875 + 0 = 3794. Record that value and run the command again after a few minutes.

asadm -e "show stat like migrate_progress_send"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                 :   debian.local:301   debian.local:302   debian.local:303   
migrate_progress_send:   332                266                1  

332 + 266 + 1 = 599. Assuming these were collected 60 min apart: (3794 - 599) / 3600 s = 0.8875 partitions per second.
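The same arithmetic can be scripted. The following is only a sketch: the helper names (`total_pending`, `migration_rate`, `eta_seconds`) are made up for this example and are not part of asadm, and the parser assumes the output keeps the layout shown above (stat name, a colon, then one value per node). The time-remaining figure additionally assumes the rate stays constant.

```python
def total_pending(asadm_output, stat="migrate_progress_send"):
    """Sum the per-node values on the row for `stat` in asadm text output."""
    for line in asadm_output.splitlines():
        name, sep, values = line.partition(":")
        if sep and name.strip() == stat:
            return sum(int(v) for v in values.split())
    raise ValueError(f"{stat} not found in output")


def migration_rate(first_total, second_total, elapsed_seconds):
    """Partitions migrated per second between two samples."""
    return (first_total - second_total) / elapsed_seconds


def eta_seconds(remaining, rate):
    """Seconds left at a constant rate, or None if nothing is progressing."""
    if rate <= 0:
        return None
    return remaining / rate


if __name__ == "__main__":
    # The two samples from this post, pasted in as text.
    first = """
NODE                 :   debian.local:301   debian.local:302   debian.local:303
migrate_progress_send:   1919               1875               0
"""
    second = """
NODE                 :   debian.local:301   debian.local:302   debian.local:303
migrate_progress_send:   332                266                1
"""
    before = total_pending(first)    # 3794
    after = total_pending(second)    # 599
    rate = migration_rate(before, after, 60 * 60)  # samples taken 60 min apart
    print(f"rate: {rate:.4f} partitions/s")
    print(f"estimated time remaining: {eta_seconds(after, rate):.0f} s")
```

For the numbers above this gives 0.8875 partitions per second, and about 675 seconds for the remaining 599 partitions if that rate were to hold.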

This will not be completely accurate, because some partition migrations depend on other partition migrations completing before they are scheduled. Notice how my node 303 first showed 0 and then 1: it scheduled a partition to be migrated later on. Another problem with this method is that some of the initial migrations are very short, which will make your rate overly optimistic.


#4

Thanks for your explanation! :smile: