Managing Migrations



Question:

How do I temporarily tune migrations?

Answer:

Migrations do not take a set amount of time. You can configure the amount of resources dedicated to the migration. If you use the default configurations for common use cases, you should typically see a minimal impact on performance during migrations.

Refer to the tuning migrations documentation page for further details.

Review this knowledge base article for details on monitoring migrations: FAQ - Monitoring migrations on a live Aerospike cluster

For server version 3.7.5 and above:

You can use the following configurations to tune migrations: migrate-threads, migrate-sleep, migrate-max-num-incoming, the channel bulk settings, and migrate-order.

  1. migrate-threads You can modify the number of threads that perform migrations. Use the following command to dynamically configure migrate-threads for all nodes in the cluster:

         asadm -e 'asinfo -v "set-config:context=service;migrate-threads=<number of threads>"'
    

    Default value for migrate-threads is 1. https://www.aerospike.com/docs/reference/configuration#migrate-threads

  2. migrate-sleep You can modify the migrate-sleep configuration to change the time (in microseconds) that migrations sleep after each record migration. Setting this to 0 will completely remove this throttling. Use the following command to dynamically configure migrate-sleep for all nodes in the cluster:

         asadm -e 'asinfo -v "set-config:context=namespace;id=<namespace>;migrate-sleep=<sleep time in microseconds>"'
    

    Default value for migrate-sleep is 1.

https://www.aerospike.com/docs/reference/configuration#migrate-sleep

  3. migrate-max-num-incoming You can modify the maximum number of partitions that a node can have as incoming. The default value is 4. In versions earlier than 3.10, the default was 256. Use the following command to dynamically configure migrate-max-num-incoming for all nodes in the cluster:

         asadm -e 'asinfo -v "set-config:context=service;migrate-max-num-incoming=<max number of incoming partitions allowed>"'
    

https://www.aerospike.com/docs/reference/configuration#migrate-max-num-incoming

  4. Channel bulk settings For server version 3.11 and above, you can change the number of bulk channel sockets and the number of threads that process intra-cluster messages.

https://www.aerospike.com/docs/reference/configuration#channel-bulk-recv-threads

https://www.aerospike.com/docs/reference/configuration#channel-bulk-fds

  5. migrate-order You can use the migrate-order configuration to prioritize migrations by namespace. migrate-order is a value between 1 and 10 that determines the order in which namespaces are migrated: the namespace with migrate-order 1 is processed first, and the namespace with migrate-order 10 is processed last. This is helpful when migrations for one namespace need to take priority over those of another. For example, if you use non-persisted in-memory namespaces, you could prioritize their migrations so that a rolling upgrade can proceed without waiting for migrations on all namespaces to complete (provided the workload applied during the migrations permits this for your use case).

Use the following command to dynamically configure migrate-order for all nodes in the cluster:

        asadm -e 'asinfo -v "set-config:context=namespace;id=<namespace>;migrate-order=<order value>"'
        
Default value for migrate-order is 5.

https://www.aerospike.com/docs/reference/configuration#migrate-order
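
The following is a minimal sketch of the tune-then-revert workflow for the settings listed above. It only prints the asadm commands (in the same form as shown in this article) so they can be reviewed before being run; nothing is sent to the cluster. The helper names service_cmd and namespace_cmd, the namespace name test, and the value 4 for migrate-threads are illustrative assumptions, not recommendations.

```shell
#!/bin/sh
# Dry-run helpers: they only print the asadm commands so the commands can be
# reviewed before being pasted into a terminal. Nothing is sent to the cluster.

# Print the cluster-wide command for a service-context parameter.
service_cmd() {
    # $1 = parameter name, $2 = value
    printf "asadm -e 'asinfo -v \"set-config:context=service;%s=%s\"'\n" "$1" "$2"
}

# Print the cluster-wide command for a namespace-context parameter.
namespace_cmd() {
    # $1 = namespace, $2 = parameter name, $3 = value
    printf "asadm -e 'asinfo -v \"set-config:context=namespace;id=%s;%s=%s\"'\n" "$1" "$2" "$3"
}

NS=test   # hypothetical namespace name; replace with your own

echo "# Speed up migrations (example values, not recommendations):"
service_cmd migrate-threads 4
namespace_cmd "$NS" migrate-sleep 0

echo "# Revert to the documented defaults once migrations complete:"
service_cmd migrate-threads 1
namespace_cmd "$NS" migrate-sleep 1
```

Keeping the revert commands next to the tuning commands makes it harder to forget to restore the defaults (1 for migrate-threads, 1 for migrate-sleep) after migrations finish.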

For version 3.7.4 and earlier:

Migrations can be controlled by the following configuration parameters: migrate-xmit-hwm, migrate-xmit-lwm, and migrate-threads.

These parameters are in the service context of the configuration file.

  1. Water-marks For configurations with a single migrate thread, try increasing migrate-xmit-hwm and migrate-xmit-lwm in small increments. Going above 60 and 20 respectively is typically not recommended.

    To change these in 3.x use:

        asmonitor -e 'asinfo -v "set-config:context=service;migrate-xmit-hwm=<new_value>"'
        asmonitor -e 'asinfo -v "set-config:context=service;migrate-xmit-lwm=<new_value>"'
    

    These parameters interact as follows: whenever the queue is greater than migrate-xmit-hwm, the node stops migrating data; once the queue goes below migrate-xmit-lwm, it starts migrating data again.

    For 2.x versions, use clmonitor instead of asmonitor.

  2. Priority The following settings control the throttling of migrations. Setting them to zero disables throttling.

    To change these in 3.x use:

        asmonitor -e 'asinfo -v "set-config:context=service;migrate-read-priority=<new_value>"'
        asmonitor -e 'asinfo -v "set-config:context=service;migrate-xmit-priority=<new_value>"'
    

    migrate-read-priority: Controls the disk I/O throttle for data migration. The value is the number of records to read before sleeping for ‘migrate-read-sleep’ milliseconds. Setting this to zero disables this throttle.

    migrate-xmit-priority: The number of records to ship before sleeping for ‘migrate-xmit-sleep’ milliseconds. Setting this to zero disables this throttle.

    For 2.x, use clmonitor instead of asmonitor.

  3. Migration Threads On versions 3.2.0 and later (and 2.7.9 and later in the 2.x line), you can dynamically increase or decrease the number of migration threads. On earlier versions, the commands below return an error and change nothing, so the configuration must be changed explicitly instead.

    On 3.2.0+:

        asinfo -v "set-config:context=service;migrate-threads=N"
    

    To set this across the cluster, you can use asmonitor. Note: this also affects 2.7.9+ nodes in the cluster; nodes on earlier versions return an error and ignore the configuration change.

        asmonitor -e 'asinfo -v "set-config:context=service;migrate-threads=N"'
    

    For 2.x, use clmonitor instead of asmonitor and clinfo instead of asinfo.
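
The water-mark interaction described in item 1 can be pictured as a simple on/off switch with two thresholds. The sketch below is a simplified model of that description, not the server's implementation. The function name next_state and the sample queue depths are hypothetical, and keeping the previous state while the queue sits between the two water-marks is an assumption the article does not state explicitly.

```shell
#!/bin/sh
# Simplified model of the high/low water-mark behaviour (illustration only).
# next_state <queue depth> <hwm> <lwm> <current state: on|off>
next_state() {
    if [ "$1" -gt "$2" ]; then
        echo off            # queue above migrate-xmit-hwm: stop migrating
    elif [ "$1" -lt "$3" ]; then
        echo on             # queue below migrate-xmit-lwm: start migrating
    else
        echo "$4"           # assumption: between the marks, keep the state
    fi
}

state=on
# Hypothetical queue depths, evaluated with hwm=60 and lwm=20.
for depth in 5 25 65 40 15; do
    state=$(next_state "$depth" 60 20 "$state")
    echo "queue=$depth migrating=$state"
done
```

In this model, migration switches off at depth 65, stays off at 40, and resumes only at 15, which is why raising both water-marks lets migrations keep running under a deeper queue.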

Note:

  • Increasing the migration rate may add latency to regular read and write transactions. Ideally, this is reflected in the server-side trends, but in some cases the latency change is observed only on the client. Make sure to revert to the default values once migrations are complete.
  • You cannot accurately predict when a migration will complete, because of variables such as load on the system, cluster size, and migration tables. You can, however, measure how much migration occurs over a period of time and extrapolate an estimate, as long as the cluster state does not change and nodes are not restarted (either of which resets everything). For server version 3.7.+, the server logs display the percentage of migrations completed on a per-namespace basis.
  • Times vary depending on several factors, including the network and the amount of data per node. For typical loads, expect a migration to take one to several hours.
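
The extrapolation mentioned in the notes above can be sketched with two samples of the remaining migration work taken a known interval apart. This is a rough linear estimate under the assumption that the migration rate stays constant. The function name and the sample numbers are hypothetical, and where the "remaining" count comes from (logs or statistics) depends on the server version, so it is treated as plain input here.

```shell
#!/bin/sh
# Linear estimate of remaining migration time (illustration only).
# estimate_remaining_seconds <remaining at first sample> \
#                            <remaining at second sample> \
#                            <seconds between the two samples>
estimate_remaining_seconds() {
    done_in_window=$(( $1 - $2 ))
    if [ "$done_in_window" -le 0 ]; then
        # No measurable progress (or the cluster state changed and the
        # counters were reset), so no estimate is possible.
        echo unknown
        return 0
    fi
    # remaining work divided by the observed rate (work per second)
    echo $(( $2 * $3 / done_in_window ))
}

# Hypothetical samples: 4000 partitions remaining, then 3400 ten minutes later.
estimate_remaining_seconds 4000 3400 600
```

With these sample numbers, 600 partitions completed in 600 seconds, so the 3400 remaining partitions give an estimate of 3400 seconds. Any cluster change or node restart invalidates the estimate, so take fresh samples afterwards.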

Keywords

migrations tuning speed slow

Timestamp

11/10/2017

