Migrations and consistency

vishal14101993 · December 10, 2018, 11:52am

I’m working on a 2 node cluster with strong consistency enabled on both nodes. When a node that was down for some time is brought back up, I can see that some migrations take place. I use the following command:

asadm -e "asinfo -v 'namespace/test_namespace' -l" | grep partition

Output (on my 2 node cluster):

    dead_partitions=0
    unavailable_partitions=0
    migrate_tx_partitions_imbalance=0
    migrate_tx_partitions_active=0
    migrate_rx_partitions_active=0
    migrate_tx_partitions_initial=4096
    migrate_tx_partitions_remaining=4095
    migrate_tx_partitions_lead_remaining=2012
    migrate_rx_partitions_initial=4096
    migrate_rx_partitions_remaining=4095
    partition-tree-sprigs=256
    sindex.num-partitions=32
    dead_partitions=0
    unavailable_partitions=0
    migrate_tx_partitions_imbalance=0
    migrate_tx_partitions_active=0
    migrate_rx_partitions_active=0
    migrate_tx_partitions_initial=4096
    migrate_tx_partitions_remaining=4095
    migrate_tx_partitions_lead_remaining=2082
    migrate_rx_partitions_initial=4096
    migrate_rx_partitions_remaining=4095
    partition-tree-sprigs=256
    sindex.num-partitions=32

The number of unavailable partitions drops to 0 once the 2nd node is brought up, but some migrations are still remaining (both tx and rx). Until the remaining migration counts reach 0, is the cluster state stable for reads and writes? If not, when should I worry about these counts (if at all), and which counts specifically?

I also couldn’t understand the difference between migrate_tx_partitions_remaining and migrate_tx_partitions_lead_remaining. After reading the descriptions, I expected the two counts to be equal, since both nodes are present in the roster. I’d really appreciate it if someone could clarify.

kporter · December 10, 2018, 8:29pm

While SC will work with a 2 node cluster, I strongly suggest using more nodes than your replication-factor. With only replication-factor nodes, the cluster becomes unavailable any time you take down a single node, so routine maintenance events, such as upgrades, become zero-availability events.

If a partition is available, then reads and writes can be made and will be consistent; if a partition is unavailable, then your client will get an unavailable result code. Your application shouldn’t need to be concerned with these stats. However, your monitoring environment should incorporate them.
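For the monitoring side, here is a minimal Python sketch of how the `key=value` output from the command above could be parsed and checked. It assumes the per-node blocks arrive concatenated, as in the post, and that a repeated key marks the start of the next node's block. The function names and the two summary flags (`all_partitions_available`, `migrations_done`) are made up for illustration and are not part of any Aerospike tool.

```python
# Sample text copied from the output in the question (shortened to the
# stats used below). In practice this would come from the asadm command.
SAMPLE = """
dead_partitions=0
unavailable_partitions=0
migrate_tx_partitions_remaining=4095
migrate_tx_partitions_lead_remaining=2012
migrate_rx_partitions_remaining=4095
dead_partitions=0
unavailable_partitions=0
migrate_tx_partitions_remaining=4095
migrate_tx_partitions_lead_remaining=2082
migrate_rx_partitions_remaining=4095
"""


def parse_partition_stats(text):
    """Split concatenated per-node key=value lines into one dict per node."""
    nodes, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key in current:
            # Assumption: a repeated key means the next node's block began.
            nodes.append(current)
            current = {}
        current[key] = int(value) if value.isdigit() else value
    if current:
        nodes.append(current)
    return nodes


def summarize(node):
    """Hypothetical summary: availability and migration progress for one node."""
    return {
        "all_partitions_available": node["unavailable_partitions"] == 0
        and node["dead_partitions"] == 0,
        "migrations_done": node["migrate_tx_partitions_remaining"] == 0
        and node["migrate_rx_partitions_remaining"] == 0,
    }


nodes = parse_partition_stats(SAMPLE)
for index, node in enumerate(nodes, start=1):
    print(f"node {index}: {summarize(node)}")
```

With the numbers from the question, both nodes would report all partitions available while migrations are still in progress, which matches the situation described above.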

The migrate_tx_partitions_lead_remaining migrations are a subset of migrate_tx_partitions_remaining. The lead migrations are not delayed by the migrate-fill-delay configuration. This is a separate topic; if you are interested, read up on migrate-fill-delay.
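Because the lead count is a subset of the total, the difference between the two stats gives the remaining non-lead migrations. A small Python sketch of that arithmetic, using the numbers from the question; `non_lead_remaining` is a hypothetical helper name, and reading the remainder as "migrations that may be subject to migrate-fill-delay" is an inference from the reply rather than something a stat reports directly.

```python
def non_lead_remaining(tx_remaining, tx_lead_remaining):
    """Remaining tx migrations that are not lead migrations.

    Relies on the lead count being a subset of the total, so the lead
    count can never exceed the total.
    """
    if tx_lead_remaining > tx_remaining:
        raise ValueError("lead count cannot exceed the total remaining count")
    return tx_remaining - tx_lead_remaining


# (migrate_tx_partitions_remaining, migrate_tx_partitions_lead_remaining)
# for the two nodes in the output above.
for index, (remaining, lead) in enumerate([(4095, 2012), (4095, 2082)], start=1):
    print(f"node {index}: lead={lead}, non-lead={non_lead_remaining(remaining, lead)}")
# node 1: lead=2012, non-lead=2083
# node 2: lead=2082, non-lead=2013
```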

