Restart without Migrations


#1

I want quick restarts so that only Aerospike config changes take effect, without any migrations being issued. Since my replication factor is 2, I should be fine bringing down one node out of 5.

The command below makes migrations sleep for 4 hours between each migration.

  1. asadm -e 'asinfo -v "set-config:context=namespace;id=;migrate-sleep=14400000000"'

  2. <make config changes >

  3. service aerospike restart

I shouldn’t see any migrations, and the node should rejoin the cluster.

  4. Now I can set migrate-sleep back to 10.
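
As a rough sketch of the arithmetic behind the steps above: the value 14400000000 equals 4 hours only if migrate-sleep is counted in microseconds, which is the assumption made here. The helper names and the `<namespace>` placeholder are hypothetical; the thread leaves the namespace id blank, so it is not filled in.

```python
# Assumption: migrate-sleep is expressed in microseconds.
MICROSECONDS_PER_HOUR = 60 * 60 * 1_000_000


def hours_to_migrate_sleep(hours):
    """Convert hours to a migrate-sleep value (microseconds assumed)."""
    return int(hours * MICROSECONDS_PER_HOUR)


def migrate_sleep_command(namespace, sleep_value):
    """Build the asinfo set-config string from step 1 for one namespace."""
    return (
        'asinfo -v "set-config:context=namespace;'
        f'id={namespace};migrate-sleep={sleep_value}"'
    )


NAMESPACE = "<namespace>"  # hypothetical placeholder, not a real namespace name

pause = migrate_sleep_command(NAMESPACE, hours_to_migrate_sleep(4))
restore = migrate_sleep_command(NAMESPACE, 10)

print(pause)    # command issued before the restart
print(restore)  # command issued after the node has rejoined
```

This only builds the command strings; it does not talk to a cluster.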

Would this be the right way to restart without migrations?


#2

This wouldn’t be a useful concept, unless you also planned to stop all traffic. Stopping traffic is normally unwelcome, so there really isn’t any motivation to optimize for it.

That said, since 3.14, you no longer need to wait for migrations between restarts. Also Aerospike Enterprise has Rapid Rebalance which often reduces migration time 10-20x.


#3

This would be useful when I want to increase transaction-threads or service-threads in aerospike.conf. Let’s say I have service-threads set to 4 and would like to make it 16. I would like the restart to apply the new service-threads value without data migrations.

Even on 3.15, a restart triggers migrations; I ran into this last week.

Enterprise: yeah, that’s something I don’t understand. Why should it migrate at all? I have replication factor 2, so restarting one node out of 5 shouldn’t cause any data migrations when a replica exists on another node. I think we should have this flexibility.


#4

When you remove a node, many partitions lose one of their replicas. While the node is removed, we would like to continue processing transactions, so Aerospike determines an interim replica and advances nodes to write-master for each affected partition, then starts migrations to them. When you start the node again, it moves back into its previously held replica positions, which bumps the interim replica to a non-replica position. At this point the restarted node is missing the updates and inserts that occurred while it was out of the cluster, so migrations are needed to sync it with the other replicas.
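
The replica shuffle described above can be pictured with a toy model. This is a simplified sketch, not Aerospike's actual partition-assignment algorithm: it assumes each partition has a fixed ordering of nodes and that the replicas are simply the first `rf` nodes in that ordering that are currently in the cluster. The node names and the `replicas` function are hypothetical.

```python
def replicas(succession, alive, rf=2):
    """Return the first rf nodes of the succession order that are alive."""
    return [node for node in succession if node in alive][:rf]


# Hypothetical node ordering for a single partition in a 5-node cluster.
succession = ["A", "B", "C", "D", "E"]
all_nodes = set(succession)

before = replicas(succession, all_nodes)          # node A is up
during = replicas(succession, all_nodes - {"A"})  # node A is being restarted
after = replicas(succession, all_nodes)           # node A has rejoined

# A node that holds a replica only while A is away is the interim replica.
interim = [node for node in during if node not in before]

print("before:", before)
print("during:", during)
print("after: ", after)
print("interim replica bumped to non-replica on rejoin:", interim)
```

In this model node C receives data while A is away and is pushed back out of the replica list when A returns, which is why data has to move in both phases.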

You can disable migrations by setting migrate-threads to 0, both dynamically and statically. This only changes whether partition data transfers take place; there will still be reclusters, which determine the interim replicas. The pitfall is that without migrations, we do not know when a restarted node has all the same updates/inserts as the other replicas. When the node rejoins, Aerospike will need to resolve the latest copy from the other nodes, which adds at least one round-trip of latency to every request. If you really don’t care about losing this data, you can set write-commit-level to master and read-consistency-level to one (the default).

Evaluate your requirements carefully before making any of these changes.


#5

Adjusting threads is the recommended approach, I think. I believe I observed that with a high sleep value, when you dynamically adjust the value back down, the partitions/threads that were set to sleep for 4 hours need to finish sleeping before they pick up the new sleep timer.


#6

Makes sense.


#7

A quick update.

We tested this on a 5-node staging cluster with 500M of data.

  1. asinfo -v "set-config:context=service;migrate-threads=0" -> No threads are given to the cluster for migration (dynamic setting).
  2. Restarted Aerospike on node2, with migrate-threads = 0 set in aerospike.conf as well, since the dynamic change would not survive the restart.
  3. Checked the info output:
Host                   Pending migrates (tx, rx)   Objects (Master, Prole, Non-Replica)
stg-aerospiketest001   (417.000, 417.000)          (21.296 K, 11.610 K, 0.000)
stg-aerospiketest002   (410.000, 410.000)          (21.264 K, 12.365 K, 0.000)
stg-aerospiketest003   (1.640 K, 1.640 K)          (0.000, 32.973 K, 0.000)        --> Restarted node
stg-aerospiketest004   (405.000, 405.000)          (19.795 K, 13.309 K, 0.000)
stg-aerospiketest005   (408.000, 408.000)          (20.137 K, 12.235 K, 0.000)
                       (82.492 K, 82.492 K)        (82.492 K, 82.492 K, 0.000)
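
A quick check of the object counts in the table: with replication factor 2, every record should have one master copy and one prole copy, so the per-node master and prole columns should add up to the same total. The sketch below just re-adds the posted figures, reading "21.296 K" as 21296 objects.

```python
# (master objects, prole objects) per node, copied from the table above.
objects = {
    "stg-aerospiketest001": (21296, 11610),
    "stg-aerospiketest002": (21264, 12365),
    "stg-aerospiketest003": (0, 32973),  # restarted node: no master objects yet
    "stg-aerospiketest004": (19795, 13309),
    "stg-aerospiketest005": (20137, 12235),
}

master_total = sum(master for master, _ in objects.values())
prole_total = sum(prole for _, prole in objects.values())

print("master total:", master_total)
print("prole total: ", prole_total)
print("totals match:", master_total == prole_total)
```

Both columns come to 82492, matching the totals row, so no records are missing; the restarted node simply holds all of its copies as prole for now.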

It looks like the migrations are flagged as pending and data transfer tries to kick in, but no node-to-node data transfer starts because we have migrate-threads=0. Also, no master objects are assigned to the restarted node.

This only changes once migrate-threads > 0.

Once migrate-threads is set > 0, the data transfer happens, data is spread evenly across the cluster, and the migrates column shows 0,0.

  Pending Migrates (tx, rx)
  (0.000,  0.000)  
  (0.000,  0.000)  
  (0.000,  0.000)  
  (0.000,  0.000)  
  (0.000,  0.000)  
  (0.000,  

Is it possible to stop the migrations/shard movement for a node restart in Aerospike? The answer is no.


#8

Master objects is accurately set to 0. When a node starts with data, it sets all of its partitions to ‘subset’, because while the node was away the interim master node could have taken writes. During rebalance, any node owning a ‘full’ partition will have master preference. Mastership will be restored to the returning node as migrations progress.
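
A minimal sketch of the master-preference rule described above, for one partition. This is an illustration under simplifying assumptions, not Aerospike's rebalance code: each replica is marked either 'full' or 'subset', and the acting master is taken to be the first node in replica order that holds a 'full' copy. The function and node names are hypothetical.

```python
def acting_master(replica_order, state):
    """Pick the first replica holding a 'full' copy; fall back to the first replica."""
    for node in replica_order:
        if state[node] == "full":
            return node
    return replica_order[0]


# Node A is the returning node; node B kept serving while A was away.
replica_order = ["A", "B"]
state = {"A": "subset", "B": "full"}

print("right after restart:", acting_master(replica_order, state))

# Once migration for this partition completes, A holds a full copy again.
state["A"] = "full"
print("after migration:    ", acting_master(replica_order, state))
```

Applied partition by partition, this is why the restarted node shows 0 master objects at first and regains them gradually.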


#9

I know. I was stressing the point that the restarted node wouldn’t be of any use until migrations are over.


#10

Unsure what you mean by not being useful. The node will hold prole (non-master replica) and/or non-replica records. If the acting master were to leave, these records would still be around and, assuming fewer nodes than the replication factor are missing, the cluster would be able to resolve the latest version between it and the interim proles.


#11

Let me fill you in on the intent of “Restart without Migrations”. Is it possible to restart one node while requesting no shard rebalance, and so get a quick restart? That way the active nodes would take only the delta changes, and once the restart is done those deltas would be put back on the restarted node as master/replica. In this case, however, the restarted node holds no master role and needs a new mapping of data from the beginning. So when I say master data is 0, I mean it will be given master roles again only after migrations fully complete, which is a long process.


#12

This essentially describes what Rapid Rebalance does in EE.

Though, as I said, you do not have to wait. The prior master copies are there now as prole copies, and there is a node holding ‘non-replica’ copies while migrations are taking place. Between the prole and the ‘non-replica’ copies, all of the writes that took place on the interim master are represented. So if you were to restart the next node without waiting for migrations, all data would still be present and would resolve. Basically, a portion of the prole records will become master and will resolve with the new prole records, which were ‘non-replica’ records.
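
To illustrate why nothing is lost in that case, here is a toy resolution between a stale returning copy and a ‘non-replica’ copy that only holds writes made during the outage. It is a sketch under assumptions: each record carries a generation counter and the highest generation wins, which stands in for whatever conflict-resolution rule the cluster actually uses. The keys, values and the `resolve` function are made up for the example.

```python
def resolve(*copies):
    """Merge record copies, keeping the highest generation seen for each key."""
    merged = {}
    for copy in copies:
        for key, (generation, value) in copy.items():
            if key not in merged or generation > merged[key][0]:
                merged[key] = (generation, value)
    return merged


# Returning node: has the older data but missed writes made during its outage.
returning_node = {"k1": (1, "old"), "k2": (1, "untouched")}

# 'Non-replica' node: only holds what was written while the other node was out.
non_replica_node = {"k1": (2, "updated"), "k3": (1, "inserted")}

resolved = resolve(returning_node, non_replica_node)
for key in sorted(resolved):
    print(key, resolved[key])
```

The update to k1 and the insert of k3 both survive, and k2 is served from the returning node, so the two partial copies together cover the full data set.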


#13

Mannoj,

What kporter says is right, let me see if I can add some clarification.

Obviously, when you take a node out and put it back, we need to synchronize the changes that happened while the node was out. If you are using Enterprise edition, we will do precisely that, and only migrate the deltas or changes. It is clearly necessary to migrate those changes, or the node that was out for a period will be continually behind. You say you want to “not do migrates” and I simply don’t understand. Why would you want a node that was out to stay behind forever? Perhaps you can clarify.

If you set migrate-threads to 0 as you did, you avoid the extra work of having migrated the data speculatively. Migrating data speculatively is the safest thing to do, because getting back to your replication factor is safer. However, if you know the node’s data is OK and coming back, then not doing any data migration while the node is out is more efficient. Personally, I recommend NOT migrating data (setting migrate-threads to 0) on a short node outage, but it is a choice you can make for yourself after thinking through the risks on both sides.

So, does Aerospike allow “not migrating data”? No, because synchronizing the database after a node has been out must happen.

Does Aerospike allow avoiding all UNNECESSARY data copies? Yes, by doing two things: stop migrations while you are restarting nodes, and use Enterprise.

If your experiments show you otherwise, please let us know.


#14

I have a cluster with 50GB in memory per node, 5 nodes in total. It is mostly used for reads; a job runs at a particular time to write data into it. Let’s say a node fails (nodeA is down) and we immediately get a GET request for data whose master role is on nodeA. I would expect the replica to pick up and serve that GET, but I’m OK with those GETs failing if consistency mandates it. Just don’t let migrations kick off. The node restart might take some 3 mins to load the data back in memory and join the cluster, whereas migrates take 45 mins to join. This is the reason behind “No migrates during restart”.

Anyway, what I observed when I tested is that migration is a must. Even if anyone sets migrate-threads=0, Aerospike won’t give the restarted node master roles until migrations are on for it. Which means you have to undergo migrations for any node restart. I was trying to work around this by disabling migrate threads, but it looks like that’s not possible.


#15

Two questions.

First, forget reads, they are simple, and Aerospike does what you want - directs reads to replicas. Think about writes. While that node is out, do you want no writes to happen to that 1/5 of the cluster? Because if we don’t transition replicas to masters, you won’t get writes. If you don’t get writes, you don’t have to catch up. Do you want 1/5 of writes to fail for the time the node is out of the cluster?

Second, you say that “migrates takes 45 mins to join”. If you mean the node takes 45 minutes to join, that’s not true - it joins immediately, and over the subsequent 45 minutes, as synchronization finishes, master ownership transfers one by one. Which means that during that 45 minutes, all transactions are served.

Yes, migrates are a must because WE DON’T STOP WRITES, and we don’t allow the stopping of writes. That is the essence of an available database.

Do you want writes to stop for the 1/5 of the data?


#16

I guess we are both making the same point. Let me put it simply. I was trying to get only delta migrations to happen during a node restart by stopping migrations. It looks like the software won’t do that, and that’s what my test results above show too.


#17

Just to clarify, migrations will not have any impact on the data availability. So whether migrations take 3 minutes or 2 hours will not have any impact on your ability to read the right versions of the records. The latency may be impacted during migrations (duplicate resolution) but that should be about it.


#18

True. It’s just that the operations team has to be available for this activity during night hours until migrations complete, and call out in case of any issues.


#19

Got it. Yes, that makes sense. Thanks for clarifying.