Migrations very slow on an unloaded cluster

I have an 8-node cluster. Each machine has NVMe drives and plenty of RAM.

I had to take 2 nodes out and put them back in. I found the migrations to be incredibly slow.

The usage pattern of the cluster is this:

  • We insert a lot of data into this cluster
  • This cluster XDRs out to 8 others
  • There are no other reads on this cluster.

During the migrations, CPU usage was low, around 40% (each machine has 40 CPUs), disk I/O was tiny at about 40Mb/s per node, and the network was nowhere near saturated (bonded 10-gig NICs into a 40-gig top-of-rack switch).

I set:

  • migrate-max-num-incoming to 256
  • migrate-threads to 6
  • migrate-sleep to 0
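
For reference, here is a minimal sketch of how those three settings could be applied dynamically with asinfo. It assumes the 4.9 layout, where migrate-max-num-incoming and migrate-threads live in the service context and migrate-sleep is set per namespace. The helper name and the namespace "test" are hypothetical, and the script only builds the command strings; it does not run anything against a node.

```python
# Hypothetical helper: builds asinfo command lines for the settings listed above.
# It only returns strings; running them against a node is left to the operator.

def migration_tuning_commands(namespace, max_incoming=256, threads=6, sleep_us=0):
    """Return one asinfo command per setting (assumed 4.9 config contexts)."""
    info_commands = [
        # Assumed: service-context settings
        f"set-config:context=service;migrate-max-num-incoming={max_incoming}",
        f"set-config:context=service;migrate-threads={threads}",
        # Assumed: namespace-context setting, applied to one namespace
        f"set-config:context=namespace;id={namespace};migrate-sleep={sleep_us}",
    ]
    return [f'asinfo -v "{cmd}"' for cmd in info_commands]


if __name__ == "__main__":
    # "test" is a placeholder namespace name, not one from this cluster
    for line in migration_tuning_commands("test"):
        print(line)
```

Each command would have to be run on every node, and dynamic changes like these do not persist across a restart unless they are also written to the config file.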

Are those safe to turn up further? Are there other settings I should look at?

Thanks

If you are using XDR, I assume you are using Enterprise Edition (EE). Which server version? What is the replication factor? 2? You must have access to Support if using EE. They should be able to help.

It’s EE, but for … a series of reasons it’s still on version 4.9, and Aerospike no longer provides support for it. So I wanted to get the community’s opinion.

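You could try increasing channel-bulk-recv-threads, the number of fabric threads that process messages arriving on the bulk channel, which is the channel migrations use.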
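
A rough sketch of what that change might look like, assuming the setting sits under the fabric sub-context of network and can be changed dynamically on your version (worth confirming against the 4.9 config reference before relying on it). The function name and the value 8 are only illustrative, and the code just builds the command string.

```python
# Hypothetical helper: builds the asinfo command for the fabric setting above.
# Assumption: the parameter is addressed as fabric.channel-bulk-recv-threads
# within the network context and is dynamically settable on this version.

def bulk_recv_threads_command(threads):
    """Return the asinfo command line that sets channel-bulk-recv-threads."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    info = f"set-config:context=network;fabric.channel-bulk-recv-threads={threads}"
    return f'asinfo -v "{info}"'


if __name__ == "__main__":
    # 8 is an illustrative value, not a recommendation for this cluster
    print(bulk_recv_threads_command(8))
```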