We have a few Aerospike sets in one namespace that is set up in high availability mode. The total size of those sets is around 1 TB. We would like to back up those sets and restore them into a different namespace (set up in strong consistency mode) on the same cluster.
I am looking for help with the following points in the context of the above problem:
- Are asbackup and asrestore the best option for a migration in such a scenario?
- Can asbackup and asrestore of those sets be performed in parallel rather than sequentially? According to the documentation asrestore accepts multiple directories, but does it restore them in parallel, and is there a similar option for asbackup? There is an old issue (Asbackup of multiple sets?) about a similar question, but I could not find any update on it.
- Is there any option for such a move that reduces the application downtime we currently need during asbackup and asrestore?
- The answer depends on your needs. It is usually best to write your own migration application tailored to those needs. Does the data need to be lazily migrated? Does it need to be 100% available and consistent during the migration? Questions like these make the case for a custom migration.
- Kind of; it depends on what you’re asking. I’m not sure I understand exactly what you mean by doing it in parallel. I think what you might want is piping them together, e.g. `asbackup -n myns -s myset -o - | asrestore -i - -n myns,newns` would stream the backup of `myns.myset` to stdout (`-o -`), then pipe it into `asrestore`, which reads from stdin (`-i -`) and relabels the namespace on consumption (`-n oldnamespace,newnamespace`). Both of these utilities have options for parallelization.
- There’s nothing in Aerospike or Aerospike’s tools, to my knowledge, that can seamlessly perform a migration from one namespace to another with no downtime. `asbackup` and `asrestore` are the best we have. See the docs for set selection, threads, partition parallelization, etc.
Thanks @Albot. Regarding point 2 on parallelising asbackup and asrestore: the ask is to run the backup of multiple sets in a namespace in parallel. While asrestore can handle it, I am unable to confirm if and how to do the same with asbackup.
For example, if there are 3 sets (users, accounts, orders) in namespace prod, each with 250 GB of data, we would like to back up all 3 sets in parallel (to reduce downtime) instead of one at a time.
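To make the request concrete, here is a minimal sketch of "one asbackup process per set, all running at once". It only uses the `-n`, `-s` and `-d` options; the backup location `/tmp/backup` and the `DRY_RUN` switch are made up for illustration, and by default the script only prints the commands it would run. Whether this actually finishes sooner than a single backup is not something the sketch can show.

```shell
#!/usr/bin/env bash
# Sketch: start one asbackup process per set and wait for all of them.
NAMESPACE="prod"
SETS="users accounts orders"
BACKUP_ROOT="/tmp/backup"   # hypothetical location, one sub-directory per set

# Print the command unless DRY_RUN=0 is set (illustrative safety switch).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi
}

backup_all_sets() {
  for set_name in $SETS; do
    # -n namespace, -s set, -d output directory for this set's backup files
    run asbackup -n "$NAMESPACE" -s "$set_name" -d "$BACKUP_ROOT/$set_name" &
  done
  wait   # block until every background backup has exited
}

backup_all_sets
```

Run it with `DRY_RUN=0` to execute the commands instead of printing them. Because the jobs run in the background, the order in which they start and finish is not fixed.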
The way asbackup works is that it uses a scan, which crawls through every entry in the database one by one. That might be different if there is a set index; I don’t know.
Either way, it looks like you can specify a list of sets to back up, and I think it will stream all of them in parallel, or at least ship records as it finds them. I do not think running multiple asbackup/asrestore processes would help, because of how scans work.
So I do think that using `asbackup --set myset1,myset2 -n myns ...` is the best option, and it will parallelize as much as it can (given the constraints on the number of threads/partitions you pass via parameters).
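Put together, the single-process approach might look like the sketch below: one multi-set backup to a directory, then a restore that relabels the namespace. The target namespace `prod_sc`, the directory, the worker count and the `DRY_RUN` switch are placeholders, and the `--parallel` and `--threads` options are written as I understand them from the tools' documentation, so please check `asbackup --help` and `asrestore --help` for your version before relying on them.

```shell
#!/usr/bin/env bash
# Sketch: one multi-set backup, then a restore into a different namespace.
SRC_NS="prod"                  # source namespace (example value)
DST_NS="prod_sc"               # hypothetical target namespace
SETS="users,accounts,orders"   # comma-separated list passed to --set
BACKUP_DIR="/tmp/backup/prod"  # hypothetical backup directory
WORKERS=8                      # assumed degree of parallelism, tune as needed

# Print the command unless DRY_RUN=0 is set (illustrative safety switch).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi
}

migrate_sets() {
  # Back up all listed sets with a single asbackup invocation.
  run asbackup --namespace "$SRC_NS" --set "$SETS" \
      --directory "$BACKUP_DIR" --parallel "$WORKERS" || return 1
  # Restore from the same directory, relabelling old namespace to new.
  run asrestore --directory "$BACKUP_DIR" \
      --namespace "$SRC_NS,$DST_NS" --threads "$WORKERS"
}

migrate_sets
```

By default this only prints the two commands; set `DRY_RUN=0` to execute them. The restore starts only if the backup step succeeded.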