Hi. I'm curious about the behavior of asrestore over high-latency links (long distances).
I am restoring a 500 GB namespace to a cluster. I tried two ways:

1. Back up to a server in Toronto, and run asrestore over the network to a server in Tokyo.
2. Copy the backup file from Toronto to Tokyo and run asrestore entirely in Tokyo.
In scenario 1, asrestore was comparing/restoring about 130 records per second; in scenario 2, about 90,000 per second. The ETA in scenario 1 was always more than 100 days, versus roughly 9 hours in scenario 2.
I'm curious about the mechanisms in play here. Is there a lot of back-and-forth between asrestore and the cluster that accounts for this large a discrepancy?
Also, related or not: is asbackup | asrestore a pattern that is ever employed to easily seed a cluster?
That's because asrestore writes the records one by one, so every record pays a full network round trip and the restore threads spend most of their time waiting. At a Toronto-Tokyo RTT of roughly 150 ms, each thread can complete only ~6-7 synchronous writes per second, so at the default of 20 restore threads (if I remember the default right) you'd expect around 130 records per second, which is about what you saw. You'd probably get much better results streaming over the internet with more threads, but I like being able to transfer the file and run an md5sum on it for a warm fuzzy feeling. For piping commands together, I'm a fan of asbackup -n <namespace> -o - | pzstd -1 | nc <remote host> <port>, and on a single node on the remote side nc -l <port> | pzstd -d | asrestore -i -. You can even spread it across multiple nodes if you want to make it complicated (multiple asbackup/asrestore pairs, each handling a different range of partitions; see the sketch below).
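As a rough sketch of the multi-node variant (host names, ports, and the namespace are placeholders, and the --partition-list syntax may differ across tools versions, so verify against asbackup --help): a namespace has 4096 partitions, so two senders could each take half using begin-count ranges:

    # Tokyo side: one listener per receiving node
    # (some netcat builds want `nc -l -p 3000` instead of `nc -l 3000`)
    nc -l 3000 | pzstd -d | asrestore -i -    # on node 1
    nc -l 3001 | pzstd -d | asrestore -i -    # on node 2

    # Toronto side: one backup stream per partition range
    asbackup -n <namespace> --partition-list 0-2048 -o - | pzstd -1 | nc <tokyo-node-1> 3000
    asbackup -n <namespace> --partition-list 2048-2048 -o - | pzstd -1 | nc <tokyo-node-2> 3001

Each asrestore still writes into the whole cluster; splitting by partition just parallelizes the scan, the WAN transfer, and the restore-client work.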
The commands I pasted are probably not 100% complete, but that's the gist of it: netcat to create the TCP tunnel, plus some compression before sending it over the wire. I think Aerospike may have even added a compress option to their tools, but I can't recall; may be worth looking into.
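If your tools are new enough, I believe asbackup and asrestore both grew a --compress zstd option, and asrestore's thread count is tunable (spelled --parallel in recent versions, --threads in older ones); treat the flags below as a sketch to check against your version's --help rather than gospel:

    # sketch only: built-in compression instead of pzstd, more restore threads for the WAN
    asbackup -n <namespace> --compress zstd -o - | nc <tokyo-node> 3000
    nc -l 3000 | asrestore --compress zstd --parallel 64 -i -

Since synchronous-write throughput scales roughly as threads / RTT, quadrupling the threads should roughly quadruple the records-per-second rate until bandwidth or server write load becomes the bottleneck.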