How to speed up asrestore



Context

How to increase the speed of restoring data with asrestore.

Method

One of the main options for tuning the asrestore process is the number of threads to spawn for writing to the cluster.

        -t <threads> or --threads <threads>     # 20 is the default

A higher number of threads usually increases the speed of the restore process but may impact the cluster’s overall performance (for client applications).

This option behaves differently depending on whether one file or multiple files are being restored.

  • When restoring multiple files

The restore tool spins up the configured number of threads, capped by the number of files. In other words, the effective thread count is whichever is smaller: the configured number of threads or the number of files.

The reason is that each file is handled by exactly one thread, so more threads than files cannot be used. In this scenario, the handling of a file is completely local to one thread, and file accesses do not have to be coordinated between threads.
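As a quick illustration of the capping rule above, here is a hypothetical helper (for illustration only, not part of the Aerospike tools) that computes the effective thread count:

```shell
# Hypothetical helper, for illustration only: when restoring multiple
# files, the effective number of restore threads is
# min(configured threads, number of backup files).
effective_threads() {
  configured=$1
  nfiles=$2
  if [ "$nfiles" -lt "$configured" ]; then
    echo "$nfiles"
  else
    echo "$configured"
  fi
}

# 20 threads configured (the default) but only 8 backup files -> 8 threads
effective_threads 20 8
```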

  • When restoring a single file

The restore tool spins up the configured number of threads. These threads all read from the same, single file. They use a mutex to make sure that only one thread at a time reads a record from the file.

Setting the number of threads too high can lead to mutex contention. Performance is therefore unlikely to scale linearly with the number of threads.

  • Alternative

If you have multiple backup files in the backup directory, you could also run multiple asrestore processes in parallel, each restoring a different file, using the following option:

        -i <path> or --input-file <path>       # The single file from which to read the backup.
                                               # - means stdin. Mandatory, unless --directory is given.
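A minimal sketch of that alternative, assuming the backup files end in .asb and using placeholder BACKUP_DIR and HOST values (not real defaults), is to generate one asrestore command per file and run them in parallel:

```shell
# Sketch only: emit one asrestore invocation per backup file so each file
# can be restored by its own process in parallel. BACKUP_DIR and HOST are
# placeholders for illustration.
BACKUP_DIR=${BACKUP_DIR:-backup}
HOST=${HOST:-127.0.0.1}

gen_restore_cmds() {
  for f in "$BACKUP_DIR"/*.asb; do
    [ -e "$f" ] || continue
    echo "asrestore -h $HOST --input-file $f &"
  done
  echo "wait"
}

# Review the generated commands before piping them to a shell:
gen_restore_cmds
```

Generating the commands first (rather than launching asrestore directly) makes it easy to sanity-check the file list before committing to a long-running restore.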
  • Examples
 asrestore -h 10.0.100.199 --input-file BB9375FFD005452_00000.asb --threads 100
 .
 .
 .
2018-06-29 21:44:38 GMT [INF] [13830] 0 UDF file(s), 0 secondary index(es), 248779 record(s) (22296 KiB/s, 248779 rec/s, 91 B/rec, backed off: 0)
2018-06-29 21:44:38 GMT [INF] [13830] Expired 248779 : skipped 0 : inserted 0 : failed 0 (existed 0, fresher 0)
2018-06-29 21:44:38 GMT [INF] [13830] 100% complete, ~0s remaining

Back up a 3-node cluster into 3 directories like this:

asbackup --node-list node1_ip:3000 -d  backup_node1
asbackup --node-list node2_ip:3000 -d  backup_node2 
asbackup --node-list node3_ip:3000 -d  backup_node3 

Restore them like this (assuming more than 32 files in each directory):

asrestore -h node1_ip -d  backup_node1 --threads 32 ...
asrestore -h node2_ip -d  backup_node2 --threads 32 ...
asrestore -h node3_ip -d  backup_node3 --threads 32 ...
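The three commands above can be launched concurrently from a single shell. A sketch, following the node/directory naming used above (the RESTORE_CMD override is a dry-run hook for this sketch, not an asrestore feature):

```shell
# Sketch: run one restore per node in the background and wait for all
# of them to finish. Node names and backup directories follow the
# layout shown above (host nodeN_ip restores directory backup_nodeN).
RESTORE_CMD=${RESTORE_CMD:-asrestore}

parallel_restore() {
  for node in "$@"; do
    "$RESTORE_CMD" -h "$node" -d "backup_${node%_ip}" --threads 32 &
  done
  wait
}

# Dry run example:
#   RESTORE_CMD=echo parallel_restore node1_ip node2_ip node3_ip
```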

Similarly, we can restore in parallel from S3 (Refer to direct backup to AWS S3):

s3cmd get s3://BUCKET/OBJECT1 - | asrestore --input-file - ...
s3cmd get s3://BUCKET/OBJECT2 - | asrestore --input-file - ...
s3cmd get s3://BUCKET/OBJECT3 - | asrestore --input-file - ...

Notes

  • Tools version 3.15.3.2 is actually slower to restore by default, because /etc/aerospike/astools.conf ships with the following entries:
#-------- Performance and throttling --------
nice-list = "1,10000"
threads = 32

The nice-list entry caps the bandwidth (in MiB/s) and the records per second, so with "1,10000" the restore is throttled to roughly 1 MiB/s and 10,000 records per second, as the output shows:

2018-06-29 21:49:02 GMT [INF] [14202] 0 UDF file(s), 0 secondary index(es), 209980 record(s) (896 KiB/s, 10000 rec/s, 91 B/rec, backed off: 0)
2018-06-29 21:49:02 GMT [INF] [14202] Expired 209980 : skipped 0 : inserted 0 : failed 0 (existed 0, fresher 0)
2018-06-29 21:49:02 GMT [INF] [14202] 84% complete, ~3s remaining

This issue was addressed in a subsequent release (tracked under TOOLS-1135). Upgrade to the latest tools version, or use the --no-configure option to work around it.
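If upgrading is not immediately possible, another option is to comment out the nice-list entry in /etc/aerospike/astools.conf, which removes the throttle while keeping the rest of the file in place (a sketch; keep a backup of the file and verify against your installed version):

        #-------- Performance and throttling --------
        # nice-list = "1,10000"
        threads = 32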

Keywords

ASRESTORE SPEED SLOW PARALLEL RESTORE STUCK BACKUP ASBACKUP

Timestamp

07/29/2018