FAQ - Common asbackup and asrestore questions

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.



This article discusses frequently asked questions about asbackup and asrestore.

1. asbackup FAQ

How to tune asbackup

The priority option sets the level of scan concurrency on each individual node. Higher values mean faster backups relative to other scan jobs running on the system. When using higher priorities, monitor the read/write performance of your application to ensure the impact on cluster performance is acceptable. Allowed values: 0 (auto), 1 (low), 2 (medium), 3 (high). The default value is 0. Any value higher than 3 is treated as medium priority. The priority of a running scan job can be changed dynamically with asinfo:

asinfo -v 'jobs:module=scan;cmd=set-priority;trid=<jobid>;value=3'

The scan-threads configuration parameter sets the size of the scan thread pool. It can be increased or decreased dynamically.

Warning: The recommended maximum value is the number of cores on the host. Increasing it beyond this may impact the performance of regular transactions.

Does asbackup impact ongoing transactions?

asbackup is essentially a scan job, so a properly sized cluster should experience minimal impact unless scan-threads has been increased above the number of cores on the host (see above). Tuning the tree sprigs may help if the number of records per partition is large (typically 1 million records per partition and above).

What is the size of an asbackup file?

The size of an asbackup file depends on the amount of data to be backed up. By default, asbackup splits files at 250MiB (configurable through the --file-limit option). Once a backup file reaches that size, another file is created. There should be no size limit for asbackup other than those imposed by the OS or ulimit. (Prior to version 3.6.1, file size was dictated by the size of the records, as the old asbackup created a new file after every 200,000 records.)

Is there a configuration parameter to compress the backup data?

There is no direct parameter to compress the data but the output can be piped to any compression tool:

asbackup --output-file - [...] | gzip -1 [...] > backup.asb.gz

Why is my backup file size different from my disk usage?

The disk usage cannot be directly compared with a backup file size (nor the sum of all backup files).

The first, obvious difference is the replication factor: the data stored on disk includes one or more copies of each record, depending on the replication factor, whereas the backup contains only master records.

The other main difference, which mostly affects small records, is the overhead involved in storing data in an Aerospike database. Refer to the capacity planning guide for details, but in short, Aerospike stores data in blocks of 128 bytes (16 bytes for versions 4.2 and above) and adds overhead per record and per bin. For example, while the data in a record may be 141 bytes, the overhead on each record is 64 bytes and each bin adds a further 28 bytes. Once that overhead is included, the total can spill over into a higher multiple of 128 bytes (16 bytes for versions 4.2 and above). Finally, record values are base64-encoded when backed up, which also affects the size.
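As a rough worked example of this rounding, the sketch below adds the per-record and per-bin overhead figures quoted in this article (64 and 28 bytes) to a record's data size and rounds the result up to the storage block size. The function names are hypothetical and the real overhead depends on the server version and configuration, so the numbers are illustrative only; use the capacity planning guide for actual sizing.

```python
RECORD_OVERHEAD = 64  # bytes per record, figure quoted in this article
BIN_OVERHEAD = 28     # bytes per bin, figure quoted in this article


def stored_size(data_bytes, bins, block_size):
    """Round data plus overhead up to a whole number of storage blocks."""
    raw = data_bytes + RECORD_OVERHEAD + bins * BIN_OVERHEAD
    blocks = (raw + block_size - 1) // block_size  # ceiling division
    return blocks * block_size


def base64_size(data_bytes):
    """Length in bytes of the padded base64 encoding of data_bytes bytes."""
    return 4 * ((data_bytes + 2) // 3)


print(stored_size(141, 1, 128))  # 128-byte blocks
print(stored_size(141, 1, 16))   # 16-byte blocks (versions 4.2 and above)
print(base64_size(141))          # size of a 141-byte value once base64-encoded
```

With one bin, the 141-byte record occupies 256 bytes in 128-byte blocks and 240 bytes in 16-byte blocks, while a 141-byte value grows to 188 bytes when base64-encoded. This is why on-disk usage and backup size diverge most for small records.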

Backup does not complete when using option -d instead of -o

This could be due to hitting the inode limitation of the filesystem. The number of free inodes can be determined using the “df -i” command. For example:

df -i
Filesystem            Inodes IUsed   IFree IUse% Mounted on
                     1148304 47160 1101144    5% /
tmpfs                 475448     1  475447    1% /dev/shm
/dev/vda1             128016    50  127966    1% /boot

In the above case, about 1.1 million files or directories can be created in the / partition, but the /boot partition has room for only 127,966. By default, with the -d option, asbackup creates backup files of 250MiB each, so 250GiB of data requires about 1,000 files and 2.5TiB of data requires about 10,000 files.

Using the -F option, the backup file size can be increased in order to reduce the number of files required. For example, -F 1024 reduces the number of files by roughly a factor of 4 compared to the default of 250MiB.
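The file-count arithmetic can be sketched as follows. This is a minimal estimate that assumes every backup file is filled right up to the file limit and that the backup size equals the data size; the helper names are hypothetical, and free_inodes relies on os.statvfs, which is available on Unix-like systems only.

```python
import os

MIB = 1024 ** 2
GIB = 1024 ** 3


def files_needed(data_bytes, file_limit_mib=250):
    """Number of backup files if every file is filled up to the limit."""
    limit = file_limit_mib * MIB
    return (data_bytes + limit - 1) // limit  # ceiling division


def free_inodes(path):
    """Inodes available to unprivileged users on the filesystem holding path."""
    return os.statvfs(path).f_favail


if __name__ == "__main__":
    needed = files_needed(250 * GIB)
    print(needed, "files needed;", free_inodes("."), "inodes free here")
```

For 250GiB of data, the default limit gives 1024 files, while a 1024MiB limit gives 250. Comparing the estimate with the free inode count of the target directory before starting the backup shows whether a larger file limit is needed.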

What permission does a user need to be able to back up a namespace in a security-enabled cluster?

To use asbackup on a namespace, a user only needs the read permission.

Can daily or hourly backups be taken?

Yes, refer to the incremental backup article for further details. (Not available prior to version 3.12.)

Does anything other than records get backed up by the asbackup tool by default?

UDFs and secondary index definitions are backed up along with the records by default. There are also selection options to exclude UDFs or to exclude records (backing up metadata only) if needed. Refer to the Data Selection Options documentation for details.

Other metadata in the SMD folder, such as truncate metadata and any defined users and roles, does not get backed up. You can copy the security.smd file, which contains all user and role metadata, to the cluster the backup will be restored to, or re-create the definitions there. Roster, eviction, and truncate-related metadata are cluster-specific and so may not be appropriate to back up, while UDFs and secondary indexes are already handled by asbackup.

Tombstones do not get backed up either, as their main purpose is to prevent durably deleted records from being resurrected when nodes are cold-started.

2. asrestore FAQ

How to tune asrestore

The --threads option controls the number of threads spawned to write to the cluster. The upper limit is the number of backup files. The speed at which asrestore processes the files depends on disk I/O, CPU load, network bandwidth between the asrestore client and the server, and even the inter-node network.

How many threads are used by the asrestore process?

When restoring a backup using asrestore, the number of threads can be specified with the --threads option. This figure is a ceiling: there will never be more threads than there are backup files to restore. The number of files produced by the backup can therefore affect the performance of the restore process, even when the same number of threads is specified.

In more recent versions of the tool, the maximum number of threads allowed has increased from 100 to 4096.
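The ceiling described above amounts to taking the smaller of the two numbers. A minimal sketch, with a hypothetical helper name:

```python
def effective_threads(requested_threads, backup_files):
    """Threads that can actually do work: capped by the number of backup files."""
    return min(requested_threads, backup_files)


print(effective_threads(20, 8))     # only 8 files, so only 8 threads are used
print(effective_threads(20, 1000))  # plenty of files, so all 20 threads are used
```

A backup written as a few very large files therefore limits restore parallelism, regardless of the thread count requested.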

How does asrestore use RAM?

During the restore, asrestore reads backup files into memory in chunks of 16MiB. (Before version 3.6.1, asrestore read the complete backup file into RAM before restoring from RAM.)
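To illustrate why chunked reading keeps memory use bounded regardless of file size, here is a generic sketch of the technique. It is not asrestore's actual implementation; the function name is hypothetical and only the 16MiB figure comes from this article.

```python
import io

CHUNK_SIZE = 16 * 1024 * 1024  # 16MiB, the chunk size quoted in this article


def read_in_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks so at most one chunk is held in memory at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Demo on a 10-byte in-memory stream with a tiny chunk size.
sizes = [len(c) for c in read_in_chunks(io.BytesIO(b"x" * 10), chunk_size=4)]
print(sizes)
```

The demo yields chunks of 4, 4 and 2 bytes. With this pattern the memory held per file is bounded by the chunk size rather than by the size of the backup file.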

Does the disk type used to store the asbackup file have an impact on performance?

Yes. The asrestore process can be highly parallelised, with multiple threads working at any one time. SSDs handle this kind of access naturally and very efficiently. Rotational drives, which depend on the drive head being at a particular physical location, cope far less well with highly parallelised applications and therefore perform worse with asrestore.

How to debug asrestore if it is slow or seems to be stuck?

The --verbose option will increase the log output.

Some messages may still give the appearance of asrestore being stuck, though. For example, after the “Triaging 42804 backup file(s)” message, asrestore is still reading and parsing each file, so the log output may take a while to move on.

In general, it is best to keep the number of files below 100K to avoid the additional disk I/O incurred when a single directory holds a very large number of files. See the next question.

What happens if asrestore fails to insert a record?

If the initial insertion of a record fails, asrestore will retry inserting the record 10 times. Between tries there is a 1 second pause, and the error is written to the logs at the debug severity level. If a record is not written after 10 tries, the restore is aborted with the error message “Too many errors, giving up”. The specific errors for which a retry is not attempted are “record exists” (if --unique is used), “generation mismatch” (unless --no-generation is used) and “invalid username or password”.
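The retry behaviour described above can be sketched as a simple loop. This is an illustration, not asrestore's code: the put callable, the error strings used as labels, and the choice to count 10 tries in total are assumptions, and how the non-retried errors are handled afterwards is left to the caller.

```python
import time

MAX_TRIES = 10       # tries per record, as described in this article
PAUSE_SECONDS = 1.0  # pause between tries

# Errors the article lists as not retried (used here as illustrative labels).
NO_RETRY = {"record exists", "generation mismatch", "invalid username or password"}


class RestoreAborted(Exception):
    """Raised when a record still fails after MAX_TRIES tries."""


def restore_record(put, record, sleep=time.sleep):
    """Write one record; return None on success or the non-retried error.

    `put` is a hypothetical callable returning None on success
    or an error string on failure.
    """
    for attempt in range(1, MAX_TRIES + 1):
        error = put(record)
        if error is None:
            return None
        if error in NO_RETRY:
            return error  # not retried
        if attempt < MAX_TRIES:
            sleep(PAUSE_SECONDS)
    raise RestoreAborted("Too many errors, giving up")
```

Because of the 1 second pause, a record that keeps failing holds its thread for several seconds before the restore aborts, which can look like a stall when only the default log level is enabled.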

Will partitions backed up from a single node stay on one node when restored in a different cluster?

Very likely not. The algorithm that assigns partitions to nodes uses the node-id, which by default is based on the MAC address of the node’s network interface; even if the two clusters have the same number of nodes, the chance that the partition assignments will line up the same way is small. The only case in which this could happen is if the nodes have manually-assigned node-ids which are the same on both clusters.





May 2020