Automatic backup and restore of Aerospike namespaces


#1

I have a namespace ‘test’ in Aerospike. I can easily create a backup of it and restore it using the following commands:

Backup:
asbackup -h localhost -n test  -d /home/asif/aerobckups -r

Restore:
asrestore -d /home/asif/aerobckups

But I have to run these commands explicitly. I would like this to happen automatically: I have set a TTL of 30d on the namespace, and when the TTL is reached on the 30th day, the namespace should be backed up and restored automatically (the backup and the restore will run on different servers). Is there any way to do this? A namespace property would be ideal, but suggestions involving init.d are also welcome.


#2

Hi, can you tell us why you want to move your about-to-expire data to a different cluster? Backup/restore will not help in this case, because a backup remembers the TTL of each record. In your example, even if you move records that are on the 30th day of their TTL to another cluster, they will still expire there about a day later.
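
To make the arithmetic concrete, here is a small Python sketch. It assumes, as described above, that a restored record keeps whatever lifetime it had left at backup time; the function name is purely illustrative and not part of any Aerospike API.

```python
def remaining_ttl_days(ttl_days: int, age_days: int) -> int:
    """Days a record has left to live, given its original TTL and its age."""
    return max(ttl_days - age_days, 0)


# A record written with a 30-day TTL and backed up when it is 29 days old
# has 1 day left; restoring it on another cluster does not reset that.
left = remaining_ttl_days(ttl_days=30, age_days=29)
print(left)  # prints 1
```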

Please give us the details of your use case so that we can suggest a better solution.


#3

Mr. Jyoti, thank you so much for your response. My scenario is as follows:

We have an iOS app and receive thousands of push notifications daily.

  1. We save the data of these push notifications in Aerospike as they arrive at our servers.
  2. I have set a TTL of 30d on the namespace on my current server, because its limited hardware cannot hold much data.
  3. So in this scenario, my server should be emptied every 30 days.
  4. But I also need to back up all the previous data on another server, because it may be required for analysis.
  5. My idea was that when the data reaches 30d, all current data is automatically backed up to my other server, so that the current server is empty and the other server holds all the data as a backup.
  6. I thought that a property in the namespace definition might help, or that it might be possible through an init.d script.
  7. If neither of these solutions exists, please suggest how to manage it.
  8. I can also paste my ticket subject here, which should give you a clearer idea of my requirement:

While the number of incoming objects increases, we need a scheme to move “older” records into an Aerospike archive, an external hard disk, or cloud storage. This will free the main-memory-based Aerospike setup from unused older entries while still allowing us to analyze older data on request.

Please define and evaluate a scheme, which automatically moves out older records. Also propose a backup media solution.


#4

Hi, unfortunately the Aerospike server does not provide this out of the box. One solution I can suggest for your requirement is as follows:

  1. Write an application that deletes data from the existing cluster and writes the same records to the other cluster with an infinite TTL. You can use the scan API to iterate through the records.
  2. Run this application roughly three-quarters or four-fifths of the way through your TTL period. For example, if the TTL is 30 days, run it around day 20 or day 25, and pick the records whose remaining TTL is less than 10 or 5 days respectively.
  3. The backup/restore option will not help, because you can’t change the TTL.
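
The steps above could be sketched roughly as follows in Python. This is only an outline of the selection and move logic, not Aerospike client code: the record shape (key, remaining TTL in seconds, bins), the two callbacks and the `archive_expiring` name are all assumptions for illustration, and in a real application they would be backed by a scan on the main cluster and by writes and deletes through the client library.

```python
from typing import Callable, Dict, Iterable, Tuple

DAY = 24 * 60 * 60  # seconds in a day

# Hypothetical record shape: (key, remaining TTL in seconds, bins)
Record = Tuple[str, int, Dict[str, object]]


def archive_expiring(
    records: Iterable[Record],
    write_archive: Callable[[str, Dict[str, object]], None],
    delete_source: Callable[[str], None],
    threshold_seconds: int = 10 * DAY,
) -> int:
    """Move records that are close to expiry to the archive.

    Returns the number of records moved.
    """
    moved = 0
    for key, ttl, bins in records:
        if ttl >= threshold_seconds:
            continue  # still fresh, leave it on the main cluster
        write_archive(key, bins)  # archive side should store it without expiry
        delete_source(key)        # reached only if the archive write did not raise
        moved += 1
    return moved


# Usage with in-memory dictionaries standing in for the two clusters.
source = {"a": (2 * DAY, {"msg": "old"}), "b": (25 * DAY, {"msg": "new"})}
archive = {}
scan = [(k, ttl, bins) for k, (ttl, bins) in source.items()]
moved = archive_expiring(scan, archive.__setitem__, source.__delitem__)
print(moved, sorted(archive), sorted(source))
```

Writing to the archive before deleting from the source means a failure in between leaves a duplicate rather than a lost record.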

#5

Hasafa,

Most people who need long term storage of data for analysis purposes insert that data immediately in a long term analytic store. This has the benefit of having all the data immediately in one place, and avoids the problem of needing to copy data that is “about to expire”.

Aerospike now supports incremental backup, and backup of objects whose TTLs are at certain values. This filtering functionality is available both in the API layer (PredExp) and through asbackup.

In general, given the broad range of functions required of databases, we expect an administrator to create scripts that run on a regular basis. As Mr. Jyoti mentioned, you may need to take actions such as manipulating the TTL after the fact, or transforming the data on insertion into a long-term store. These kinds of custom operations are best done with tools that specialize in data movement, backup/restore, and transformation.
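
As one possible shape for such a script, here is a minimal Python sketch that wraps the asbackup command from the first post and writes each run into a dated directory. The function names and the per-day directory layout are assumptions for illustration; only the `-h`, `-n`, `-d` and `-r` flags already used in this thread appear.

```python
import datetime
import subprocess
from typing import List


def build_backup_command(host: str, namespace: str, base_dir: str,
                         day: datetime.date) -> List[str]:
    """Build an asbackup command that writes into a per-day directory."""
    target = f"{base_dir}/{day.isoformat()}"
    # -h, -n, -d and -r are the flags used earlier in this thread.
    return ["asbackup", "-h", host, "-n", namespace, "-d", target, "-r"]


def run_backup(host: str, namespace: str, base_dir: str) -> None:
    """Run today's backup; raises CalledProcessError if asbackup fails."""
    cmd = build_backup_command(host, namespace, base_dir, datetime.date.today())
    subprocess.run(cmd, check=True)


# Show the command that would be run for a given day (nothing is executed here).
cmd = build_backup_command("localhost", "test", "/home/asif/aerobckups",
                           datetime.date(2024, 1, 31))
print(" ".join(cmd))
```

A scheduler such as cron (not mentioned in this thread, so treat it as a suggestion) could call `run_backup` at whatever interval suits the TTL.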