Durability issue: lost set after 3-4 days


#1

I stored 10000 records in a set (in the test namespace). After 3-4 days, the set was lost automatically, without any delete operation. The secondary index is still there in the index list. Sometimes it even shows a different number of objects after a server restart.

I am running Aerospike in VirtualBox on a Windows 8 system. The Aerospike version is 3.3.12.


#2

This sounds like there is not enough memory/storage allocated for the namespace, and automatic record eviction has kicked in to make room for newer records.

http://www.aerospike.com/docs/operations/manage/storage/


#3

Thanks for the reply. Below is my AMC console. It looks like storage is available. Can you verify it? Something seems wrong. How can I prevent automatic record eviction if space is available?


#4

Could you clarify? Are you saying you installed AS, added records, left the server alone for 3-4 days, came back, and stuff was gone even though you had done nothing?


#5

Yes, it is like that. I inserted 100000 records in a set to test performance, and the performance result was excellent. I didn't touch those records for a few days. Suddenly those records disappeared. I couldn't even see the set definition in the list. I tried this test 3-4 times. I restart my VirtualBox daily. The db runs on a regular HDD (not SSD).


#6

Would it be possible for you to post your config file? I'm wondering what ttl you are using on your records.

–Lucien


#7

Yes, I can provide the config file. Where would it be stored in a Windows installation?

I am using the code below to store the data:

ClientPolicy clientPolicy = new ClientPolicy();
AerospikeClient client = new AerospikeClient(clientPolicy, new Host(host, port));
WritePolicy policy = new WritePolicy();
// key and bin setup
client.put(policy, key, bins);


#8

I was curious about your server's aerospike.conf file, which should be under /etc/aerospike/aerospike.conf on the VirtualBox image. Also, can you confirm your default-ttl and storage-engine settings?

More info on config can be found at:

http://www.aerospike.com/docs/operations/configure/namespace/

Since you are running on a regular HD and not an SSD, I'm assuming you are using data-in-memory, similar to this stanza:



namespace [namespace-name] {
    memory-size SIZE G                    # Maximum memory allocation for data and
                                          # primary index
    default-ttl 0                         # Writes from client that do not provide a TTL
                                          # will default to 0 or never expire
    storage-engine device {               # Configure the storage-engine to use
                                          # persistence
        file /opt/aerospike/filename      # Location of data file on server
        # file /opt/aerospike/another     # (optional) Location of data file on server
        filesize SIZE G                   # Max size of each file in GB
        data-in-memory true               # Indicates that all data should also be
                                          # in memory.
    }
}

http://www.aerospike.com/docs/operations/configure/namespace/storage/

If you can post the stanzas from the config file, that would help troubleshoot the issue.

–Lucien


#9

Sorry Lucien, I tried to find the .conf file but couldn't find it on the system.

Here is my server startup console log; maybe it can be helpful.

E:\aerospike\database\aerospike-vm>vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Checking if box 'aerospike/centos-6.5' is up to date...
==> default: Clearing any previously set forwarded ports...
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 3000 => 3000 (adapter 1)
    default: 8081 => 8081 (adapter 1)
    default: 22 => 2222 (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Connection timeout. Retrying...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /vagrant => E:/aerospike/database/aerospike-vm
==> default: Machine already provisioned. Run vagrant provision or use the --provision
==> default: to force provisioning. Provisioners marked to run always will still run.


#10

Ah, it looks like you are using Vagrant. If this is the Vagrant image from Aerospike, then your conf file will look like this:


namespace test {
    replication-factor 2
    memory-size 1G
    default-ttl 5d # 5days, use 0 to never expire/evict.

# storage-engine memory

    # To use file storage backing, comment out the line above and use the
    # following lines instead.
    storage-engine device {
        file /opt/aerospike/data/test.dat
        filesize 5G
        data-in-memory true # Store data in memory in addition to file.
    }
}

The default ttl is set to 5 days, which might explain some of your data expiring. Feel free to update the config to either a value of zero (never expire) or a longer default expiration. You can also set the ttl from the client code you used to insert the data.
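To make the arithmetic concrete, here is a minimal sketch in plain Java (no Aerospike client required) that converts a default-ttl value such as 5d into seconds and works out when a record written at a given moment would expire. The class and method names are hypothetical, and the set of unit suffixes handled (s, m, h, d) is an assumption; only the 5d form appears in the config above. With the Aerospike Java client, a number of seconds like this is what you would assign to the WritePolicy.expiration field before calling client.put.

```java
import java.time.Duration;
import java.time.Instant;

class TtlMath {
    // Convert a TTL string such as "5d", "12h", "30m", "45s" or "0" into seconds.
    // A bare number is treated as seconds. The suffix set is an assumption.
    static long toSeconds(String ttl) {
        String t = ttl.trim().toLowerCase();
        char last = t.charAt(t.length() - 1);
        if (Character.isDigit(last)) {
            return Long.parseLong(t);
        }
        long n = Long.parseLong(t.substring(0, t.length() - 1));
        switch (last) {
            case 's': return n;
            case 'm': return Duration.ofMinutes(n).getSeconds();
            case 'h': return Duration.ofHours(n).getSeconds();
            case 'd': return Duration.ofDays(n).getSeconds();
            default: throw new IllegalArgumentException("Unknown TTL unit: " + last);
        }
    }

    // In this sketch a TTL of 0 means "never expire", so there is no expiry time.
    static Instant expiresAt(Instant writtenAt, String ttl) {
        long seconds = toSeconds(ttl);
        return seconds == 0 ? null : writtenAt.plusSeconds(seconds);
    }

    public static void main(String[] args) {
        Instant written = Instant.parse("2015-01-01T00:00:00Z"); // arbitrary example timestamp
        System.out.println("5d in seconds: " + toSeconds("5d"));
        System.out.println("expires at:    " + expiresAt(written, "5d"));
    }
}
```

A record written with the 5d default and never touched again would, under these assumptions, become eligible for removal five days after the write, regardless of how much free space the namespace has.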

best, Lucien


#11

Thank you very much, Lucien. I am still looking for the config file :( . Will it be on my local system?


#12

It should be in your Linux VM.

You can access the vm as follows.

vagrant ssh

then at the linux prompt run the following:

cat /etc/aerospike/aerospike.conf

Hope this helps,

More info on our configuration files can be found at:

http://www.aerospike.com/docs/operations/configure/

Do you have a specific use case that you are trying to test? It may be worth trying Aerospike on Amazon EC2.

best,

Lucien


#13

Thanks. I am testing performance with large data for my application, such as search performance on an indexed column. It looks good. There are a few limitations, such as not being able to filter on multiple bins, even though the Java client API has a multiple-filter option. It would be better if it had a UI client like other SQL clients (SQLyog etc.).


#14

I again faced a strange problem. I set up Aerospike on a local Linux Ubuntu system. It was loaded with test data. I cleaned up the existing sets and imported new data into the sets with the Java client. I verified the imported data, which was OK. I restarted the system because it hung. After the restart, it reverted to the previous data and lost the newly imported data.

Here is my config.

namespace test {
    memory-size 4G                        # Maximum memory allocation for data and
                                          # primary and secondary indexes.

    storage-engine device {               # Configure the storage-engine to use
                                          # persistence.
        file /opt/aerospike/test.dat      # Location of data file on server.
        # file /opt/aerospike/<another>   # (optional) Location of data file on server.
        # device /dev/<device>            # Optional alternative to using files.

        filesize 4G                       # Max size of each file in GB.
        data-in-memory true               # Required true by data-in-index.
    }
}


#15

A couple of questions. Was the entire server rebooted? How did you clean up the existing sets? I also noticed that there was no default-ttl set in your config, which would default to 0, meaning never expire.

Basically, what may be happening is that older data still on disk is being re-indexed into memory as part of a cold start. Please see the following article for more info:

For your particular case, you may need to delete /opt/aerospike/test.dat prior to loading new test data.
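The cold-start effect described above can be illustrated with a toy model in plain Java. This is not Aerospike's implementation. It is a sketch that assumes a delete only drops the in-memory index entry while the record stays in the data file, and that a cold start rebuilds the index by scanning that file. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ColdStartModel {
    // Append-only stand-in for the data file: every write is kept on "disk".
    private final List<String[]> disk = new ArrayList<>();
    // In-memory primary index: key -> position of the latest record on disk.
    private Map<String, Integer> index = new HashMap<>();

    void put(String key, String value) {
        disk.add(new String[] {key, value});
        index.put(key, disk.size() - 1);
    }

    // Delete only forgets the index entry; nothing is written to disk.
    void delete(String key) {
        index.remove(key);
    }

    String get(String key) {
        Integer pos = index.get(key);
        return pos == null ? null : disk.get(pos)[1];
    }

    // Cold start: memory is lost, so the index is rebuilt by scanning the disk.
    void coldStart() {
        index = new HashMap<>();
        for (int i = 0; i < disk.size(); i++) {
            index.put(disk.get(i)[0], i);
        }
    }

    public static void main(String[] args) {
        ColdStartModel db = new ColdStartModel();
        db.put("user1", "old data");
        db.delete("user1");
        System.out.println(db.get("user1")); // null: the delete is visible
        db.coldStart();
        System.out.println(db.get("user1")); // old data: the record is back
    }
}
```

In this model, the only way to keep the old record from returning is to remove it from the "disk" before the restart, which corresponds to deleting the data file before reloading test data.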


#16

I rebooted the entire server. I cleaned up the sets with the delete method of the Java client. Thanks, I can set default-ttl to 0; will that solve the problem?

As per the link, it seems the problem remains. I'm not sure, but is it a problem in the community edition?

If we delete test.dat, it might remove all of the data sets. Sometimes we need to clean only some sets, so we could lose other data.

Is there a chance of data loss with a hard restart (entire server reboot)? Can the data not be 100% the same after a hard restart?


#17

default-ttl is set to 0 by default if it is not hardcoded in the config file, which means that records are set to never expire unless overridden by the application.

It's not really a problem, but more a matter of how Aerospike handles deletes to ensure high performance. For more details, please read the Deletes article.

In your particular case, you should be able to use Method#4 to clean old data and ensure that your test sets do not return. Please see: https://discuss.aerospike.com/t/bin-names-deletion/731

You can also use Method#1 by setting a default-ttl > 0 and waiting for that data to expire prior to restarting the node. This method assumes that the record was not initially set to a default-ttl of 0.