Defrag endless loop and HWM breach issue on cold restart


#1

I have run into a startup issue on Aerospike CE 3.9.1. I have a four-node cluster that stores data on SSD (no data in memory). I use less than 70% of disk and 70% of memory, but “available” (contig-free) is only 10%, so I planned to add some storage files to Aerospike and cold restart the cluster, also expecting defrag to help me free up some resources.

After I edited the conf file, I restarted asd, but it got stuck in an endless defrag loop, logging:

waiting for defrag: namespace devices percent 0 waiting for 10

The server seemed to hang. Following http://www.aerospike.com/docs/operations/troubleshoot/startup I lowered “high-water-memory-pct” and restarted Aerospike, but then I hit another issue:

cold-start found no records eligible for eviction
hwm breached but no records to evict

It also seems the file loading process hangs: the percentage stops increasing and the server cannot start up.

Do you have any advice?


#2

Are all your records TTL=0 (Live forever)?


#3

Yes, so eviction is useless from my perspective, but it seems the eviction process makes the load process very slow. If I raise high-water-disk-pct, the load process gets faster.
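
For reference, the thresholds discussed in this thread are set in the namespace stanza of aerospike.conf. The fragment below is only a sketch to show where they live: the namespace name, file paths, sizes, and percentages are hypothetical placeholders, not my actual settings, and the note on defrag-startup-minimum is my assumption about what the "waiting for 10" in the log refers to.

```
# Hypothetical namespace stanza; all names and values are placeholders.
namespace test {
    replication-factor 2
    memory-size 8G
    default-ttl 0                  # records never expire, so nothing is eligible for eviction

    high-water-memory-pct 60       # eviction threshold for memory use
    high-water-disk-pct 50         # eviction threshold for disk use

    storage-engine device {
        file /opt/aerospike/data/test1.dat
        file /opt/aerospike/data/test2.dat   # an additional storage file
        filesize 16G
        defrag-startup-minimum 10  # assumed to be the "waiting for 10" target at startup
    }
}
```

With default-ttl 0 on every record, breaching either high-water mark cannot free anything through eviction, which would be consistent with the "no records eligible for eviction" messages above.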