Stop Write on batch loading

Hi, stop writes is being triggered for no apparent reason. I’m not sure what is going on, as there is plenty of memory and storage available. It is triggered as soon as I try to load data in a batch. Thanks for the help!

Single namespace:

namespace udb {
    replication-factor 2
    memory-size 30G
    high-water-memory-pct 85

    storage-engine device {
        file /opt/aerospike/data/udb.dat
        data-in-memory true
        write-block-size 128K
        filesize 50G
    }

}

Can you check the server log on the node that is showing stop writes true and see whether you have hit the clock skew stop writes condition? All server nodes should be synced to an NTP server, and their clocks should not drift more than 40 seconds apart in AP mode, or you will hit stop writes.
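As a quick complement to reading the log, the namespace statistics can be inspected for the relevant flags. The sketch below is only an illustration: it parses a semicolon-separated `key=value` string, the format info commands such as `asinfo -v "namespace/udb"` return, and lists which stop-write flags are set. The sample string is made up and is not output from the cluster in this thread.

```python
def parse_info(raw: str) -> dict[str, str]:
    """Split an info response of the form 'k1=v1;k2=v2' into a dict."""
    pairs = (item.split("=", 1) for item in raw.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}


def stop_write_flags(raw: str) -> list[str]:
    """Return the names of the stop-write related flags whose value is 'true'."""
    stats = parse_info(raw)
    flags = ("stop_writes", "clock_skew_stop_writes")
    return [name for name in flags if stats.get(name) == "true"]


# Hypothetical sample, not taken from a real node.
sample = "objects=1000;stop_writes=true;clock_skew_stop_writes=false"
print(stop_write_flags(sample))
```

If `clock_skew_stop_writes` is not in the returned list, clock drift is unlikely to be the cause and other triggers are worth checking.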

Thanks for the response. The servers are using NTP, so it’s not time drift. I probably left out an important detail: I added 3 nodes (15-17) because the cluster was reaching its limits. While the namespace counters showed that the cluster was balanced, I noticed that the asd process on the existing servers was still holding on to memory and was at 91%. Once restarted, asd released the memory, and its consumption is now more in line with the namespace memory usage. Is this the expected behavior (not releasing unused memory until restart)?

Memory used for data should be released. Memory used for storing the Primary Index (64 bytes per record) is not released unless you restart the server and rebuild the Primary Index. (This is a consequence of the way arena stage memory allocation works for the PI.)
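To get a feel for how much memory the Primary Index can keep pinned, here is a rough back-of-the-envelope sketch based on the 64 bytes per record figure above. The record count is a hypothetical example, and because arena memory is allocated in stages, the real footprint may be somewhat higher than this estimate.

```python
PI_BYTES_PER_RECORD = 64  # per-record primary index cost stated above


def primary_index_bytes(records_on_node: int) -> int:
    """Estimate primary index memory for the records (master + replica) on one node."""
    return records_on_node * PI_BYTES_PER_RECORD


# Hypothetical node holding 100 million records.
records = 100_000_000
print(f"{primary_index_bytes(records) / 1024**3:.2f} GiB")
```

For 100 million records this works out to roughly 6 GiB that stays allocated until asd is restarted, even after records migrate to the new nodes.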

With data-in-memory true, you can also get heap fragmentation, which can then trigger stop-writes-sys-memory-pct, introduced in the version you are running (6.3). You can check the heap_efficiency_pct metric. A restart will reload the data into memory and bring heap_efficiency_pct back close to 100. Stay tuned for some big changes in this area (data in memory) later this year.
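A minimal sketch of that check follows. It assumes the node statistics have already been fetched as a `key=value;...` string (for example with `asinfo -v "statistics"`) and that they include `heap_efficiency_pct` and `system_free_mem_pct`. The threshold of 90 is an assumed value for stop-writes-sys-memory-pct, so check your own configuration, and the exact comparison the server applies may differ. The sample values are invented, loosely mirroring the 91% usage mentioned earlier.

```python
def parse_info(raw: str) -> dict[str, str]:
    """Split an info response of the form 'k1=v1;k2=v2' into a dict."""
    pairs = (item.split("=", 1) for item in raw.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}


def memory_report(raw: str, stop_writes_sys_memory_pct: int = 90) -> dict:
    """Summarize heap efficiency and system memory use against a threshold.

    The default threshold is an assumption, not read from the server.
    """
    stats = parse_info(raw)
    used_pct = 100 - int(stats["system_free_mem_pct"])
    return {
        "heap_efficiency_pct": int(stats["heap_efficiency_pct"]),
        "system_used_mem_pct": used_pct,
        "over_sys_memory_limit": used_pct > stop_writes_sys_memory_pct,
    }


# Hypothetical sample values.
sample = "heap_efficiency_pct=62;system_free_mem_pct=9"
print(memory_report(sample))
```

A low heap_efficiency_pct combined with system memory use above the threshold would be consistent with fragmentation driving the stop writes.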
