How Aerospike writes data to disk


#1

Hi All,

When does Aerospike flush data that has been written to memory on all replica nodes to disk? If the write to disk is eventual, how does Aerospike tolerate a cluster failure, given that it has no commit log mechanism like Cassandra's?


#2

Please see this post that covered the subject:


#3

Two new config parameters control this:

flush-max-ms: The number of milliseconds the server waits before it flushes unfilled write buffers. The default is 1000 ms. It can be set with the command

asinfo -v 'set-config:context=namespace;id=test;flush-max-ms=1'

fsync-max-sec: On a raw device, open is called with O_SYNC, so every write is flushed to storage, given that we manage our own page size. On a filesystem, this parameter governs the rate at which the file is synced. The default is 0, i.e. never sync. It can be set with the command

asinfo -v 'set-config:context=namespace;id=test;fsync-max-sec=1'

Changing either setting may increase your I/O bandwidth requirements.
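To check what a node is currently using, the settings can be read back with a get-config request. This is a minimal sketch, assuming asinfo is on the PATH and the namespace is named test as in the commands above. show_flush_settings is a hypothetical helper name, and the exact key names in the response can differ between server versions (the filter below also accepts a storage-engine. prefix as a precaution).

```shell
# Hypothetical helper: pick the two flush-related settings out of a
# semicolon-separated info response read from stdin.
show_flush_settings() {
  tr ';' '\n' | grep -E '^(storage-engine\.)?(flush-max-ms|fsync-max-sec)='
}

# Against a live node (assumption: namespace "test", asinfo installed):
# asinfo -v 'get-config:context=namespace;id=test' | show_flush_settings
```

Running this before and after a set-config call is a quick way to confirm that the dynamic change actually took effect on that node.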

Aerospike:

  1. Is log-structured on storage
  2. Does synchronous replication

This means the last committed copy is always safe, and a write reaches the write buffer on the master and on all replica nodes before it is acknowledged to the user.

There is still a window where data can be lost, e.g. if the entire cluster fails. Of course, there are ways around it, such as rack-aware replication or cross-DC replication. The only way to close this window would be to perform a synchronous data or log flush to storage (assuming the storage is reliable enough), but this would be highly detrimental to performance.

Most NoSQL databases do NOT perform a synchronous log or data flush, including Cassandra, which does an asynchronous commit log flush and has commitlog_sync_period_in_ms set to 10 seconds by default.

HTH – R