Our cluster is set to replication-factor 1. How can we remove a node without losing data?
If you run with replication-factor 1, you run a big risk of losing data. Whilst SSDs are very reliable these days, they do occasionally fail, so it is strongly recommended to use RF >= 2. If you are using Aerospike as a cache in front of some other data store, RF = 1 can sometimes work for the use case.
If you can stop write traffic to the node, you can use asbackup with the -l parameter to specify the node to back up. Take a full backup of that node, take the node out of service, and then use asrestore to write the data back into the cluster. The data from that node will be unavailable until it is restored from the backup.
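To make the backup and restore steps concrete, here is a minimal sketch that only builds and prints the two command lines so they can be reviewed before being run by hand. The addresses, namespace name, and backup directory are hypothetical placeholders, and the default port of 3000 is an assumption about your setup. The long options used (`--host`, `--port`, `--namespace`, `--directory`, `--node-list`) should be checked against the asbackup/asrestore version you have installed.

```python
import shlex


def backup_cmd(seed_host, node_addr, namespace, directory, port=3000):
    """Build an asbackup command that backs up only the given node.

    --node-list is the long form of -l; it takes addr:port pairs.
    """
    return [
        "asbackup",
        "--host", seed_host,
        "--port", str(port),
        "--namespace", namespace,
        "--directory", directory,
        "--node-list", f"{node_addr}:{port}",
    ]


def restore_cmd(seed_host, directory, port=3000):
    """Build an asrestore command that writes the backup back into the cluster."""
    return [
        "asrestore",
        "--host", seed_host,
        "--port", str(port),
        "--directory", directory,
    ]


# Hypothetical values: 10.0.0.3 is the node being removed,
# 10.0.0.1 is a node that stays in the cluster.
print(shlex.join(backup_cmd("10.0.0.3", "10.0.0.3", "cache", "/backup/node3")))
print(shlex.join(restore_cmd("10.0.0.1", "/backup/node3")))
```

Run the backup only after write traffic to the node has stopped, and run the restore against a remaining node once the old node has been taken out of service.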
Honestly, I would consider taking a system outage, stopping the nodes and bumping the replication factor to 2. This will offer durability in the face of node outages and allow normal operational procedures like rolling upgrades with no downtime. Of course, you might need additional hardware to accommodate the extra data / network utilization on writes.
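As a rough way to size the extra hardware, the arithmetic can be sketched as below. This is a simplified estimate that assumes data is spread evenly across nodes and ignores index memory, defragmentation headroom, and other overhead; the function name and the 600 GB figure are made up for illustration.

```python
def per_node_storage_gb(unique_data_gb, replication_factor, nodes):
    """Estimate raw data stored per node, assuming an even distribution."""
    if nodes < replication_factor:
        raise ValueError("need at least as many nodes as the replication factor")
    # Every record is stored replication_factor times across the cluster.
    return unique_data_gb * replication_factor / nodes


# Hypothetical cluster: 600 GB of unique data on 3 nodes.
print(per_node_storage_gb(600, 1, 3))
print(per_node_storage_gb(600, 2, 3))
```

Going from RF 1 to RF 2 on the same number of nodes doubles the data each node has to hold, which is where the additional hardware comes in.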
Thanks for your kind help. Yes, we use Aerospike as a cache, so RF=1 is more budget-friendly. I was just reminded of Cassandra's nodetool decommission, and I wonder whether Aerospike could have a similar feature. Technically speaking, I think it is possible; maybe Aerospike can support this in the future for those people using RF=1.
With Enterprise Edition you can just quiesce the node: https://www.aerospike.com/docs/operations/manage/cluster_mng/quiescing_node/
Thank you, but we are on Community Edition.