Amount of data that is not available when M nodes go down

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Synopsis:

Amount of data that is not available when M nodes go down

Use the following formulas to estimate the portion of data that becomes unavailable when at least as many nodes as the replication factor leave the cluster:

Given –
N = number of nodes in cluster
M = number of nodes lost
R = replication factor
X! = X [factorial](https://en.wikipedia.org/wiki/Factorial)

Formulas

What portion of data is not available?

$$\frac{M!\,(N-R)!}{(M-R)!\,N!}$$

For Replication Factor of 2, this simplifies to:

$$\frac{M\,(M-1)}{N\,(N-1)}$$
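
The Replication Factor 2 form comes from cancelling the factorials in the general formula. A short derivation, using only the symbols defined above with R = 2:

$$\frac{M!\,(N-2)!}{(M-2)!\,N!} = \frac{M!}{(M-2)!}\cdot\frac{(N-2)!}{N!} = \frac{M\,(M-1)}{N\,(N-1)}$$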


If R > M, no data loss is expected, because at least one replica of every record remains on a surviving node.

Explanation

The formula is the number of [permutations](https://en.wikipedia.org/wiki/Permutation) that place all R replicas on the M nodes that go down, divided by the number of permutations that place R replicas across all N nodes in the cluster.
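
The ratio of permutations can be computed directly. The following is a minimal Python sketch, not an Aerospike tool: the function name `unavailable_fraction` and its parameter names are illustrative, and it assumes replicas are spread evenly across nodes, as the formula does. It relies on `math.perm` (Python 3.8+), which returns 0 when R > M, matching the no-data-loss case.

```python
from math import perm


def unavailable_fraction(n: int, m: int, r: int) -> float:
    """Estimate the fraction of data whose r replicas all sit on the m lost nodes.

    n: number of nodes in the cluster
    m: number of nodes lost
    r: replication factor
    (Hypothetical helper for illustration; not part of any Aerospike API.)
    """
    if not (1 <= r <= n and 0 <= m <= n):
        raise ValueError("expected 1 <= r <= n and 0 <= m <= n")
    # perm(m, r) = m! / (m - r)!  and is 0 when r > m
    # perm(n, r) = n! / (n - r)!
    return perm(m, r) / perm(n, r)


if __name__ == "__main__":
    # Two nodes leaving a 10-node cluster with replication factor 2.
    print(f"{unavailable_fraction(10, 2, 2):.4f}")
```

Calling `unavailable_fraction(10, 1, 2)` returns 0.0, because a single lost node cannot hold both replicas.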

Example

If two nodes leave a 10-node cluster with RF=2:

(2 × 1) / (10 × 9) = 2/90 ≈ 0.0222 = 2.2%

About 2.2% of the data is unavailable.
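
As a sanity check on this example, the estimate can be compared against a small random simulation. This is only a sketch under a simplifying assumption: each partition's replicas are placed on R distinct nodes chosen uniformly at random, which is not necessarily how a real cluster assigns partitions. The function name, the partition count, and the trial count are illustrative choices.

```python
import random


def simulate_unavailable_fraction(n, m, r, partitions=4096, trials=50, seed=0):
    """Monte Carlo estimate of the fraction of partitions with every replica lost.

    Assumption (illustrative only): replicas of each partition land on r
    distinct nodes picked uniformly at random from the n nodes.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    nodes = range(n)
    unavailable = 0
    for _ in range(trials):
        down = set(rng.sample(nodes, m))  # the m nodes that leave the cluster
        for _ in range(partitions):
            replicas = rng.sample(nodes, r)  # r distinct nodes for this partition
            if all(node in down for node in replicas):
                unavailable += 1
    return unavailable / (trials * partitions)


if __name__ == "__main__":
    estimate = simulate_unavailable_fraction(10, 2, 2)
    print(f"simulated: {estimate:.4f}, formula: {2 / 90:.4f}")
```

The simulated value should land close to the formula's result, with some sampling noise that shrinks as `trials` grows.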

Timestamp

October 2020
