Is there a limit on replication factor? Can a large replication factor cause issues?

ashishbhutani · September 30, 2015, 8:47pm

Hi

We have a setup in which all of our data is stored on every node, mainly to handle hotspotting. To achieve this, we keep the replication factor equal to the number of nodes.

Our cluster will have around 15-20 nodes, and our operations will be mostly reads (99:1 read/write ratio).

I just wanted to check whether it’s advisable to have such a large replication factor. Can it cause any problems? We are mainly concerned about stability problems such as crashes, loss of cluster integrity, etc.

Could you shed some light on this? Many thanks.

kporter · September 30, 2015, 11:11pm

We have a production user doing full replication in a 5-node cluster, and it has been running this way for about a year. In their case, the dataset replicated this way is only updated once a day. I’m not aware of any stability issues with large replication factors, but it is a far less commonly deployed configuration.

ashishbhutani · October 2, 2015, 7:56am

Thanks @kporter

What’s the expected behavior when I bring down a few nodes in such a setup? Say I have 15 nodes with a replication factor of 15, and then 2 nodes are brought down. Should that trigger any migrations? I am not seeing any in my setup after bringing down 2 nodes, and I can’t tell whether that’s the correct behavior.

kporter · October 5, 2015, 7:00pm

Interesting, I hadn’t really considered this before, but yes, I wouldn’t expect migrations when you remove a node from a fully replicated cluster: every remaining node already holds a copy of every partition. When you add these nodes back there will be migrations; the amount of migration could be reduced by wiping these nodes before they return.
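To make the reasoning above concrete, here is a rough Python sketch of a toy placement model. It is not Aerospike's actual partition algorithm: the `succession` ordering is a hypothetical hash-based stand-in, and returning nodes are treated as empty (that is, wiped). The only figure taken from Aerospike is the fixed count of 4096 partitions per namespace. The sketch counts how many partition copies would have to be shipped to nodes that did not previously hold them.

```python
import hashlib

N_PARTITIONS = 4096  # Aerospike uses a fixed 4096 partitions per namespace


def succession(pid, nodes):
    """Deterministic per-partition node ordering (hypothetical stand-in)."""
    def key(node):
        return hashlib.sha256(f"{pid}:{node}".encode()).hexdigest()
    return sorted(nodes, key=key)


def replica_map(nodes, rf):
    """Map each partition to the set of nodes holding one of its replicas."""
    count = min(rf, len(nodes))  # cannot have more replicas than live nodes
    return {pid: set(succession(pid, nodes)[:count])
            for pid in range(N_PARTITIONS)}


def inbound_copies(before, after):
    """Partition copies that must be sent to nodes that lacked them."""
    return sum(len(after[pid] - before[pid]) for pid in after)


nodes = [f"node{i}" for i in range(15)]
survivors = nodes[:-2]  # two nodes are brought down

# Fully replicated cluster: RF equals the node count.
full_before = replica_map(nodes, 15)
full_after = replica_map(survivors, 15)

# Conventional cluster with RF 2, for comparison.
rf2_before = replica_map(nodes, 2)
rf2_after = replica_map(survivors, 2)

full_removal = inbound_copies(full_before, full_after)
rf2_removal = inbound_copies(rf2_before, rf2_after)
full_rejoin = inbound_copies(full_after, full_before)

print("RF=15, remove 2 nodes:", full_removal)
print("RF=2,  remove 2 nodes:", rf2_removal)
print("RF=15, re-add 2 empty nodes:", full_rejoin)
```

In this model, removal under full replication needs no inbound copies because the new replica set of every partition is a subset of the old one, while the RF 2 cluster has to re-replicate the partitions that lost a copy. Re-adding the two nodes requires one copy of every partition per returning node, which matches the observation that migrations show up when the nodes come back rather than when they leave.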
