Stop Writes

Is there any way to stop writes completely during a network partition of any kind? Is there such an option?

Reads should still be supported.

For writes, I want my system to be CP and for reads it should be AP.

Basically, I don’t want a situation where reading from node A gives a different value than reading from node B. Is there any situation other than a partition in which this can happen?

Unfortunately, we don’t have such an option today. We do want to provide a CP mode in the future with the behavior you describe, but there are no immediate plans for it, so you cannot bank on this feature in the short term.

By default, our clients also send read requests to the master, so in a normal situation the read should not even go to node B. If a read does go to node B, I can think of an extreme scenario in which it returns a different value: a race between readers and writers. The writer updates node A, and the update has yet to replicate to B. If a client reads from both nodes before replication to B completes, it gets different values. This is because Aerospike by default provides only “committed read” (isolation level 1).
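To make the race window concrete, here is a minimal sketch in plain Python. It is a toy model, not Aerospike code: the `Replica` class, the key `"k"`, and the manual "replication" step are all assumptions made for illustration.

```python
class Replica:
    """Toy in-memory replica holding one value per key (hypothetical, not Aerospike)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def read(self, key):
        return self.data.get(key)


node_a = Replica("A")  # master for the key
node_b = Replica("B")  # replica of the key

# Both nodes start in sync.
node_a.data["k"] = "old"
node_b.data["k"] = "old"

# The writer updates the master; replication to B has not happened yet.
node_a.data["k"] = "new"

# A reader that hits both nodes inside this window sees two different values.
before = (node_a.read("k"), node_b.read("k"))

# Replication completes and the window closes.
node_b.data["k"] = node_a.data["k"]
after = (node_a.read("k"), node_b.read("k"))

print(before)  # ('new', 'old')
print(after)   # ('new', 'new')
```

The disagreement only exists between the write on A and the end of replication to B, which is why it is described as an extreme scenario.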

You can overcome this situation if you are willing to pay the cost and consider “repeatable read” more important. The client API takes a “consistency level” parameter in the policy (see the Java doc). If you choose the consistency level “all”, all replicas are consulted on every read and the latest version is returned. Obviously, this comes at a higher cost.
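The idea behind a read that consults all replicas can be sketched as follows. This is a plain-Python illustration of the concept only, not the client API: `Record`, its `generation` counter, `read_one`, and `read_all` are hypothetical names, and the real client may decide which copy is latest differently.

```python
from dataclasses import dataclass


@dataclass
class Record:
    value: str
    generation: int  # hypothetical version counter; higher means newer


def read_one(replicas, key):
    """Read from the first replica only: cheap, but possibly stale."""
    return replicas[0][key]


def read_all(replicas, key):
    """Consult every replica and return the newest copy found."""
    copies = [replica[key] for replica in replicas if key in replica]
    return max(copies, key=lambda rec: rec.generation)


# Node A has the new write; node B has not received it yet.
node_a = {"k": Record("new", 2)}
node_b = {"k": Record("old", 1)}

stale = read_one([node_b, node_a], "k")
latest = read_all([node_b, node_a], "k")

print(stale.value)   # old
print(latest.value)  # new
```

The extra cost is visible in the sketch: `read_all` touches every replica on every read, while `read_one` touches a single node.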

Thank you @sunil for the reply.

Just for clarification

If there are 7 nodes

A B C D E F G

After Partition:

           |
     A B C | D E F G
           |

Now, since my app layer is not partitioned, it can see both clusters.

Can you please clarify whether writes can be stopped in such a case?

And how does the client behave? Does it send writes/reads to both clusters?

The long-term plan for CP mode is to stop such writes. We cannot stop such writes today.

Today, the following will happen: there will be one master for the partition in the ABC cluster and one master in the DEFG cluster. The client will go to both clusters node by node and ask who the master for this partition is. Whichever node most recently claimed ownership of the partition is remembered. Say the client asks nodes A, B, C, D, E, F, G in that sequence about partition ownership, and B and F both respond that they are master for p1; the client will then remember F as the master for p1.
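The "last positive answer wins" behavior can be sketched in a few lines of Python. This is a simplified model of the description above, not the actual client implementation; `discover_masters` and the `claims` mapping are hypothetical names.

```python
def discover_masters(nodes, claims):
    """Poll nodes in order; the last node to claim a partition is remembered."""
    partition_map = {}
    for node in nodes:
        for partition in claims.get(node, ()):
            # A later claim overwrites an earlier one for the same partition.
            partition_map[partition] = node
    return partition_map


nodes = list("ABCDEFG")              # polling order used by the client
claims = {"B": ["p1"], "F": ["p1"]}  # B (ABC side) and F (DEFG side) both claim p1

partition_map = discover_masters(nodes, claims)
print(partition_map)  # {'p1': 'F'}
```

In this model the outcome depends only on the polling order: if the client asked the nodes in the reverse sequence, it would remember B instead of F.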