Improving Read and Write Performance During Migrations

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Migrations increase write load, which in turn increases read/write latency. The added latency comes from the server running “on-demand” duplicate resolution to maintain consistency. If latency is critical and consistency requirements can be relaxed, you can disable “on-demand” duplicate resolution to improve read/write performance.

To disable duplicate resolution, run the following command on all the nodes:

asinfo -v "set-config:context=service;write-duplicate-resolution-disable=true" -h [host ip]
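Since the command has to be run against every node, a small loop can help. The following is only a minimal sketch: the node addresses in NODES are hypothetical placeholders, and the DRY_RUN switch and the apply_to_node helper are illustrative names, not part of any Aerospike tool. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical node list; replace with the addresses of your own cluster nodes.
NODES="10.0.0.1 10.0.0.2 10.0.0.3"

# Dry run by default: print each command instead of executing it.
# Run with DRY_RUN=0 to actually call asinfo.
DRY_RUN="${DRY_RUN:-1}"

# Same set-config string as in the command above.
SET_CMD="set-config:context=service;write-duplicate-resolution-disable=true"

# Apply (or print) the setting for a single host.
apply_to_node() {
    host="$1"
    if [ "$DRY_RUN" = "1" ]; then
        printf 'asinfo -v "%s" -h %s\n' "$SET_CMD" "$host"
    else
        asinfo -v "$SET_CMD" -h "$host"
    fi
}

for host in $NODES; do
    apply_to_node "$host"
done
```

Reviewing the dry-run output before setting DRY_RUN=0 is a cheap way to confirm that every node is covered.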

Hello, guys!

Could you explain the feature in more detail, please? You also mentioned reading slightly outdated records; how can I measure that time lag?

Regards

As of Aerospike 3.7.3 (AER-4542), the performance impact of write duplicate resolution has been significantly reduced.

The server has no way to know how old its copy is without reading from other servers that may hold a copy, and that read is exactly what this feature disables. Disabling it can result in new writes being overwritten by incoming migrations, or in new writes overwriting a higher-generation copy on a replica write.

Having any reservations is a strong sign that the cost of disabling this feature is too high.

Thank you for the quick reply, I got it. So a client can receive an inconsistent response until the migration has completed (with write-duplicate-resolution-disable=1).

No, that is different. During migrations clients will be able to read stale data regardless of this setting. Write duplicate resolution, which this setting disables, prevents you from performing an update based on that stale data while a cluster is recovering.

To prevent reading stale data when there are multiple versions of the same partition, see read-consistency-level-override. This is also discussed here: Understanding Consistency Level Overrides.
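As a rough sketch of how such an override might be applied at runtime: this assumes that read-consistency-level-override is a dynamic namespace-level parameter accepting the value all on your server version, which should be checked against the configuration reference first. The namespace name test, the host address, and the override_cmd helper are hypothetical placeholders. The script only prints the command rather than running it.

```shell
#!/bin/sh
# Assumptions: namespace "test" and host 127.0.0.1 are placeholders only.
NAMESPACE="${NAMESPACE:-test}"
HOST="${HOST:-127.0.0.1}"

# Print the asinfo command that would set the override for one namespace.
# Assumes the parameter is dynamic and lives in the namespace context.
override_cmd() {
    ns="$1"
    host="$2"
    printf 'asinfo -v "set-config:context=namespace;id=%s;read-consistency-level-override=all" -h %s\n' "$ns" "$host"
}

override_cmd "$NAMESPACE" "$HOST"
```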

I’m sorry for the late reply.

So the algorithm looks like this: if write-duplicate-resolution-disable=0 and a migration is in progress, then on any update the cluster collects all copies of the record (with the given key) from all the nodes and updates the most recent one (or all of them, depending on write-consistency-level). Please correct me if I am wrong.

I used to think that if write-consistency-level=all or read-consistency-level=all (either one is enough; both are not required), then a client can never read stale data. Is that correct?

I believe these questions are answered here: Understanding Consistency Level Overrides.