How do I handle a planned network maintenance between XDR source and destination?


#1

Problem Description

In the event of XDR connectivity issues between a source cluster (S) and one of several destination clusters (D1), cross datacenter replication to the other destination clusters can also be impacted, causing the overall lag to increase.

Explanation

It’s important to understand and review the architecture of XDR before deploying in production.

XDR ships records in lockstep across all destination clusters. A failure to ship a record to a single destination forces a relog and a subsequent attempt to ship the record to all destination clusters. Not only do errors cause XDR to throttle, but relogged records are also unnecessarily re-shipped to destination clusters that may already have received them. Similarly, a slowdown on the link between the source cluster and one of its destinations slows down shipping to all destinations: XDR is only as fast as its slowest destination. Therefore, in the case of a temporary network slowdown (planned or not), or any other issue impacting the normal shipment of records to a particular destination cluster, it may be necessary to drop that destination cluster so that XDR can keep shipping to the other, healthy destinations until the issue is remediated.

Solution

IMPORTANT NOTE: forcing a cluster into CLUSTER_DOWN state should always be carefully considered, as the digestlog will continue to grow in order to retain the entries to be processed once the cluster can receive records again:

  • As the digest log grows, the reclaim needs to search through more pages to find the global last ship time and move the start pointer. This happens every minute and will increase disk IO and slow down the digestlog reclamation process.
  • If a node at the source gets restarted while there is a lag, XDR starts processing entries from the start pointer even if it does not ship all of them, causing extra strain.
  • Forcing a destination cluster into CLUSTER_DOWN could cause stop_writes if xdr-min-digestlog-free-pct has been configured.
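Because of the points above, it is worth keeping an eye on the digestlog usage and the XDR lag on each source node while a destination is down. The snippet below is a minimal sketch; the exact statistic names (dlog and timelag counters are assumed here) vary across server versions, so check the full output of the statistics info command on your version:

```shell
# Print the server statistics one per line and keep only the
# XDR digestlog usage and lag counters (names vary by version).
asinfo -v "statistics" -l | grep -Ei "dlog|timelag"
```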

In the following, we assume that the unhealthy link is between S and D1, or that D1 is encountering some sort of temporary outage; the other destinations are D2, D3, etc. The different options for handling this situation are as follows:

For versions 4.1 and above, the force-linkdown command can be used:

    asinfo -v "xdr-command:force-linkdown=true;dc=D1"

XDR will treat D1 as being in CLUSTER_DOWN state and will trigger window shipping[1] when force-linkdown is set back to false, ensuring that the records written at the source while D1 was in CLUSTER_DOWN state are picked up by the window shipper thread.
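As a sketch of the full maintenance sequence (assuming version 4.1 or above and a DC named D1; run the commands on every source node):

```shell
# Before the maintenance window: force the link to D1 down so that
# shipping to the healthy DCs is not held back.
asinfo -v "xdr-command:force-linkdown=true;dc=D1"

# Optionally verify the DC state (expect CLUSTER_DOWN).
asinfo -v "dc/D1"

# After the maintenance window: restore the link; window shipping
# will catch up on the records written in the meantime.
asinfo -v "xdr-command:force-linkdown=false;dc=D1"
```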

For versions prior to 4.1, you can use one of the following three methods:

1. Dynamically disassociate D1 as a remote destination for the relevant namespaces and un-seed all of its nodes from the source:

    asinfo -v "set-config:context=xdr;xdr-shipping-enabled=false"
    asinfo -v "set-config:context=namespace;id=<NAMESPACE>;xdr-remote-datacenter=D1;action=remove"
    asinfo -v "set-config:context=xdr;xdr-shipping-enabled=true"

Note: suspending shipping is a workaround for AER-5718. It is not necessary if you are running the latest 3.13, 3.14 or 3.15+ releases.

For newer releases, if no other namespace is associated with the DC, then the DC will be in INACTIVE state. For older releases, you will need to remove each node that was seeded:

    asinfo -v "set-config:context=xdr;dc=D1;dc-node-address-port=xx.xx.xx.xx:3000;action=remove"

This causes the source to treat the destination as INACTIVE and stop attempting to ship to it. Records written after D1 was removed will be missing from D1 even after it comes back. State can be restored by backing up S and restoring to D1 after the cluster comes back up.
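Once the maintenance is over, D1 can be re-associated by reversing the commands above. This is a sketch assuming the same namespace and node addresses, and that your server version supports dynamic action=add as it does action=remove; records written in the meantime still need to be restored from a backup of S:

```shell
# Re-seed the destination node(s) that were removed.
asinfo -v "set-config:context=xdr;dc=D1;dc-node-address-port=xx.xx.xx.xx:3000;action=add"

# Re-associate the namespace with D1.
asinfo -v "set-config:context=namespace;id=<NAMESPACE>;xdr-remote-datacenter=D1;action=add"
```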

2. Bring down D1 by shutting down Aerospike service on all the cluster nodes.

XDR will treat D1 as being in CLUSTER_DOWN state and will trigger window shipping[1] when it comes back up, ensuring that the records written at the source while D1 was down are picked up by the window shipper thread when the cluster is restarted. The obvious drawback is that the D1 cluster is also unavailable to other clients.

3. Use iptables to cut the connection between S and D1.

Impact: XDR will treat D1 as being in CLUSTER_DOWN state and will trigger window shipping when the iptables rules are removed (after the outage or maintenance window), ensuring that the records written while the rules were in place will eventually be shipped to D1.

Example rules for the destination:

    iptables -I INPUT -p tcp --dport 3000 -d 10.xxx.y.zz/32 -j REJECT
    iptables -I INPUT -p tcp --dport 3000 -s 10.yyy.z.xx/32 -j REJECT
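When the maintenance window is over, the same rules can be deleted by replacing -I (insert) with -D (delete), keeping the rule specification identical:

```shell
# Remove the blocking rules inserted above, restoring XDR traffic.
iptables -D INPUT -p tcp --dport 3000 -d 10.xxx.y.zz/32 -j REJECT
iptables -D INPUT -p tcp --dport 3000 -s 10.yyy.z.xx/32 -j REJECT
```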


Note: you always want to use REJECT for a quick result. iptables REJECT actively responds to the caller with icmp-port-unreachable by default, allowing the host to immediately act on the unreachable connection. DROP, on the other hand, simulates 100% packet loss: packets are silently discarded, and it can take a while for the caller to realise the host is down, as connections simply time out without a response.

Reference

[1] More details on handling local node loss and remote node loss:

Keywords

XDR destination force link down DC

Timestamp

06/28/2018

