FAQ - Why is quiescing a node not working as expected?
When following the standard quiesce operating procedure, clients may still get exceptions when the node is shut down, defeating the purpose of quiescing.
One potential cause is a client that fails to properly refresh its partition map. The stale partition map keeps directing requests to the quiesced node, which proxies them to the new master; those proxied transactions then turn into errors when the quiesced node is shut down. It is therefore important to check the network and rule out issues with older client library versions.
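The quiesce flow described above, together with a pre-shutdown check that client traffic has actually drained, can be sketched as follows. This is a hedged sketch: host names, the port, and the namespace name (`test`) are placeholders for your own cluster.

```shell
# Hedged sketch of the standard quiesce flow and a pre-shutdown check.
# Host names, port, and the namespace name ("test") are placeholders.

# 1. Quiesce the target node.
asinfo -h node-to-remove -p 3000 -v "quiesce:"

# 2. Trigger a rebalance so the quiesced node hands off master-hood.
asinfo -h any-node -p 3000 -v "recluster:"

# 3. Before shutting down, confirm the quiesced node has drained:
#    the proxy counters should stop increasing between samples.
asinfo -h node-to-remove -p 3000 -v "namespace/test" \
  | tr ';' '\n' | grep -E 'client_proxy_(complete|error|timeout)'
```

If the proxy counters keep climbing, some clients are still routing to the quiesced node through a stale partition map and shutting it down will surface errors.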
There is another situation where a quiesced node would still be hit: a rack-aware cluster whose clients use the PREFER_RACK ReplicaPolicy. Against older server versions, such clients keep reading from the quiesced node, which proxies to the correct node, resulting in errors and connection exceptions when it is shut down.
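For reference, the client-side configuration involved is sketched below using the Aerospike Java client. This is a minimal sketch, assuming placeholder host names and rack IDs; it shows the policy combination under discussion, not a recommended production setup.

```java
// Hedged sketch: the rack-aware client configuration discussed above.
// Host name, port, and rack ID are placeholders.
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Host;
import com.aerospike.client.policy.ClientPolicy;
import com.aerospike.client.policy.Policy;
import com.aerospike.client.policy.Replica;

public class RackAwareExample {
    public static void main(String[] args) {
        ClientPolicy clientPolicy = new ClientPolicy();
        clientPolicy.rackAware = true; // enable rack awareness
        clientPolicy.rackId = 1;       // this client's rack

        // Reads prefer replicas on the client's own rack. Against older
        // server versions, this is the policy that can keep reads pinned
        // to a quiesced node, which then proxies to the correct node.
        Policy readPolicy = new Policy();
        readPolicy.replica = Replica.PREFER_RACK;

        AerospikeClient client = new AerospikeClient(
            clientPolicy, new Host("aerospike-host", 3000));
        // ... perform reads with readPolicy ...
        client.close();
    }
}
```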
Version 220.127.116.11 addresses this issue:
[AER-6078] - (BALANCE) Working masters (also) report ownership of appropriate replicas in client partition map (e.g. to optimize rack-aware client reads in certain situations).
In addition, there are a couple of issues related to quiescing causing redundant migrations:
[AER-6012] - (MIGRATION) For AP namespaces, there may be redundant migrations when quiescing multiple nodes at once and later shutting them down one by one. Fixed in version 18.104.22.168.
[AER-6035] - (BALANCE) For AP namespaces with `prefer-uniform-balance` true, there may be redundant migrations after shutting down a quiesced node. Fixed in version 22.214.171.124.
Keywords: QUIESCE, MIGRATION, CLIENT, EXCEPTION