Client_write_error

Hello, I’ve got a map operation that attempts to remove an entry from a bin of type Map. If the record doesn’t exist, I get an exception with error code ResultCode.KEY_NOT_FOUND_ERROR.

Such a scenario is normally acceptable from my business case perspective, so my app catches the error and proceeds. The only obstacle I face at the moment is a monitoring system which alerts me every time the server-side metric client_write_error value increases.

According to the Knowledge Base, there are a number of reasons for the metric value to grow, so it would be a bad idea to stop monitoring it altogether.

Is there a way to ask the server not to fail the request when a record doesn’t exist, and not to increment the client_write_error metric in cases where I expect the operation to fail?

Is there a way to distinguish cases where something is outside my control (write queue depth exceeded, record too big, stop-writes threshold exceeded, etc.) from normal cases where I expect a failure to happen?

Would you please recommend the best practices to monitor this metric?

Cheers, Mike.


Good question, or I guess questions. As you pointed out, your application should have the most specific error code and message and should be able to distinguish between the different situations (device overload, capacity issue – stop writes, key busy, etc…). So to the question on distinguishing something that is out of control of the app, that should already be in place. Or is there a specific error code/message that is mixing things up?

From a monitoring perspective within the server itself, one would have to look at the other metrics you are pointing out (stop_writes true/false, write_q metrics, etc…). It would be too expensive (impacting performance) to have specific separate metrics for each individually.

Finally, I believe there are client-side policies that let some CDT operations not fail, but I am not sure whether they affect whether the client_write_error metric is incremented. We certainly don’t have client policies that directly control whether an xxxx_error metric is ticked up.

Thank you @meher !

Yes, there is no problem whatsoever distinguishing error codes on the client side.

My main concern was that the client_write_error counter does not differentiate between error codes, and it is the main metric we look at to monitor client-to-server communication issues.

Thank you, that seems to be the only option for us.

I thought it could be enhanced by introducing a hashmap that tracks client_write_error per error code, allowing an ops engineer to rely on the specific set of codes he/she is interested in, couldn’t it? That’s how it’s already done for the write_q counter, which tracks the write queue value per device.
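To illustrate the idea, here is a minimal client-side sketch of such a per-code breakdown. It is only an approximation kept inside the application, not a server feature, and `WriteErrorCounter` with its methods is a hypothetical helper that is not part of the Aerospike client. It assumes the application passes in the integer result code of each failed write and a set of codes it considers expected.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper (not an Aerospike API): counts failed writes per result code
// so that "expected" failures can be excluded from alerting.
class WriteErrorCounter {
    private final Map<Integer, LongAdder> countsByCode = new ConcurrentHashMap<>();
    private final Set<Integer> expectedCodes;

    // expectedCodes: result codes the application treats as normal outcomes.
    WriteErrorCounter(Set<Integer> expectedCodes) {
        this.expectedCodes = Set.copyOf(expectedCodes);
    }

    // Record one failed write with the given result code.
    void record(int resultCode) {
        countsByCode.computeIfAbsent(resultCode, k -> new LongAdder()).increment();
    }

    // Number of failures seen so far for a single result code.
    long count(int resultCode) {
        LongAdder adder = countsByCode.get(resultCode);
        return adder == null ? 0 : adder.sum();
    }

    // Sum of failures whose code is NOT in the expected set; this is the value to alert on.
    long unexpectedTotal() {
        long total = 0;
        for (Map.Entry<Integer, LongAdder> entry : countsByCode.entrySet()) {
            if (!expectedCodes.contains(entry.getKey())) {
                total += entry.getValue().sum();
            }
        }
        return total;
    }
}
```

In the app, the catch block for AerospikeException would call something like `counter.record(e.getResultCode())`, with ResultCode.KEY_NOT_FOUND_ERROR in the expected set, and the monitoring system would watch `unexpectedTotal()` exported as an application metric instead of the raw server counter.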

Unfortunately, there is no way to explicitly pass flags to the MapOperation#removeByKey method, unlike the MapOperation#put method, which accepts a policy. It seems like a really nice-to-have feature, doesn’t it?

Thank you, Mike.


Yes, that may be useful indeed. Let me pass the feedback!