Client_write_error

Mikhail_Pryakhin · April 10, 2020, 2:15pm

Hello, I’ve got a map operation that attempts to remove an entry from a bin of type Map. If the record doesn’t exist, I get an exception with error code ResultCode.KEY_NOT_FOUND_ERROR.

Such a scenario is normally acceptable from my business case perspective, so my app catches the error and proceeds. The only obstacle I face at the moment is a monitoring system which alerts me every time the server-side metric client_write_error value increases.

According to the Knowledge Base, there are a number of reasons for the metric value to grow, so it would be a bad idea to stop monitoring it altogether.

Is there a way to ask the server not to fail the request when a record doesn’t exist, and not to increment the client_write_error metric in cases where I expect the operation to fail?

Is there a way to distinguish cases where something is out of my control (like write queue depth exceeded, record too big, stop-writes threshold exceeded, etc.) from normal cases where I expect a failure to happen?

Would you please recommend the best practices to monitor this metric?

Cheers, Mike.

meher · April 14, 2020, 11:19pm

Good question, or I guess questions. As you pointed out, your application should have the most specific error code and message and should be able to distinguish between the different situations (device overload, capacity issue – stop writes, key busy, etc…). So to the question on distinguishing something that is out of control of the app, that should already be in place. Or is there a specific error code/message that is mixing things up?
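
A minimal sketch of that client-side distinction. The class and method names are hypothetical and not part of the Aerospike client. The numeric values are assumed to mirror the Java client's ResultCode constants (18 is the device overload code). In a real application the input would come from AerospikeException#getResultCode().

```java
import java.util.Set;

// Hypothetical helper, not part of the Aerospike client library.
final class WriteErrorClassifier {
    // Assumed to mirror com.aerospike.client.ResultCode values.
    static final int KEY_NOT_FOUND_ERROR = 2;
    static final int RECORD_TOO_BIG = 13;
    static final int KEY_BUSY = 14;
    static final int DEVICE_OVERLOAD = 18;

    // Codes the business logic treats as a normal outcome (an assumption
    // for this thread's use case; adjust the set to your own).
    private static final Set<Integer> EXPECTED = Set.of(KEY_NOT_FOUND_ERROR);

    // True when the failure is an acceptable outcome and should not alert.
    static boolean isExpected(int resultCode) {
        return EXPECTED.contains(resultCode);
    }

    public static void main(String[] args) {
        int[] codes = {KEY_NOT_FOUND_ERROR, RECORD_TOO_BIG, KEY_BUSY, DEVICE_OVERLOAD};
        for (int code : codes) {
            String verdict = isExpected(code) ? "expected, proceed" : "unexpected, alert";
            System.out.println("result code " + code + ": " + verdict);
        }
    }
}
```

This only classifies errors inside the application. It does not change what the server counts in client_write_error.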

From a monitoring perspective within the server itself, one would have to look at the other metrics you are pointing out (stop_writes true/false, write_q metrics, etc…). It would be too expensive (impacting performance) to have specific separate metrics for each individually.

Finally, I believe there are client side policies to have some CDT operations not fail but I am not sure whether they would inform whether or not to increase the client_write_error metric. We certainly don’t have client policies that would directly inform whether an xxxx_error metric should be ticked up or not.

Mikhail_Pryakhin · April 15, 2020, 10:54am

Thank you @meher !

Yes, there is no problem whatsoever with distinguishing error codes on the client side.

My main concern was that the client_write_error counter does not differentiate between error codes, and it is the main metric we look at to monitor client-to-server communication issues.

Thank you, that seems to be the only option for us.

I thought it could be enhanced by introducing a hashmap to track client_write_error per error code, allowing an ops engineer to rely on the specific set of codes they are interested in, couldn’t it? It’s done perfectly for the write_q counter, which tracks the write queue value per device.
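
Until something like that exists on the server, the same idea can be approximated in the application. The sketch below is a client-side tally keyed by result code. ErrorCodeTally and its methods are hypothetical names, and the codes fed to it are assumed to be the integers returned by AerospikeException#getResultCode().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical client-side counter; the server exposes no such breakdown.
final class ErrorCodeTally {
    // One counter per result code, safe for concurrent writers.
    private final Map<Integer, LongAdder> counts = new ConcurrentHashMap<>();

    // Call from the catch block that handles a failed write.
    void record(int resultCode) {
        counts.computeIfAbsent(resultCode, k -> new LongAdder()).increment();
    }

    // Number of failures seen so far for one result code.
    long count(int resultCode) {
        LongAdder adder = counts.get(resultCode);
        return adder == null ? 0L : adder.sum();
    }

    // Sum over all codes, comparable in spirit to a single error counter.
    long total() {
        return counts.values().stream().mapToLong(LongAdder::sum).sum();
    }
}
```

An alert can then be defined on the codes that matter, for example on count(18) for device overload, while key-not-found failures are still recorded but ignored.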

Unfortunately, there is no way to explicitly pass Flags to MapOperation#removeByKey method, unlike MapOperation#put method which allows the policy to be passed. It seems to be a really nice to have feature, doesn’t it?

Thank you, Mike.

system · April 21, 2020, 10:54am

This topic was automatically closed 6 days after the last reply. New replies are no longer allowed.

meher · April 21, 2020, 10:26pm

Yes, that may be useful indeed. Let me pass the feedback!
