Aql client overwhelmed by "WARN AEROSPIKE_ERR_TIMEOUT"

pgupta · February 7, 2017, 6:19am

Looking at the netstat options on Linux, I would try: sudo netstat -tulpn | grep :6000

billbargens · February 7, 2017, 6:23am

On nodeB:

sudo netstat -tulpn | grep :6000
tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN 104042/asd

also:

netstat -anlp | grep 172.17.42
tcp 0 0 172.17.42.1:5000 10.168.17.12:32162 SYN_RECV -
tcp 0 0 172.17.42.1:5000 10.168.17.12:32947 SYN_RECV -
tcp 0 0 172.17.42.1:5000 10.168.17.12:32558 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:13871 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:13089 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:13475 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:14251 SYN_RECV -
tcp 0 1 ::ffff:172.17.42.1:33337 ::ffff:172.17.42.1:5000 SYN_SENT -
tcp 0 1 ::ffff:172.17.42.1:14640 ::ffff:172.17.42.1:6000 SYN_SENT -
udp 0 0 172.17.42.1:123 0.0.0.0:* -

pgupta · February 7, 2017, 6:25am

Just run sudo netstat -tulpn without the grep and see if you can find the PID associated with this rogue IP.

billbargens · February 7, 2017, 6:31am

sudo netstat -tulpn > netstat.log
grep "172.17.42.1" netstat.log
udp 0 0 172.17.42.1:123 0.0.0.0:* 92374/ntpd

pgupta · February 7, 2017, 6:46am

Which process is showing the SYN_RECV state with this rogue IP? What is in the PID/Program Name column?

billbargens · February 7, 2017, 6:56am

It seems that no program name is displayed!

sudo netstat -anlp | grep 172.17.42
tcp 0 0 172.17.42.1:5000 10.168.17.12:58841 SYN_RECV -
tcp 0 0 172.17.42.1:5000 10.168.17.12:59256 SYN_RECV -
tcp 0 0 172.17.42.1:5000 10.168.17.12:58444 SYN_RECV -
tcp 0 0 172.17.42.1:5000 10.168.17.12:59664 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:40102 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:40936 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:39717 SYN_RECV -
tcp 0 0 172.17.42.1:6000 10.168.17.12:40515 SYN_RECV -
udp 0 0 172.17.42.1:123 0.0.0.0:* 92374/ntpd
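
To see at a glance which local address and port the half-open connections pile up on, the netstat output can be summarized. This is a minimal sketch, assuming the usual Linux netstat column layout (local address in column 4, TCP state in column 6); count_syn_recv is a hypothetical helper name, not part of any tool:

```shell
# Count half-open (SYN_RECV) connections per local address:port.
# Reads netstat-style lines on stdin; column 4 is assumed to be the
# local address and column 6 the TCP state.
count_syn_recv() {
    awk '$6 == "SYN_RECV" { n[$4]++ } END { for (a in n) print n[a], a }' | sort -k2
}

# Possible use on a live host:
#   netstat -ant | count_syn_recv
```

Each output line is a count followed by the local address:port, so a large count on the Docker bridge address would stand out immediately.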

billbargens · February 7, 2017, 7:01am

I recall that before the abnormal aql exceptions happened, I deployed Docker on nodeB.

Could that have affected the Aerospike cluster?

pgupta · February 7, 2017, 7:05am

I don’t know Docker, but that IP address is tied to Docker (search for it). I think it is somehow interfering with the Aerospike cluster when 172.17.42.1 gets bound to port 6000, though I don’t quite know how, because asd is listening on any/6000. It may be related to Aerospike’s “reuse-address” setting, which defaults to true. I will research more and let you know if I have other ideas.

The initial discussion at this link may help you shut down Docker and clear the IP address: https://support.zenoss.com/hc/en-us/articles/203582809-How-to-Change-the-Default-Docker-Subnet

kporter · February 7, 2017, 8:26am

Docker does create virtual Ethernet interfaces, and that is likely where nodeB learned the address. There isn’t a way to purge the advertised address without restarting the node. Obviously this assumes the interface is gone; run ip addr to verify.

If you do not need the clients to connect over both interfaces, consider setting the access-address configuration I mentioned, before restarting.
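
For illustration, an access-address entry in the service section of aerospike.conf might look roughly like the fragment below. This is only a sketch: the port is assumed from the netstat output in this thread, and the address shown is a hypothetical placeholder for the node's routable IP, not a value taken from this cluster.

```
network {
    service {
        address any
        # Port assumed from the netstat output in this thread.
        port 6000
        # Hypothetical placeholder: replace with the node's routable IP.
        access-address 10.168.17.13
    }
}
```

With this set, the node should advertise only that address to clients rather than every interface address it finds.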

billbargens · February 8, 2017, 12:01pm

@pgupta @kporter

OK, the problem is solved by just following the instructions in https://support.zenoss.com/hc/en-us/articles/203582809-How-to-Change-the-Default-Docker-Subnet

Only 3 commands:

iptables -t nat -F POSTROUTING
ip link set dev docker0 down
ip addr del 172.17.42.1/16 dev docker0
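
To check that the address is really gone from the host's interfaces afterwards (the ip addr check suggested earlier in the thread), something like the following could be used. It is a sketch, assuming the one-line format of ip -o -4 addr show, where the fourth field is address/prefix; addr_present is a hypothetical helper name:

```shell
# Succeed (exit 0) if the IPv4 address given as $1 appears in the
# `ip -o -4 addr show` output supplied on stdin, fail (exit 1) otherwise.
addr_present() {
    awk -v ip="$1" '
        { split($4, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }
    '
}

# Possible use on a live host:
#   ip -o -4 addr show | addr_present 172.17.42.1 && echo "still configured"
```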

Thank you so much for your help!
