Discrepancy between FPM connections and Aerospike connection


#1

Hi. We’re currently using Aerospike CE in a high-throughput environment and are seeing a few issues with timeouts and increasing client connections.

Recently we compared the number of connections the webheads report with the number the Aerospike server reports. The two are wildly different, as I’ll demonstrate:

Server info:

PHP Version:

PHP 5.6.27 (cli) (built: Oct 15 2016 21:31:59) 
Copyright (c) 1997-2016 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2016 Zend Technologies
    with Zend OPcache v7.0.6-dev, Copyright (c) 1999-2016, by Zend Technologies

Client version:

aerospike

aerospike support => enabled
aerospike version => 3.4.14

Server version is 3.10.0, and the number of client connections currently stands at around 36k.

Config is set with .shm=true and connections are persisted.

As this stands right now, this is the command I’m running on Aerospike server:

$ lsof -i -a -p 3637 | grep prdweb14 | wc -l

this returns the following value:

3494

If I then run the following command on one of our webheads (the same one referenced above - prdweb14) this is what I get:

$ for i in $(ps -ef | grep php-fpm | awk '{print $2}'); do lsof -i -a -p $i | grep aer01; done | wc -l

yields:

37

If I stop php-fpm and check again, the value from Aerospike server 01 only goes down by about 20, to 3474. When I start FPM again, it goes back up by 20 plus a few, to 3496 for example, while prdweb14 reports around 22. We see this right across the board: a massive discrepancy between the Aerospike and webhead numbers, which would also account for the constantly rising connection count we see in Aerospike AMC. In addition, we are seeing timeouts with a timeout value of 5 seconds, which I wouldn’t expect.

Is this something which has been acknowledged and is a known issue?

Anything further you need let me know and I can provide the data.

Thanks in advance for any help.

[poke: @rbotzer and @kporter]


#2

Hey @Crags - there’s work being done right now on persistent connections, which you’re using. Once the next release of the PHP client is out please check it out and give your feedback.


#3

Thanks @rbotzer - hopefully it’ll resolve the issues we’re seeing. We have set up a stage environment which is running PHP7 and this seems to be more accurate, but hasn’t been running long enough to see if this is an issue over time or not.

Any ETA on the next client release at all?

Appreciate the reply :slight_smile:


#4

Hi @Crags. What type of Aerospike operations are being performed on your webheads?


#5

Also, does running tcptrack for port 3000 on your Aerospike server give any insight into the connection state and idle time of the connections?
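
If tcptrack isn’t to hand, a rough sketch like the one below would at least break the lsof output down by TCP state, which might show whether the extra server-side connections are ESTABLISHED or stuck in something like CLOSE_WAIT. The function name summarise_states is purely illustrative, and it assumes lsof prints the state in parentheses as the last field of each TCP line, which is its usual format.

```shell
#!/bin/sh
# Sketch only: summarise TCP states from `lsof -i` style output read on stdin.
# $1 is a substring identifying the peer (e.g. aer01 or prdweb14).
# summarise_states is a hypothetical helper name, not part of any tool.
summarise_states() {
  awk -v host="$1" '
    # keep lines mentioning the peer whose last field looks like "(STATE)"
    index($0, host) && $NF ~ /^\(.*\)$/ { n[$NF]++ }
    END { for (s in n) print n[s], s }
  ' | sort -rn
}

# Possible usage on the Aerospike server (3637 is the asd PID from post #1):
#   lsof -i -a -p 3637 | summarise_states prdweb14
# Possible usage on a webhead, covering all php-fpm workers in one call:
#   lsof -i -a -p "$(pgrep -d, php-fpm)" | summarise_states aer01
```

Comparing the two summaries for the same webhead should make it clearer which side is holding on to sockets the other side no longer knows about.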


#6

Okay, so far this issue seems to be resolved: if we kill FPM on one machine and check connections to and from Aerospike, both sides report 0, so it looks like the server / client is no longer holding connections.

Thanks everyone