Hi. We’re currently using Aerospike CE in a high-throughput environment and are seeing issues with timeouts and a steadily increasing number of client connections.
We recently compared the number of connections the webheads report with the number the Aerospike server reports, and the two are wildly different, as I’ll demonstrate below.
Server info:
PHP Version:
PHP 5.6.27 (cli) (built: Oct 15 2016 21:31:59)
Copyright (c) 1997-2016 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2016 Zend Technologies
with Zend OPcache v7.0.6-dev, Copyright (c) 1999-2016, by Zend Technologies
Client version:
aerospike
aerospike support => enabled
aerospike version => 3.4.14
server version is 3.10.0
The number of client connections currently stands at around 36k.
The config is set with .shm=true
and connections are persisted.
This is the command I’m running on the Aerospike server right now:
$ lsof -i -a -p 3637 | grep prdweb14 | wc -l
This returns the following value:
3494
If I then run the following command on one of our webheads (the same one referenced above, prdweb14), this is what I get:
$ for i in $(ps -ef | grep php-fpm | awk '{print $2}'); do lsof -i -a -p $i | grep aer01; done | wc -l
This yields:
37
If I stop php-fpm and check again, the value on Aerospike server 01 only goes down by about 20, to 3474. When I start php-fpm again, it goes back up by 20 plus a few, to 3496 for example, while prdweb14 reports around 22.
We see this right across the board: a massive discrepancy between the connection counts on the Aerospike servers and on the webheads. It would also account for the constantly rising connection count we see in Aerospike AMC. In addition, we are seeing timeouts even with a timeout value of 5 seconds, which I wouldn’t expect.
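To narrow down where the extra server-side connections come from, the lsof output can be broken down by TCP state. The following is only a minimal sketch: count_by_state is a hypothetical helper name, and it assumes lsof prints the TCP state in parentheses as the last field of each line, for example (ESTABLISHED) or (CLOSE_WAIT).

```shell
# Hypothetical helper: summarise `lsof -i` output read from stdin by TCP state.
# Assumes the state is the parenthesised last field of each TCP line.
count_by_state() {
  awk '$NF ~ /^\(.*\)$/ { n[$NF]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Usage on the Aerospike server (PID 3637 and host prdweb14 are from this post):
#   lsof -i -a -p 3637 | grep prdweb14 | count_by_state
```

If most of the roughly 3,500 entries turn out not to be in the ESTABLISHED state, that would suggest sockets the webhead has already dropped but the server still holds open. That is a guess on my part, not something I have confirmed.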
Is this a known, acknowledged issue?
If you need anything further, let me know and I can provide the data.
Thanks in advance for any help.