We are trying to deploy a large Aerospike cluster. We are using php-fpm, and it seems that every PHP thread creates its own connections. We now have 1000+ threads on a single server, which means the cluster ends up with at least that many connections from each server. As a result scaling is hard, since we have nearly 200 of those servers that need to connect to the Aerospike cluster.
I was wondering if we have to enable some specific configuration to reuse/pool connections?
Each PHP process will have to create its own connection to each node in the cluster, as it is a shared-nothing environment. This is similar to the common web-application deployments of Ruby (Rails, Sinatra) or Python (Django, Flask).
On the PHP side, one thing that can be shared is the cluster tending, that is, the thread that checks the cluster for changes. See the documentation for the PHP configuration group aerospike.shm.*.
Otherwise, you may want to use the FPM configuration to limit how many PHP processes you are using to control the connections. On the server side the number of connections that the server handles is tunable (proto-fd-max). In general, the connections are very lightweight - it’s a file descriptor on the server, and doesn’t have high per-connection resource consumption (as opposed to something like an RDBMS that has to have memory allocated for query execution and cursors).
By the way, we are working on an HHVM client that will implement (at first) the same API. With faster execution you should be able to handle more requests with fewer processes, so this issue becomes less relevant. You can probably test and see how many processes you actually need for your case. At the very least, the database operations should make for faster request execution. It is also a good idea to use persistent connections in FPM, so that each process handles more requests before it is shut down and a new one is spun up.
Yea, we had to lower our number of procs. The problem is during peak loads we have actually seen that number shoot up to hundreds, and that worries us. We have also bumped up the proto-fd-max, so we should be fine for now.
We have the same issue with the PHP client.
In our situation, we have 5 Aerospike servers and 10 PHP servers. On each PHP server, we have 500~1000 php-fpm processes.
After we deployed php-client-3.4.3 on the PHP servers, the Aerospike servers showed very high CPU usage (above 1200% on a 24-core machine), even though there are very few active write requests.
To start, if you have that many processes (5000 - 10K) they definitely should be using shared-memory cluster tending (aerospike.shm.use = 1) as well as persistent connections. If you do not use shared memory for the cluster tending, then you’ll have hundreds of processes that each need their own tending thread to check on the state of the cluster every second. With up to 1000 processes per PHP server, that is a lot of tending traffic for the 24 cores on each node to handle every second, on top of any actual KV operations, etc.
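To put rough numbers on the tending load described above, here is a back-of-the-envelope sketch in Python. It assumes, for illustration only, that each tender polls every node once per interval and that shared-memory tending leaves exactly one tender per PHP host; the function name and parameters are hypothetical, and the real client may issue more than one request per tend cycle.

```python
def tend_polls_per_node_per_second(php_hosts, procs_per_host,
                                   shared_memory, tend_interval_s=1.0):
    """Estimate how many tend polls one Aerospike node receives per second.

    Assumption: one poll per tender per node per interval. With shared-memory
    tending a single process per host tends on behalf of all the others.
    """
    tenders_per_host = 1 if shared_memory else procs_per_host
    return php_hosts * tenders_per_host / tend_interval_s


# Figures from the post above: 10 PHP servers, up to 1000 php-fpm processes each.
without_shm = tend_polls_per_node_per_second(10, 1000, shared_memory=False)
with_shm = tend_polls_per_node_per_second(10, 1000, shared_memory=True)

print(f"per-process tending:   {without_shm:.0f} polls/s per node")
print(f"shared-memory tending: {with_shm:.0f} polls/s per node")
```

Under these assumptions the per-node tending traffic drops by a factor equal to the number of processes per host.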
Further, understand that persistent connection means that each FPM process will have one aerospike client, and that costs at least 2 * N connections, where N is the number of nodes in the cluster.
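The 2 * N figure makes it easy to estimate the connection floor before tuning anything. The sketch below is only arithmetic under that stated minimum; the function names are made up for illustration, and the real count can be higher under load.

```python
def min_connections_per_node(php_hosts, procs_per_host, conns_per_node_per_proc=2):
    """Lower bound on connections a single Aerospike node sees.

    Each FPM process holds one client, and each client keeps at least
    `conns_per_node_per_proc` connections to every node (2 * N in total).
    """
    return php_hosts * procs_per_host * conns_per_node_per_proc


def min_connections_cluster(php_hosts, procs_per_host, nodes):
    """Lower bound on connections across the whole cluster."""
    return nodes * min_connections_per_node(php_hosts, procs_per_host)


# Figures from the post above: 5 nodes, 10 PHP servers, 500 to 1000 processes each.
for procs in (500, 1000):
    per_node = min_connections_per_node(10, procs)
    total = min_connections_cluster(10, procs, nodes=5)
    print(f"{procs} procs/host: >= {per_node} per node, >= {total} cluster-wide")
```

The per-node figure is the one to compare against proto-fd-max, since that limit applies to each server individually.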
The idea is to pay the overhead of initiating the client (connecting to the nodes, learning the partition table, etc.) once and then reuse the client for many requests. How many requests is your FPM process set to serve before it terminates (max_requests)? The (very) old configuration was to have tens or hundreds of concurrent processes that each accept only a few hundred requests. That was tuned for the era of PHP < 5.3.3, when memory leaks forced you to kill the process before it bloated. But PHP has been stable on memory for a long while, so the correct configuration is to have far fewer processes and let each of them handle a much higher number of requests. This removes a lot of overhead and uses the PHP client correctly.
Let me know how you have it configured now, please.
Great, thanks. The goal is to set max_requests as high as possible, while keeping an eye on the memory consumption of the processes. The more requests each process can serve, the better. You just want to ensure there is no memory leak, or at least set an acceptable size for a process to grow to before it needs to be restarted.
The new Aerospike REST client mitigates this issue. Instead of using the PHP client directly in a web application context, simple local REST calls to this application can provide a more scalable and performant solution.