Uwsgi-python : blocked in query.results()


#1

Hi,

I would like to use the Aerospike Python client within uwsgi-python (behind nginx).

When I try to retrieve values using a key, the query works fine.

records = client.get("d42c62c2a7562a39-0")
{'user_id': 0, 'ssid': 0, 'ts': 1452676728, 'sid': 'd42c62c2a7562a39'}

But when I try to retrieve the values using a query, this does not work.

query = client.query("my_namespace", "my_set")
query.select('sid', 'ssid')
query.where(predicates.equals('user_id', 0))
records = query.results()

The process seems to be blocked in query.results(). The same query works fine when executed in aql or in a standalone Python script.

I do not understand what is wrong with executing a query inside uwsgi-python.

Thanks


#2

What mode are you running it in? What do you mean by blocking?

The C client that the Python client wraps is synchronous. A query is sent in parallel to all the nodes, multiple cores on those nodes execute the query, and each node streams the records resulting from the query back to the client. The GIL is held until that operation is done. Key-value operations such as get, increment, put, etc. are much shorter, so they probably don't pose any problem.

In general you should have one client and have all threads talk to it. See issue 52 on the aerospike/aerospike-client-python repo on GitHub.
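A minimal sketch of the "one client, all threads talk to it" layout. `FakeClient`, `handle_request` and `run_requests` are hypothetical names used so the snippet runs without a cluster; in a real app `CLIENT` would presumably be the object returned by `aerospike.client(config).connect()`.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeClient:
    """Hypothetical stand-in for a connected Aerospike client."""

    def __init__(self):
        self._records = {"d42c62c2a7562a39-0": {"user_id": 0, "ssid": 0}}

    def get(self, key):
        return self._records.get(key)


# Created once per process, then shared by every thread.
CLIENT = FakeClient()


def handle_request(key):
    # Each thread uses the shared client instead of building its own.
    return id(CLIENT), CLIENT.get(key)


def run_requests(keys, workers=4):
    # Fan the requests out over a small thread pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, keys))
```

Every request handled by `run_requests` goes through the same `CLIENT` object, whichever thread picks it up.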


#3

Thanks for your answer,

By blocking I mean that the call to query.results() never returns.

I found the problem: it was how (or rather where) the Aerospike Python client was instantiated.

My code is executed in a uwsgi context. By default, uwsgi forks after loading the applications, so that the workers share as much memory as possible. A client created globally is therefore created before the fork and inherited by every worker. To avoid the blocking behavior, I need to instantiate the Aerospike client in the “uwsgi application function” instead of globally.
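A sketch of that idea, not the exact code used here: the client is created lazily on first use inside each process, and recreated if the PID changes. `PerProcessClient` and `connect_aerospike` are hypothetical names, and the host and port are assumptions.

```python
import os


class PerProcessClient:
    """Create the client lazily, once per process (keyed by PID)."""

    def __init__(self, factory):
        self._factory = factory
        self._pid = None
        self._client = None

    def get(self):
        pid = os.getpid()
        if self._client is None or self._pid != pid:
            # First use in this process (or first use after a fork):
            # build a fresh client instead of reusing the parent's.
            self._client = self._factory()
            self._pid = pid
        return self._client


def connect_aerospike():
    # Assumption: a local node on the default port; adjust hosts as needed.
    import aerospike  # imported lazily so the sketch loads without the package
    return aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()


clients = PerProcessClient(connect_aerospike)


def application(environ, start_response):
    client = clients.get()  # connects on the first request in each worker
    # ... run queries with `client` here ...
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]
```

uWSGI's `lazy-apps` option (load the application after forking) or the `postfork` decorator from `uwsgidecorators` might achieve the same result, though neither was tried in this thread.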

Now I have one Aerospike client per uwsgi worker. I have 20 workers, so I have 20 clients. Is it a problem for Aerospike if I have several clients?

Best regards,


#4

Typically in a webapp there should be one client shared among its worker threads. If these 20 workers are each in their own process they would need a client each, because then they can’t share one.

Just be aware that you want to limit the number of clients and funnel as many requests through them as possible. Initializing the client is a heavy operation. It connects to a seed node, learns about the other nodes from it, connects to each of those, and gets the partition table. If you destroy a client after a few hundred requests that causes a lot of churn and lots of overhead spent on connection. A client opens at least 2*N [ + 1] connections, where N is the number of nodes in the cluster. Having too many concurrent clients may cause problems with the number of file descriptors open on the server-side (for example).

However, 20 clients on a single web server should not be a problem.
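As a rough worked example of the 2*N [+ 1] figure, the function below computes a lower bound on the total number of connections opened across the cluster. `min_server_connections` is a hypothetical helper, and the 3-node cluster in the note that follows is an assumption, not something stated in this thread.

```python
def min_server_connections(clients, nodes, extra_per_client=1):
    """Lower bound on connections opened across the whole cluster.

    Uses the 2*N [+ 1] per-client figure; extra_per_client models
    the optional "+ 1".
    """
    per_client = 2 * nodes + extra_per_client
    return clients * per_client
```

With 20 clients and an assumed 3-node cluster, `min_server_connections(20, 3)` gives 20 * (2*3 + 1) = 140 connections in total, which is small next to typical file descriptor limits.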


#5

Thank you,

I understand the issue with workers, and it is indeed better to use threads. My problem is that I also have connections to a MySQL server, and the MySQL protocol cannot handle multiple threads using the same connection at once. But this is not related to Aerospike ;)

I will see if I can change the MySQL connections in order to use threads.
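One common way to handle this is to give each thread its own connection through `threading.local`. A minimal sketch: `open_mysql_connection` is a hypothetical placeholder, and a real app would call its MySQL driver's connect function there.

```python
import threading

_local = threading.local()


def open_mysql_connection():
    # Hypothetical placeholder: a real app would call its MySQL
    # driver's connect function here.
    return object()


def get_mysql_connection():
    # Each thread lazily opens its own connection and reuses it afterwards,
    # so no connection is ever shared between two threads.
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = open_mysql_connection()
        _local.conn = conn
    return conn
```

Request handlers would call `get_mysql_connection()` instead of using a global connection, while the Aerospike client can stay shared.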

Thanks for your help, Best regards


#6

Preforking is a fine model; just keep your number of workers reasonable and have each of them handle as many requests as possible. If it works well for you, blog or post about it. I’d like to see what you needed to do to get it working.


#7

We have changed some of our MySQL connection code so that it supports multi-threading. Now we have one worker with 20 threads sharing one Aerospike client, and it works well.

Thanks for your help.

Best regards.