Python client returns strings as unicode, but I want UTF-8


#1

Hi all. I’m looking at Aerospike among other options to replace memcached. Our current code expects UTF-8 encoded str values, but every string comes back from Aerospike as unicode. The docs say that Aerospike stores strings internally as UTF-8, so I don’t see why they come back this way. A lot of our code fails on string formatting as a result. Is there any client (or maybe server?) config that can change this behavior?

I also tested out setting sys.setdefaultencoding('UTF8') just to see what would happen and there’s no change.

FWIW, I’m currently running Python 2.7.6 and aerospike 1.0.37

Thanks!


#2

Thanks for the feedback. We’ve been looking into it already, so that confirms the need. Hopefully this will be part of release 1.0.39.

Ronen


#3

To pinpoint it more accurately, this will be available in the upcoming 1.0.43 release. You can give str or unicode as input; both get UTF-8 encoded and stored on the server. Read operations will return UTF-8 encoded str. Anyone wanting unicode will have to explicitly call .decode('utf-8').
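
For anyone who does want unicode back after that change, here is a minimal sketch of the explicit decode step applied to a whole record. The helper name to_unicode and the sample bins dict are made up for illustration; the dict only stands in for whatever the client returns and is not real client output.

```python
def to_unicode(value, encoding="utf-8"):
    """Decode UTF-8 byte strings to unicode text; leave other values untouched.

    Hypothetical helper, not part of the aerospike client.
    """
    if isinstance(value, bytes):  # on Python 2, `bytes` is an alias for `str`
        return value.decode(encoding)
    if isinstance(value, dict):
        # Decode both bin names and bin values.
        return {to_unicode(k, encoding): to_unicode(v, encoding)
                for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with decoded items.
        return type(value)(to_unicode(v, encoding) for v in value)
    return value


# Stand-in for the bins of a record (assumed shape, for illustration only).
bins = {"name": b"caf\xc3\xa9", "tags": [b"a", b"b"], "count": 3}
decoded = to_unicode(bins)
```

Non-string values such as integers pass through unchanged, so the helper can be applied to a whole record without checking each bin's type first.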