Incorrect documentation around Unicode

Hi folks

The comments on Unicode vs UTF-8 on http://www.aerospike.com/docs/client/java/usage/data_type.html are incorrect. Java and .NET use UTF-16 strings.

Two articles explain the difference between Unicode and {UTF-8, UTF-16} better than I could: "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" (Joel on Software) and "What Every Programmer Absolutely, Positively Needs to Know About Encodings and Character Sets to Work With Text".

The reference in the documentation:

For example, an Aerospike String is internally stored in UTF-8 format

refers to how the string is being stored in the database.

Sorry, I should have been more specific. I was referring to this:

For example, an Aerospike String is internally stored in UTF-8 format. This allows Java and C# – which both use Unicode preferentially – to interact transparently with Python and Ruby (which use UTF-8) and C (which does not have a standard internal character encoding).

This doesn't make sense: Unicode is not an alternative to UTF-8. Unicode is a character set, and UTF-8 and UTF-16 are both encodings of that character set. Java and C# use UTF-16 strings internally.
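To make the distinction concrete, here is a minimal Java sketch that encodes the same Unicode code points as both UTF-8 and UTF-16. The class and method names (EncodingDemo, utf8Length, utf16Length, roundTripUtf8) are made up for illustration and have nothing to do with the Aerospike client; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies when encoded as UTF-16
    // (big-endian, so no byte-order mark is prepended).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Encode to UTF-8 bytes and decode back into a Java (UTF-16) string.
    static String roundTripUtf8(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three code points: 'h', e-acute (U+00E9), grinning face (U+1F600).
        String s = "h\u00e9\uD83D\uDE00";

        System.out.println(s.codePointCount(0, s.length())); // 3 code points
        System.out.println(s.length());                      // 4 UTF-16 code units
        System.out.println(utf8Length(s));                   // 7 bytes (1 + 2 + 4)
        System.out.println(utf16Length(s));                  // 8 bytes (2 + 2 + 4)
        System.out.println(roundTripUtf8(s).equals(s));      // true
    }
}
```

The code points are identical in every case; only the byte representation changes. That is why a string stored as UTF-8 can be read back into a Java or C# UTF-16 string without loss.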

Got it. Thanks. Will enhance the documentation.