Bin Name -- Why do bin names have the limitation "Maximum 14 characters of any single byte. Double-byte characters are not allowed."?

Storing the bin name with each record means a lot of storage is used just for bin names. I am wondering whether anything has changed here.

Have you considered using a CDT (Collection Data Type) such as a Map to store multiple key/value pairs within a single bin, with keys of any length?

In general, for large data sets it is recommended to keep bin names as short as possible because of the storage overhead.
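To get a feel for that overhead, here is a back-of-envelope sketch. It assumes each bin name is stored once per bin per record and ignores any per-bin length bytes, replication, and other record metadata, so treat the numbers as a rough illustration only. The function name and the figures (300 million records, 10 bins each) are made up for the example.

```python
def bin_name_overhead_bytes(records, bins_per_record, avg_name_len):
    """Rough bytes spent on bin names alone.

    Assumption: every bin in every record carries its own copy of the name.
    """
    return records * bins_per_record * avg_name_len


# Hypothetical data set: 300 million records with 10 bins each.
long_names = bin_name_overhead_bytes(300_000_000, 10, 14)   # 14-character names
short_names = bin_name_overhead_bytes(300_000_000, 10, 3)   # 3-character names

print(f"{long_names / 1e9:.1f} GB vs {short_names / 1e9:.1f} GB")
# prints "42.0 GB vs 9.0 GB"
```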

I am sorry to say this, but I don’t know which world you live in. In my last project I had to deal with 12000+ product data attributes covering 300 million items for a leading ecommerce company. The bin name length limit in Aerospike is what discouraged me from suggesting Aerospike for it. Consider that you have only 14 characters for a bin name and you need to name 12000+ distinct product attributes.

Kindly leave such decisions to the application developer, with guidance to keep names short, but don’t impose such hard limits.

DJ

I understand the inconvenience, but there is always a trade-off. Allowing longer bin names would come with trade-offs of its own, which may be fine for some developers and not at all for others. Your feedback is valuable; I was trying to suggest some potential ways to work around this limitation. As explained above, the limit is not an arbitrary one meant to make developers’ lives painful. It comes from how the data is serialized and optimized for performance (it was actually increased to 15 bytes, but I understand that is not very helpful). Thinking more about it, the limit may also be related to some performance characteristics of the C client. I’ll check further and report back if I find anything useful.

This comes down to performance considerations. Aerospike is competitive on performance with in-memory databases, even when serving data from SSDs, because it cares about things such as cache locality. Cache lines are 64 bytes long, and many things in Aerospike line up with that unit. For example, entries in the primary index of an Aerospike database are 64 bytes, by design. Similarly, storage aligns with those units, which leads to some (admittedly annoying) restrictions.

A bit of discipline on the application side makes this more manageable. You can also model the data differently, using a Map, where the length of the map keys isn’t limited the same way bin names are.
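As a rough sketch of the Map approach: attributes whose names fit the bin name limit stay as ordinary bins, and everything else goes into a single map bin keyed by the full attribute name. The helper `to_bins`, the map bin name `attrs`, and the sample product are all hypothetical, and the 15-byte constant is simply the limit mentioned in this thread.

```python
MAX_BIN_NAME_BYTES = 15  # limit mentioned in this thread (previously 14)


def to_bins(attributes, map_bin="attrs"):
    """Split attributes into top-level bins plus one map bin for long names.

    Hypothetical helper: names that fit the limit become bins; the rest are
    collected into a dict stored under `map_bin`.
    """
    bins = {}
    overflow = {}
    for name, value in attributes.items():
        fits = len(name.encode("utf-8")) <= MAX_BIN_NAME_BYTES
        if fits and name != map_bin:
            bins[name] = value
        else:
            overflow[name] = value
    if overflow:
        bins[map_bin] = overflow
    return bins


product = {
    "sku": "A-1001",
    "price": 19.99,
    "recommended_washing_temperature_celsius": 40,
}
bins = to_bins(product)
print(bins)

# With the Aerospike Python client, a dict like this could then be written
# with client.put(key, bins); that call is not executed here.
```

Only the short names become bins here; the long attribute name lives as a key inside the `attrs` map, where the bin name limit does not apply.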

For some background on this topic, this is a great blog post