Hi,
I would like to evaluate Aerospike for a project whose data model can (theoretically) result in the creation of millions of columns (bins).
Are there fundamental reasons why the max. number of bins per namespace should/could not be increased to a value significantly larger than 32,767?
cheers,
Garry
An arbitrary number of “columns” is best handled with Maps and Large Maps.
With Maps and Large Maps, you can create an arbitrary number of name-value pairs per row. This is the use case where you are tracking an arbitrary number of, say, advertising campaigns per user, or audience groups per user, or anything else.
Aerospike’s large maps, in particular, allow an arbitrary amount of storage for these kinds of larger use cases.
Bins are intended for the basic, predictable use of a moderate number of items. The coding of the bin system prioritizes hardware limits and efficiency, which is what contributes to the speed and predictability of Aerospike.
For arbitrarily sized documents and fully arbitrary structures, please use Maps and Large Maps.
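To make the bins-versus-maps distinction concrete, here is a minimal sketch in plain Python. It uses an ordinary dict as a stand-in for a record and does not call any Aerospike client API; the names `build_user_record`, `user_id` and `campaigns` are made up for illustration. The point is only that the number of bins stays fixed while the map inside one bin grows.

```python
# Hypothetical in-memory stand-in for a single record; NOT the Aerospike client API.
def build_user_record(user_id, campaign_events):
    """Fold (campaign_id, value) pairs into one map-valued bin."""
    campaigns = {}
    for campaign_id, value in campaign_events:
        campaigns[campaign_id] = value
    # Two bins in total, no matter how many campaigns the user has.
    return {"user_id": user_id, "campaigns": campaigns}


# 100,000 distinct "columns" for one user, all held inside a single bin.
events = [(f"campaign_{i}", i * 10) for i in range(100_000)]
record = build_user_record("user-42", events)

print(len(record))               # number of bins
print(len(record["campaigns"]))  # number of map entries
```

Whether a map of that size performs acceptably on a real cluster is a separate question that this sketch does not answer.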
Are there any problems with using Maps and Large Maps in your intended use?
Thanks for the reply.
I’m not sure if the Maps/Large Maps approach would work. I’ll need to think about that…
The application uses a data model which converts (typically sparse) row/line-oriented data to (atomic) triples (row_id - column_id - column_value) stored in one table and triples (column_id - column_value - row_id) in a transpose table.
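For readers unfamiliar with this layout, a rough sketch of the conversion in plain Python follows. The function name `to_triples` and the sample rows are assumptions for illustration only; the real application may order, encode or store the triples differently.

```python
def to_triples(rows):
    """Convert sparse rows into forward and transpose triples.

    rows maps row_id -> {column_id: column_value}; absent columns are simply
    not present in the inner dict.
    """
    forward = []    # (row_id, column_id, column_value)
    transpose = []  # (column_id, column_value, row_id)
    for row_id, columns in rows.items():
        for column_id, value in columns.items():
            forward.append((row_id, column_id, value))
            transpose.append((column_id, value, row_id))
    return forward, transpose


# Hypothetical sample data: r2 has no "size" column at all.
rows = {
    "r1": {"color": "red", "size": "L"},
    "r2": {"color": "red"},
}
forward, transpose = to_triples(rows)

# The transpose table answers "which rows have color == red?" without a scan
# of the forward table.
red_rows = [row_id for (col, val, row_id) in transpose if (col, val) == ("color", "red")]
print(forward)
print(red_rows)
```

Each triple is independent of the others, which is why the number of distinct column ids can grow without bound in this model.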
This data model is a decent fit for Apache Accumulo or HBase, but, on top of some of the other nice features of Aerospike, it would be nice to develop with Lua instead of Java…
cheers,
Garry
Hi,
I had almost the same question, sorry for hijacking this post a bit.
Regarding the proposed maps: I looked around for a while and found no way to update items inside a map other than reading the whole map and writing it back. The map would contain 20-30 entries and we only want to update one entry, so that does not seem very efficient, and we might lose changes that happen in parallel. A large map would work, but in our case the performance seems to be much worse than simple updates to a single bin.
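The lost-update concern can be shown with a small simulation. This is a sketch in plain Python with a hypothetical in-memory `Store` class; it is not the Aerospike API, and the `expected_generation` parameter is an assumption used here to illustrate a version-checked write in general terms.

```python
import copy


class Store:
    """Hypothetical in-memory stand-in for one stored map; NOT the Aerospike API."""

    def __init__(self, value):
        self.value = value
        self.generation = 1  # bumped on every successful write

    def read(self):
        # Return a private copy plus the version it was read at.
        return copy.deepcopy(self.value), self.generation

    def write(self, value, expected_generation=None):
        # With a version check, reject writes based on a stale read.
        if expected_generation is not None and expected_generation != self.generation:
            return False
        self.value = copy.deepcopy(value)
        self.generation += 1
        return True


# 1) Blind read-modify-write: two clients read the same map, change different entries.
store = Store({"a": 1, "b": 1})
map_a, _ = store.read()
map_b, _ = store.read()
map_a["a"] = 2
map_b["b"] = 2
store.write(map_a)
store.write(map_b)  # silently overwrites client A's change
lost_update_result = store.read()[0]

# 2) Version-checked write: the stale writer is rejected and has to retry.
checked = Store({"a": 1, "b": 1})
map_a, gen_a = checked.read()
map_b, gen_b = checked.read()
map_a["a"] = 2
map_b["b"] = 2
ok_a = checked.write(map_a, expected_generation=gen_a)
ok_b = checked.write(map_b, expected_generation=gen_b)  # stale, rejected
map_b, gen_b = checked.read()                           # retry from fresh state
map_b["b"] = 2
ok_retry = checked.write(map_b, expected_generation=gen_b)
final_result = checked.read()[0]

print(lost_update_result, ok_a, ok_b, ok_retry, final_result)
```

The version check prevents the lost change but still moves the whole map over the wire on every update and on every retry, so it addresses only the correctness half of the concern, not the efficiency half.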
Is there any other way to update map entries?
Ciao,
Martin