Considerations when choosing between single-bin and multi-bin records



Synopsis

A common data-modeling question is whether to store a record as a single bin or as multiple bins. With a single bin, all the fields are usually stored together as JSON or a BLOB, and the client application is responsible for parsing or decoding it. The considerations below should help with that choice.
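
To make the two layouts concrete, here is a minimal sketch in plain Python (no Aerospike client involved). The `profile` record, the bin name `data`, and the helper functions are hypothetical and exist only for illustration; the point is that with a single bin the client has to decode the whole value to reach one field.

```python
import json

# Hypothetical record used only for illustration.
profile = {"name": "Alice", "age": 30, "city": "Paris"}


def as_multi_bin(fields):
    """Multi-bin layout: one bin per field, each visible to the server."""
    return dict(fields)


def as_single_bin(fields, bin_name="data"):
    """Single-bin layout: every field serialized into one JSON string."""
    return {bin_name: json.dumps(fields, sort_keys=True)}


def read_field_from_single_bin(bins, field, bin_name="data"):
    """The client must decode the entire blob to read a single field."""
    return json.loads(bins[bin_name])[field]


multi = as_multi_bin(profile)
single = as_single_bin(profile)
```

In the multi-bin form, `multi["age"]` is directly addressable as a bin; in the single-bin form the same value is only reachable through `read_field_from_single_bin(single, "age")`.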

Secondary Indices

In general, storing your data in separate bins is advised. One important point is that secondary indices cannot be built on fields packed inside a single JSON or BLOB bin; a field can be indexed only if it is stored in a bin of its own.

Storage and Performance impact

  • The storage used will differ between the two cases. The details depend on the kind of data you are planning to store in each record. This article will be helpful: Storage space difference between bins and maps
  • If you are absolutely sure that you will always read and write the complete record and never parts of it, then storing it as a single bin may give you better performance. More here: http://stackoverflow.com/questions/25158114/aerospike-keep-data-as-blob-or-use-bins Note that you can use the Replace Policy when updating a multi-bin record to overwrite the entire record.
  • When a record is saved with multiple bins and a GET requests only a subset of those bins, the subset is selected on the server (in other words, “Projection” is implemented on the server). So less data is transferred between the server and the client when only some of the bins are fetched, whereas a single-bin record is always transferred whole.
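
The effect of server-side projection on transfer size can be illustrated with a small simulation. This is a sketch under stated assumptions: `project` merely mimics what the server does, `payload_size` uses compact JSON as a stand-in for the real wire format (which it is not), and the sample record is hypothetical.

```python
import json


def payload_size(bins):
    """Approximate transfer size in bytes, using compact JSON as a proxy.

    This is NOT the Aerospike wire protocol, only a rough comparison tool.
    """
    return len(json.dumps(bins, separators=(",", ":")).encode("utf-8"))


def project(record, bin_names):
    """Mimic server-side projection: keep only the requested bins."""
    return {name: record[name] for name in bin_names if name in record}


# Hypothetical multi-bin record with one large bin and two small ones.
record = {"name": "Alice", "bio": "x" * 1000, "age": 30}

full_size = payload_size(record)                      # whole record
subset_size = payload_size(project(record, ["age"]))  # one small bin only
```

Here `subset_size` is far smaller than `full_size`, because the large `bio` bin never leaves the server. Had the same fields been packed into a single bin, every read would carry the full payload regardless of which field the client wanted.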

Limits

  • There is a limit of 32K unique bin names in use within a namespace.
  • In both the single-bin and multi-bin cases, the record size cannot exceed the write-block-size (usually 128 KB for SSDs and 1 MB for rotational disks).
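
A quick client-side sanity check against the write-block-size limit might look like the sketch below. It assumes the payload is serialized as JSON and counts only the payload bytes; a stored record also carries per-record and per-bin overhead, so a payload that passes this check can still be rejected by the server. The function name and the default of 128 KB are illustrative assumptions; use the value actually configured for your namespace.

```python
import json

# Assumed default taken from the "usually 128 KB for SSDs" figure above.
WRITE_BLOCK_SIZE = 128 * 1024


def payload_exceeds_write_block(bins, write_block_size=WRITE_BLOCK_SIZE):
    """Return True if the serialized payload alone is already too large.

    Only a lower bound: real records add storage overhead on top of this.
    """
    payload = json.dumps(bins).encode("utf-8")
    return len(payload) > write_block_size


small = {"name": "Alice", "age": 30}
large = {"blob": "x" * 200_000}  # roughly 200 KB of payload

small_too_big = payload_exceeds_write_block(small)
large_too_big = payload_exceeds_write_block(large)
large_fits_1mb = not payload_exceeds_write_block(large, 1024 * 1024)
```

The same check applies whether the data lives in one bin or many, since the limit is on the record as a whole.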
