Let’s start with the fact that Aerospike is a distributed database that supports storage of data either in memory or on SSD. A namespace (think of it as similar to both a database and a tablespace in the RDBMS world) is configured to store all of its records in one or the other, and you can have multiple namespaces. Customers with large data sets tend to store the majority of their data on SSD. Purely in-memory storage is much more expensive per GB and requires many more nodes because of its limited data density: only so much DRAM can fit in a single server, compared to the amount of SSD storage that can be directly attached.
Fact number two: every record has an entry in the primary index, which is always in memory. In the Community Edition of the server the primary index lives in the RAM used by asd (the server process). In the Enterprise Edition it lives in shared memory. Each entry costs 64B, which allows for straightforward capacity planning.
To size your cluster you need to know how many objects you’ll have and their average size. The number of objects * 64B is your baseline DRAM usage. If you know the average size of those objects, and the bins needed to hold the data and metadata you described, you can use the capacity planning guide to figure out how much SSD storage you’ll need per record.
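As a rough illustration of that baseline arithmetic, here is a minimal Python sketch. The function names are made up for this example, and the `replication_factor` parameter is an assumption that isn't discussed above; leave it at 1 to get exactly "number of objects * 64B". It only covers the primary index, not secondary indexes or SSD space, which the capacity planning guide handles.

```python
PRIMARY_INDEX_ENTRY_BYTES = 64  # per-record primary-index cost stated above


def baseline_dram_bytes(num_objects: int, replication_factor: int = 1) -> int:
    """Baseline DRAM for the primary index: objects * 64B.

    replication_factor is an assumption for this sketch (number of copies
    of each record kept across the cluster); 1 reproduces the simple formula.
    """
    if num_objects < 0 or replication_factor < 1:
        raise ValueError("num_objects must be >= 0 and replication_factor >= 1")
    return num_objects * replication_factor * PRIMARY_INDEX_ENTRY_BYTES


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    objects = 1_000_000_000  # hypothetical object count
    print(f"{to_gib(baseline_dram_bytes(objects)):.1f} GiB")
```

For one billion objects this works out to 64,000,000,000 bytes, or roughly 59.6 GiB of DRAM for the primary index alone across the cluster.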
Fact three: secondary indexes, which are needed to support queries, are also in memory and cost extra DRAM. The capacity planning guide explains how to size them. There are modeling techniques that let you implement searching with lists and batch reads instead of secondary-index queries. Still, the records in those external sets also cost primary-index entries and data storage. You’ll have to balance size against speed and take operational considerations into account. Currently, having a secondary index on a namespace precludes it from using the EE fast restart feature.
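To make the list-plus-batch-read idea concrete, here is a small sketch of the pattern. It deliberately does not use the Aerospike client: `ToyStore` is a hypothetical in-memory stand-in, and the set names (`users`, `users_by_city`) and bin names are invented for the example. The point is only the shape of the model: a lookup record per search value holds a list of keys, and a search becomes one read of that record followed by a batch read.

```python
class ToyStore:
    """In-memory stand-in for a namespace; NOT the Aerospike client API."""

    def __init__(self):
        self.records = {}  # (set_name, key) -> dict of bins

    def put(self, set_name, key, bins):
        self.records[(set_name, key)] = dict(bins)

    def get(self, set_name, key):
        return self.records.get((set_name, key))

    def list_append(self, set_name, key, bin_name, value):
        # Creates the record and the list bin if they don't exist yet.
        rec = self.records.setdefault((set_name, key), {})
        rec.setdefault(bin_name, []).append(value)

    def batch_get(self, set_name, keys):
        # Returns the records that exist, in the order the keys were given.
        return [self.records[(set_name, k)] for k in keys
                if (set_name, k) in self.records]


def add_user(store, user_id, city):
    """Write the data record and maintain the external lookup record."""
    store.put("users", user_id, {"id": user_id, "city": city})
    store.list_append("users_by_city", city, "user_ids", user_id)


def find_users_by_city(store, city):
    """Search without a secondary index: one get, then one batch read."""
    lookup = store.get("users_by_city", city)
    if lookup is None:
        return []
    return store.batch_get("users", lookup["user_ids"])
```

Note the trade-off described above: every distinct city adds one more record to the lookup set, and each of those records costs its own primary-index entry and data storage.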
Other things you should read (at your leisure) are about the data types supported by the server, atomic list operations, and our QuickStart program.