Primary Index

The primary index holds metadata for the records stored on a given Aerospike cluster node. Each record’s metadata takes exactly 64B of DRAM in the primary index. A node indexes only its own records. All it knows about the other nodes is the partition map, which tells it which node to talk to for a given record, based on the record’s digest.
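To make the 64B figure concrete, here is a minimal sizing sketch. The function names are hypothetical, and it only multiplies the record count held on one node by the per-record size quoted above; it ignores any other memory the node uses.

```python
INDEX_ENTRY_BYTES = 64  # per-record primary index metadata, as stated above


def primary_index_bytes(records_on_node: int) -> int:
    """DRAM taken by the primary index for the records held on one node."""
    if records_on_node < 0:
        raise ValueError("records_on_node must be non-negative")
    return records_on_node * INDEX_ENTRY_BYTES


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30
```

For example, a node holding 500 million records would need `primary_index_bytes(500_000_000)`, which is 32,000,000,000 bytes, or about 29.8 GiB, for the index alone.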

There’s some ambiguity and overlap in the way the ‘key’ concept is used. In a Key-Value store, an object (AKA row, AKA record) has a unique identifier, a key. This unique identifier allows CRUD operations to be executed against one and only one object. This is the same thing as finding a row of a specified table in an RDBMS by its primary key. Aerospike isn’t a relational database, but it is a row-oriented database, just like an RDBMS with a relational (OLTP) schema.

In Aerospike specifically, a record (a single unique object) is identified by the 3-tuple (namespace, set, the key in your application). This is just like (database, table, PK) in an RDBMS. I’m aware that “the key in your application” is confusingly nested. You can think of it as “unique identifier in your application” instead. For example, (‘user-profiles’, ‘eu-users’, ‘d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1’) can be the 3-tuple identifying a specific user object. The Aerospike client will hash (‘eu-users’, ‘d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1’) through RIPEMD-160, and use the resulting 20B digest as the actual unique identifier for the record. The record is consistently hashed to the same partition, because 12 bits of its 20B digest determine the partition ID (one of 4096 partitions per namespace).
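The digest-to-partition step can be sketched roughly as follows. This is an illustration, not the client's actual code: `sketch_digest` and `partition_id` are hypothetical names, the byte layout fed to the hash is simplified (so it will not reproduce the digests a real Aerospike client computes), and taking the 12 bits from the first two digest bytes read little-endian is an assumption. Because some Python builds ship without RIPEMD-160, the sketch falls back to SHA-1, which also yields 20 bytes, purely as a stand-in.

```python
import hashlib

N_PARTITIONS = 4096  # per namespace, as stated above; 4096 == 2**12
DIGEST_BYTES = 20


def sketch_digest(set_name: str, user_key: str) -> bytes:
    """Hash (set, key) to a 20-byte digest. Simplified input layout (assumption)."""
    data = set_name.encode("utf-8") + user_key.encode("utf-8")
    try:
        h = hashlib.new("ripemd160")
    except ValueError:
        # RIPEMD-160 is unavailable in this build; SHA-1 is only a 20-byte stand-in.
        h = hashlib.sha1()
    h.update(data)
    return h.digest()


def partition_id(digest: bytes) -> int:
    """Keep 12 bits of the digest (assumed: low 12 bits of the first two bytes)."""
    if len(digest) != DIGEST_BYTES:
        raise ValueError("expected a 20-byte digest")
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)
```

Calling `partition_id(sketch_digest('eu-users', 'd2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1'))` always returns the same value in the range 0 to 4095 for a given hash function, which is the property that lets every client agree on where a record lives.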

So, given the (namespace, set, object identifier in the application) 3-tuple, the client computes the digest, derives the partition ID from it, then uses the partition map to look up which node it needs to talk to. The operation goes to the right node, which looks up the record’s metadata in the primary index. At this point the node knows precisely where to get the object from: either memory, or a specific device, block ID on that device, and byte offset within that block.
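The routing and lookup flow described above can be modelled with two dictionaries: a partition map from partition ID to node, and a per-node primary index from digest to location. Everything here is a toy model under stated assumptions: `Node`, `RecordLocation`, `route`, and `partition_id` are hypothetical names, the partition ID is assumed to come from the first two digest bytes read little-endian, and only the on-device case is represented (the in-memory case is left out).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

N_PARTITIONS = 4096  # per namespace


@dataclass(frozen=True)
class RecordLocation:
    """Where a record lives on a node's storage (hypothetical layout)."""
    device: str
    block_id: int
    byte_offset: int


@dataclass
class Node:
    """A cluster node that indexes only the records it holds itself."""
    name: str
    primary_index: Dict[bytes, RecordLocation] = field(default_factory=dict)

    def locate(self, digest: bytes) -> Optional[RecordLocation]:
        # Primary index lookup: digest -> metadata describing the location.
        return self.primary_index.get(digest)


def partition_id(digest: bytes) -> int:
    # Assumption: 12 bits taken from the first two bytes, little-endian.
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def route(digest: bytes, partition_map: Dict[int, Node]) -> Node:
    """Client side: pick the node that the partition map assigns to this digest."""
    return partition_map[partition_id(digest)]
```

A client would call `route(digest, partition_map)` to pick the node, and that node would then call `locate(digest)` on its own index. A node that does not hold the record simply has no entry for the digest, so `locate` returns `None`.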