Primary Index


#1

What is the relationship between the primary index and keys? Is the digest stored on SSD?


#2

The primary index keeps metadata about the records stored on a given Aerospike cluster node. Each record’s metadata takes exactly 64B of DRAM in the primary index. A node only indexes its own records. What it knows about the other nodes is just the partition map: which node it needs to talk to for a given record, based on the record’s digest.
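As a rough sizing sketch based on the 64B-per-record figure above (the function name and the record count are made up for illustration):

```python
INDEX_ENTRY_BYTES = 64  # per-record primary index cost stated above


def primary_index_bytes(records_on_node: int) -> int:
    """DRAM used by the primary index for the records this node holds."""
    return records_on_node * INDEX_ENTRY_BYTES


# Hypothetical node holding 500 million records.
n = 500_000_000
print(f"{primary_index_bytes(n) / 2**30:.1f} GiB")  # prints "29.8 GiB"
```

This only counts index entries for records the node itself holds; it says nothing about the memory or storage used by the record data.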

There’s some ambiguity and overlap in the way the ‘key’ concept is used. In a Key-Value store, an object (AKA row, AKA record) has a unique identifier, a key. This unique identifier allows CRUD operations to be executed against one and only one object. This is the same thing as finding the row of a specified table in an RDBMS by its primary key. Aerospike isn’t a relational database, but it is a row-oriented database, just like an RDBMS with a relational (OLTP) schema.

In Aerospike specifically, a record (a single unique object) is identified by the 3-tuple (namespace, set, the key in your application). This is just like (database, table, PK) in an RDBMS. I’m aware that “the key in your application” is confusingly nested. You can think of it as “unique identifier in your application” instead. For example, (‘user-profiles’, ‘eu-users’, ‘d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1’) can be the 3-tuple identifying a specific user object. The Aerospike client will hash (‘eu-users’, ‘d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1’) through RIPEMD-160, and use the resulting 20B digest as the actual unique identifier for the record. The record is consistently hashed to the same partition, as 12 bits of its 20B digest are used to determine the partition ID (one of 4096 partitions per namespace).
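Here is a minimal sketch of the digest-to-partition step. Several things in it are assumptions rather than the client's actual implementation: the byte layout fed to the hash is simplified (so the digests will not match those on a real cluster), the choice of which 12 bits to use is assumed, and SHA-1 is swapped in as a same-length stand-in if the local Python build does not expose RIPEMD-160.

```python
import hashlib

N_PARTITIONS = 4096  # partitions per namespace


def record_digest(set_name: str, user_key: str) -> bytes:
    """Return a 20-byte digest of (set, key).

    Simplified input layout (assumption): plain concatenation of the two
    strings, so these bytes will not match digests on a real cluster.
    """
    data = set_name.encode() + user_key.encode()
    try:
        h = hashlib.new("ripemd160")  # may be missing on some OpenSSL builds
    except ValueError:
        h = hashlib.sha1()  # stand-in with the same 20-byte output size
    h.update(data)
    return h.digest()


def partition_id(digest: bytes) -> int:
    """Take 12 bits of the digest as the partition ID (0..4095).

    Assumption: the low 12 bits of the first two bytes, little-endian.
    """
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


d = record_digest("eu-users", "d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1")
print(len(d), partition_id(d))
```

Because the partition ID depends only on the digest, the same (set, key) pair always lands in the same partition, no matter which client computes it.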

So, given the (namespace, set, object identifier in the application), the client will get the digest, find the partition ID from it, then look up which node it needs to talk to using the partition map. The operation goes to the right node, which looks up the record’s metadata in the primary index. At this point it knows precisely where to get the object from. This can be from memory, or from a specific device, block ID on that device, and byte offset within that block.
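The flow above can be sketched as a toy simulation for a single namespace. Everything here is illustrative: the `Node` class, the two-node partition map, and the location tuple are invented, and SHA-1 is used purely as a 20-byte stand-in for RIPEMD-160 to keep the sketch short.

```python
import hashlib
from dataclasses import dataclass, field

N_PARTITIONS = 4096


@dataclass
class Node:
    name: str
    # digest -> where the record lives (device, block ID, byte offset)
    primary_index: dict = field(default_factory=dict)


def digest_of(set_name: str, user_key: str) -> bytes:
    # SHA-1 is only a 20-byte stand-in for RIPEMD-160 here.
    return hashlib.sha1((set_name + user_key).encode()).digest()


def partition_of(digest: bytes) -> int:
    # Assumed bit selection: low 12 bits of the first two bytes.
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


# Toy two-node cluster: even partitions on node A, odd ones on node B.
nodes = [Node("A"), Node("B")]
partition_map = {pid: nodes[pid % 2] for pid in range(N_PARTITIONS)}


def write(set_name, user_key, location):
    d = digest_of(set_name, user_key)
    partition_map[partition_of(d)].primary_index[d] = location


def locate(set_name, user_key):
    d = digest_of(set_name, user_key)            # 1. hash to a digest
    node = partition_map[partition_of(d)]        # 2. partition ID -> node
    return node.name, node.primary_index.get(d)  # 3. index lookup on that node


write("eu-users", "user-1", ("/dev/sdb", 42, 512))
print(locate("eu-users", "user-1"))
```

Note that only the owning node's index holds the entry; the other node never sees the digest, which matches the point that a node only indexes its own records.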


#3

Does that mean the primary index consists of the key, index metadata, and the digest?


#4

The primary index always lives in memory, regardless of where the namespace stores its data. Each record has an index entry in the primary index, which costs 64B of DRAM. Of that 64B, 20B are for the digest, and the other bits and bytes are used for other metadata, such as expiration (void) time, generation, last update time, storage information (pointer, etc).
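To make the 64B budget concrete, here is a hypothetical layout using Python's `struct` module. Only the 64-byte total and the 20-byte digest come from the description above; the other field widths and their order are invented for illustration and are not the server's real layout.

```python
import struct

# Hypothetical layout: only the 64-byte total and the 20-byte digest are
# from the text; every other field width is illustrative.
ENTRY = struct.Struct(
    "<"    # little-endian, no implicit padding
    "20s"  # record digest
    "I"    # void (expiration) time
    "I"    # last update time (hypothetical width)
    "H"    # generation
    "Q"    # storage pointer (device / block / offset, packed)
    "26x"  # remaining bits and bytes, left opaque here
)

packed = ENTRY.pack(b"\x01" * 20, 0, 1_700_000_000, 1, 4096)
digest, void_time, last_update, generation, storage_ptr = ENTRY.unpack(packed)
print(ENTRY.size, len(digest), generation)
```

The fixed size is the point: every record costs the same amount of index memory, regardless of how large its data is.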


#5

OK, so the primary index, which always lives in memory, does not contain the primary key (the key). It consists of metadata and the digest. Am I right?


#6

That’s right. By default Aerospike only stores the digest. There’s a write policy for all the clients which can tell the server to store the key (in your application). It gets stored with the rest of the record’s data, not in the primary index. Since the key can be large (UUIDs are 36 bytes, for example) the default behavior is not to send/store the key.


#7

Thank you, rbotzer. It is now very clear that the primary index does not store keys, only digests. Is the digest also stored on SSD along with the data?


#8

You’re welcome. Yes, the 64B in the primary index entry is also stored on SSD, along with the record data. This is needed for rebuilding the primary index on cold restart.
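As a rough illustration of why that persisted metadata matters, here is a toy rebuild that scans stored records and reconstructs an in-memory index. The `StoredRecord` type and its fields are hypothetical, and the rule of keeping the copy with the highest generation when a digest appears twice is an assumption of this sketch, not a description of the server's cold restart logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredRecord:
    digest: bytes     # 20-byte record digest, persisted with the data
    generation: int
    void_time: int
    block_id: int
    offset: int


def rebuild_primary_index(device_records):
    """Scan records on a device and rebuild digest -> metadata in memory.

    Assumption: if a digest appears more than once (an older copy not yet
    overwritten), keep the copy with the highest generation.
    """
    index = {}
    for rec in device_records:
        current = index.get(rec.digest)
        if current is None or rec.generation > current.generation:
            index[rec.digest] = rec
    return index


on_device = [
    StoredRecord(b"a" * 20, generation=1, void_time=0, block_id=7, offset=0),
    StoredRecord(b"b" * 20, generation=1, void_time=0, block_id=7, offset=256),
    StoredRecord(b"a" * 20, generation=2, void_time=0, block_id=9, offset=128),
]
index = rebuild_primary_index(on_device)
print(len(index), index[b"a" * 20].block_id)  # prints "2 9"
```

Without the digest and metadata on the device, a scan like this would have nothing to key the rebuilt index on.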

See the capacity planning guide for more information about storage and memory consumption.