Primary Index

milotic91 · February 5, 2018, 12:27pm

What is the relationship between the Primary Index and Keys? Are digests stored on SSD?

rbotzer · February 5, 2018, 7:26pm

The primary index keeps metadata about those records which are stored on the Aerospike cluster node. Each record’s metadata takes exactly 64B of DRAM in the primary index. A node only indexes its own records. What it knows about the other nodes is just the partition map - which node it needs to talk to for a given record, based on the record’s digest.

There’s some ambiguity and overlap in the way the ‘key’ concept is used. In a Key-Value store, an object (AKA row, AKA record) has a unique identifier, a key. This unique identifier allows CRUD operations to be executed against one and only one object. This is the same thing as finding the row of a specified table in an RDBMS by its primary key. Aerospike isn’t a relational database, but it is a row-oriented database, just like an RDBMS with a relational (OLTP) schema.

Specifically to Aerospike, a record (a single unique object) is identified by the 3-tuple (namespace, set, the key in your application). This is just like (database, table, PK) in an RDBMS. I’m aware that “the key in your application” is confusingly nested. You can think of it as “unique identifier in your application” instead. For example (‘user-profiles’, ‘eu-users’, ‘d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1’) can be the 3-tuple identifying a specific user object. The Aerospike client will hash (‘eu-users’, ‘d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1’) through RIPEMD-160, and use the resulting 20B digest as the actual unique identifier for the record. The record is consistently hashed to the same partition, as 12 bits of its 20B digest are used to determine the partition ID (one of 4096 partitions per-namespace).
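To make the digest-to-partition step concrete, here is a minimal Python sketch. It rests on two assumptions that are not from this thread: SHA-1 stands in for RIPEMD-160 only because it ships with the standard library and also yields 20 bytes (so the digest will not match a real Aerospike digest, and the byte layout fed to the hash is made up), and the 12 bits are taken as the low bits of the first two digest bytes read little-endian. `toy_digest` and `partition_id` are hypothetical helper names.

```python
import hashlib

N_PARTITIONS = 4096  # partitions per namespace, as described above


def toy_digest(set_name: str, user_key: str) -> bytes:
    """Return a 20-byte stand-in digest for (set, key).

    Assumption: SHA-1 replaces RIPEMD-160, and the input layout is invented,
    so this value will NOT match what an Aerospike client computes.
    """
    return hashlib.sha1(set_name.encode() + b"\x00" + user_key.encode()).digest()


def partition_id(digest: bytes) -> int:
    """Derive a partition ID from 12 bits of the digest.

    Assumption: the low 12 bits of the first two bytes, read little-endian.
    """
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


digest = toy_digest("eu-users", "d2a1ab47-4e4f-4f9f-88ea-ee0e2006fad1")
pid = partition_id(digest)
print(len(digest), pid)
```

The point is only that the same (set, key) always produces the same digest, and therefore always lands in the same one of 4096 partitions.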

So, given the (namespace, set, object identifier in the application) the client will get the digest, find the partition ID from it, then look up which node it needs to talk to using the partition map. The operation goes to the right node, which looks up the record’s metadata in the primary index. At this point it knows precisely where to get the object from. This can be from memory, or a specific device, block ID on that device, and byte offset within that block.
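The lookup path can be modeled with plain dictionaries. This is a toy sketch, not how the client or server is implemented: the three node names, the `IndexEntry` fields, the device path, and the partition-ID rule (low 12 bits of the first two digest bytes, little-endian) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    # Hypothetical fields; a real entry packs these and more into 64 bytes.
    device: str
    block_id: int
    byte_offset: int


# Hypothetical partition map: partition ID -> owning node.
partition_map = {pid: f"node-{pid % 3}" for pid in range(4096)}

# Each node indexes only its own records: node -> {digest: IndexEntry}.
primary_indexes = {f"node-{n}": {} for n in range(3)}


def partition_id(digest: bytes) -> int:
    # Assumption: low 12 bits of the first two bytes, little-endian.
    return int.from_bytes(digest[:2], "little") & 0xFFF


def locate(digest: bytes):
    """Client side: pick the node. Node side: look up the index entry."""
    node = partition_map[partition_id(digest)]
    return node, primary_indexes[node].get(digest)


# A made-up 20-byte digest that falls in partition 5.
digest = bytes([0x05, 0x00]) + bytes(18)
primary_indexes[partition_map[5]][digest] = IndexEntry("/dev/sdb", 1234, 512)
print(locate(digest))
```

A digest that the owning node has never indexed comes back with `None` for the entry, which corresponds to a record-not-found result.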

milotic91 · February 6, 2018, 4:39am

Does it mean that the primary index consists of the key, index metadata, and the digest?

rbotzer · February 6, 2018, 5:05am

The primary index always lives in memory, regardless of where the namespace stores its data. Each record has an index entry in the primary index, which costs 64B of DRAM. Of that 64B, 20B are for the digest, and the other bits and bytes are used for other metadata, such as expiration (void) time, generation, last update time, storage information (pointer, etc).
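Because every index entry is a fixed 64B, the DRAM cost is simple arithmetic. A rough sketch follows; the `replication_factor` parameter is an assumption added here (each replica copy is indexed on the node that holds it), and the record count is an arbitrary example rather than a figure from this thread.

```python
INDEX_ENTRY_BYTES = 64  # per-record primary index cost stated above


def primary_index_bytes(n_records: int, replication_factor: int = 1) -> int:
    """Cluster-wide DRAM used by the primary index, in bytes.

    Assumption: every replica of a record has its own 64-byte entry.
    """
    return n_records * replication_factor * INDEX_ENTRY_BYTES


total = primary_index_bytes(1_000_000_000, replication_factor=2)
print(f"{total / 2**30:.1f} GiB across the cluster")
```

Dividing the total by the number of nodes gives a first estimate of per-node DRAM for the index; the capacity planning guide covers the remaining overheads.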

milotic91 · February 6, 2018, 5:15am

OK, so the primary index, which always lives in memory, does not contain the primary key (the key in my application). It consists of metadata and the digest. Am I right?

rbotzer · February 6, 2018, 5:30am

That’s right. By default Aerospike only stores the digest. There’s a write policy for all the clients which can tell the server to store the key (in your application). It gets stored with the rest of the record’s data, not in the primary index. Since the key can be large (UUIDs are 36 bytes, for example) the default behavior is not to send/store the key.

milotic91 · February 6, 2018, 6:34am

Thank you rbotzer, it is now clear that the primary index does not store keys, it only stores the digest. Is the digest also stored on SSD along with the data?

rbotzer · February 6, 2018, 6:53am

You’re welcome. Yes, the 64B in the primary index entry is also stored on SSD, along with the record data. This is needed for rebuilding the primary index on cold restart.

See the capacity planning guide for more information about storage and memory consumption.
