Is it possible to store only key and no bins for some records?


#1

Currently I am not able to add a record without any bins (version 3.12). Is there a reason why this is not supported?


#2

When all bins are deleted, the record is deleted, so a record must have at least one bin. If you want, you can store your key in a bin. Aerospike does not store your key - rather, it stores a 20-byte hash of the set name (or null) plus your key.


#3

What would be the point of doing that? Are you asking about a way to do ‘set membership’ - not ‘set’ as it’s used in Aerospike (a schema-less table) but rather the data type?


#4

@rbotzer Something similar. Basically, we have a use case where we store identifiers as a persistent cache, and the only query on them is an existence check. So I was thinking of storing the key only.

Other approaches I thought of are duplicating the key in a default bin, or storing a single record that holds a large list. But then I don’t think there is a ‘contains’ query available for a list, right?


#5

@pgupta Yes, I considered that, but then we are effectively storing double the amount of data, correct? (That is, compared with the hypothetical case in which Aerospike supported storing keys without any bins.)


#6

The best way to do this in Aerospike is to leverage the data-in-index configuration. This is a namespace defined as storing its data in-memory, single-bin, with the additional and important fact of the data being stored in the primary index itself (as long as it’s a numeric value, integer or double).
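As a rough sketch of what such a namespace stanza might look like in aerospike.conf for a 3.x server: the namespace name `membership`, the sizes, and the file path below are placeholders I made up, and the exact set of required parameters should be checked against the configuration reference for your server version.

```
namespace membership {
    replication-factor 2
    memory-size 4G
    default-ttl 0              # 0 means records never expire unless a TTL is set on write

    single-bin true            # one unnamed bin per record
    data-in-index true         # keep the numeric value inside the primary index entry

    storage-engine device {
        file /opt/aerospike/data/membership.dat   # placeholder path
        filesize 4G
        data-in-memory true    # data-in-index needs data-in-memory to be true
    }
}
```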

First, this means that you’re spending no additional RAM for storage - it’s overwriting a portion of the 64B metadata entry that is always allocated per-record. This also means that the operation has a much lower latency, since it doesn’t need to read the value from RAM after the server looks up the record’s metadata in the primary index. So, checking for existence can’t get any faster. It’s also a great configuration for counters, since increment operations happen in-place. Lastly, because Enterprise Edition stores the primary index in shared memory, a data-in-index namespace will take advantage of fast restart (its data isn’t in the process memory).

For your use case you’d create the record with the given key and assign it the integer value 1. You can then use the exists method of your client to check for existence. This approach works for things like blacklists, where you also associate a TTL for the record, with the blacklisting holding until the record expires.
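Here is a minimal sketch of that pattern with the Python client. The namespace `membership`, the set `blacklist`, and the helper names are hypothetical. The empty bin name follows the usual convention for single-bin namespaces, and the check relies on `exists()` returning `None` for the metadata when the record is absent, which is how recent client versions behave as far as I know, so treat both as assumptions.

```python
def connect(host="127.0.0.1", port=3000):
    """Connect to a cluster; host and port are placeholders."""
    import aerospike  # imported lazily so the helpers below work with any client-like object
    return aerospike.client({"hosts": [(host, port)]}).connect()


def add_member(client, namespace, set_name, identifier, ttl_seconds=0, bin_name=""):
    """Create the record with the integer value 1, optionally with a TTL."""
    key = (namespace, set_name, identifier)
    bins = {bin_name: 1}  # assumption: single-bin namespaces take an empty bin name
    if ttl_seconds:
        # The record (and so the membership) expires after ttl_seconds.
        client.put(key, bins, meta={"ttl": ttl_seconds})
    else:
        # No TTL given: the namespace default applies.
        client.put(key, bins)


def is_member(client, namespace, set_name, identifier):
    """Existence check only; the stored value is never read."""
    _, meta = client.exists((namespace, set_name, identifier))
    return meta is not None
```

Usage would be along the lines of `client = connect()`, then `add_member(client, "membership", "blacklist", "user-42", ttl_seconds=3600)` followed by `is_member(client, "membership", "blacklist", "user-42")`. Passing a different set name gives you a separate membership check in the same namespace.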


#7

@rbotzer Thanks for explaining. This looks like a good approach for the use case. One more question: AFAIK from the Aerospike docs, data-in-index works only at the namespace level. Is it possible to make it work at the set level by any chance? Basically, I’m asking if we can have some sets in the namespace with actual data and some with only data in index (because I read that data-in-memory needs to be true for data-in-index).


#8

With releases 3.13 and 3.14 you can add a namespace through a rolling upgrade, so it’s not as painful to add one that is data-in-index. It’s not configurable at the set level.


#9

You can have up to 32 namespaces - so segregate out by namespace if needed.


#10

As Piyush pointed out, there’s a limit there, so no need to have many namespaces that are data-in-index. You can set one up, then have multiple sets for ‘set membership’ checks (up to 1023 of those per-namespace).