Most efficient way to implement 2 counters in an Aerospike bin

In one of our services with very strict SLA requirements, we use Aerospike solely as an in-memory store, i.e. RAM only (something like memcached/Redis).

namespace <name> {
memory-size 64G # 64 GB of memory to be used for index and data
replication-factor 1 # no replication
default-ttl 3600 # Writes from client that do not provide a TTL
storage-engine memory # Store data in memory only
}

For each record, we store multiple bins, and each bin is an 8-byte integer used as a counter. The only operations we perform on these bins are writing the initial counter value and incrementing by a delta, where the delta can be a positive or negative integer.

Something like this: increment the counter with as_operations_add_incr(&ops, bin_name, delta) … write the initial value of the counter with as_operations_add_write_int64(&ops, bin_name, initial_value)

Note that calls to as_operations_add_write_int64 are far fewer than calls to as_operations_add_incr.

Ideally, for our requirement, we don’t want integer bins to be 8 bytes (we only need 4-byte counters), but I think Aerospike doesn’t provide 4-byte integer bins (correct me if I am wrong).

Problem:

We have a new requirement for which each bin needs to hold 2 counters instead of 1. Our approach: we can implement the additional counter as an additional integer bin, but that doubles our RAM requirement.

So, we are looking for the optimal solution/data type to store the additional counters, one that doesn’t increase our RAM requirements and is as performant (at runtime) as an integer bin.

We think any CDT, such as a list or map, could be used, but we want the simplest and most performant data type, since our only requirement is to set and increment/decrement the counters, nothing else.

Since 4-byte counters are sufficient for us, we are looking for something equivalent to the BITFIELD command of Redis, and we assume that Aerospike blobs provide that kind of functionality (correct me if our assumption is wrong).

So essentially each bin would store a 64-bit blob, and we would treat the first 32 bits as counter1 and the remaining 32 bits as counter2, using Aerospike’s bit API to write/increment the counters.
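For illustration, the intended layout can be sketched in plain C, independent of the Aerospike client. This is only a minimal model of the idea: the helper names (counter_get, counter_set, counter_add) are hypothetical, and the big-endian byte order and wrap-around on overflow are assumptions of this sketch rather than a statement of how the server-side bit API behaves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not part of the Aerospike client.
 * Assumed layout: counter1 in bytes 0..3, counter2 in bytes 4..7,
 * each stored as a big-endian 32-bit value. */

static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Reinterpret the 32 bits as a signed value without relying on
 * implementation-defined integer conversions. */
static int32_t as_signed(uint32_t v)
{
    int32_t s;
    memcpy(&s, &v, sizeof s);
    return s;
}

/* index 0 selects counter1, index 1 selects counter2. */
int32_t counter_get(const uint8_t *blob, int index)
{
    assert(index == 0 || index == 1);
    return as_signed(load_be32(blob + 4 * index));
}

void counter_set(uint8_t *blob, int index, int32_t value)
{
    assert(index == 0 || index == 1);
    store_be32(blob + 4 * index, (uint32_t)value);
}

/* Adds a positive or negative delta to one counter and returns the new
 * value. Unsigned arithmetic is used so that overflow wraps around
 * (an assumption of this sketch) and never touches the other counter. */
int32_t counter_add(uint8_t *blob, int index, int32_t delta)
{
    assert(index == 0 || index == 1);
    uint32_t next = load_be32(blob + 4 * index) + (uint32_t)delta;
    store_be32(blob + 4 * index, next);
    return as_signed(next);
}
```

With an 8-byte buffer initialised to zero, counter_set(blob, 0, 100) followed by counter_add(blob, 0, -30) leaves counter1 at 70 while counter2 is untouched.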

Our hypotheses are :-

  1. Using an 8-byte blob to maintain 2 counters in a bin would consume the same RAM as an integer bin (which takes 8 bytes and can be used only as a single counter), AND
  2. It would be as performant (at runtime) as incrementing/decrementing/setting an integer bin.

Right now, our team has written some code that uses the above-mentioned bit API and is facing some problems with it, and it turns out that our hypothesis 1 is incorrect.

We need your guidance considering that ideally, we don’t want to increase the RAM requirement to implement this additional counter(per bin).

Can you please help us?

Regards

Kartik

(Note to future readers, this all changes with 7.0+, also different if you are not in-memory. This reply assumes Aerospike 5.4 to 7.0.)

Integers, floats, and bools are referred to as “embedded” types in the code. This means that they use the particle pointer on the as_bin struct to store the data instead of pointing to the data.

In memory, embedded bins have 3 bytes of overhead per bin, so an 8-byte integer bin occupies 11 bytes. In memory, non-embedded bins have 12 bytes of overhead per bin, and the blob particle also has a 1-byte type (to differentiate between blob types) and a 4-byte size (since blobs are variable size). So an 8-byte blob occupies 12 + 1 + 4 + 8 = 25 bytes per bin, while two integer bins occupy 2 × 11 = 22 bytes. Two integer bins therefore occupy 3 fewer bytes than one 8-byte blob.
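The arithmetic above can be written out as a small C sketch. The constants are taken directly from the figures in this reply (in-memory namespaces, versions before 7.0); the function and constant names are made up for illustration and do not come from the Aerospike source.

```c
#include <stddef.h>

/* Per-bin overheads as quoted in this reply; names are illustrative only. */
enum {
    EMBEDDED_BIN_OVERHEAD     = 3,  /* integer, float, bool bins */
    NON_EMBEDDED_BIN_OVERHEAD = 12, /* bins whose particle lives on the heap */
    BLOB_TYPE_BYTES           = 1,  /* distinguishes blob types */
    BLOB_SIZE_BYTES           = 4,  /* blobs are variable size */
    INTEGER_VALUE_BYTES       = 8
};

/* Memory used by one integer bin: overhead plus the embedded 8-byte value. */
size_t integer_bin_bytes(void)
{
    return EMBEDDED_BIN_OVERHEAD + INTEGER_VALUE_BYTES;
}

/* Memory used by one blob bin holding payload_bytes of data. */
size_t blob_bin_bytes(size_t payload_bytes)
{
    return NON_EMBEDDED_BIN_OVERHEAD + BLOB_TYPE_BYTES + BLOB_SIZE_BYTES
           + payload_bytes;
}
```

Under these figures, integer_bin_bytes() is 11 and blob_bin_bytes(8) is 25, so two integer bins (22 bytes) come out 3 bytes smaller than a single 8-byte blob bin.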

Performance-wise, blobs require an extra heap access (since they are not embedded), which may be noticeable in a benchmark.

This and more is described in the capacity planning guide.

Thanks a lot, that explains everything.
