Does Aerospike Community support larger-than-memory data sizes?

I just found out about Aerospike today while looking for a KV store that works reliably with data sizes larger than the available memory/RAM.

I do have a lot of questions, but the primary question is:

  1. Does Aerospike Community support larger-than-memory data sizes?

  2. I read that Aerospike Community Edition supports up to 2 namespaces per cluster - is that correct?

  3. Namespaces have to be configured at the start on whether they primarily reside in RAM or on Flash and this cannot be changed later - is that correct?

  4. If a namespace is configured to primarily reside in RAM, it's, at a very high level, like the Redis use case, in that it CANNOT support larger-than-memory data sizes - is that correct?

  5. If a namespace is configured to primarily reside on Flash, then it can support larger-than-memory data sizes, but the keys and values themselves are always on Flash, although the indices etc. that point to them would be in RAM - is that correct?

  6. For namespaces primarily on Flash, there's no explicit code in Aerospike that caches a "hot subset" of the keys and values that are on Flash into RAM, even if these specific keys and values are accessed very frequently. Any caching of a "hot subset" in RAM is a side effect of the OS making that decision, and the specifics will change from OS to OS - is that correct?

What is the nature of your use case? Can you explain a bit?

Responses inline below:

  • Does Aerospike Community support larger-than-memory data sizes?

Yes, you can store data on NVMe devices and wouldn't be limited by the memory available. There are some limits on the total data size.

  • I read that Aerospike Community Edition supports up to 2 namespaces per cluster - is that correct?

I believe that is correct.

  • Namespaces have to be configured at the start on whether they primarily reside in RAM or on Flash and this cannot be changed later - is that correct?

That is not correct. You can change the storage-engine, and you can do that without downtime, one node at a time, in a rolling fashion. There is actually an article that covers such an example: Changing storage-engine value from device to memory.

  • If a namespace is configured to primarily reside in RAM, it's, at a very high level, like the Redis use case, in that it CANNOT support larger-than-memory data sizes - is that correct?

I am not familiar with Redis. You can configure Aerospike to store data on a persistent device (and have only the index in memory – there are other options for the index as well) and still get performance similar to data in memory (sometimes pretty much the same, depending on the workload type). But if you configure an Aerospike namespace for data in memory, you would not be able to store more than the memory you have available.
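To put rough numbers on the difference between the two setups, here is a small back-of-the-envelope sketch. It assumes a 64-byte primary-index entry per record and a replication factor of 2; the function names, the record count, and the average record size are made up for illustration, and the data-in-memory figure ignores per-record storage overhead, so treat the results as estimates only.

```python
# Rough cluster-wide RAM estimate: index-in-RAM/data-on-flash vs. data-in-memory.
# Assumption: each primary-index entry takes 64 bytes of RAM.
INDEX_ENTRY_BYTES = 64


def index_ram_bytes(records: int, replication_factor: int = 2) -> int:
    """RAM needed when only the primary index lives in memory."""
    return records * replication_factor * INDEX_ENTRY_BYTES


def data_in_memory_bytes(records: int, avg_record_bytes: int,
                         replication_factor: int = 2) -> int:
    """RAM needed when both index and data live in memory (overhead ignored)."""
    return records * replication_factor * (INDEX_ENTRY_BYTES + avg_record_bytes)


if __name__ == "__main__":
    records = 1_000_000_000      # hypothetical: one billion records
    avg_record_bytes = 1024      # hypothetical: ~1 KiB per record

    hybrid = index_ram_bytes(records)
    in_memory = data_in_memory_bytes(records, avg_record_bytes)

    print(f"index only in RAM : {hybrid / 2**30:.1f} GiB")
    print(f"data in memory    : {in_memory / 2**30:.1f} GiB")
```

With the data on a device, the RAM requirement scales with the number of records rather than with the size of the values, which is why the total data size can go well beyond the memory available.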

  • If a namespace is configured to primarily reside on Flash, then it can support larger-than-memory data sizes, but the keys and values themselves are always on Flash, although the indices etc. that point to them would be in RAM - is that correct?

That is typically correct. The index would be in memory (but the Enterprise Edition even allows configuring the index to be on Flash or in Persistent Memory). Even when the data is on Flash, you have options to optimize reads to leverage memory as much as possible (specifically through the post-write-queue and read-page-cache settings).

  • For namespaces primarily on Flash, there's no explicit code in Aerospike that caches a "hot subset" of the keys and values that are on Flash into RAM, even if these specific keys and values are accessed very frequently. Any caching of a "hot subset" in RAM is a side effect of the OS making that decision, and the specifics will change from OS to OS - is that correct?

I guess I preemptively answered this in the previous point, before reading this one. The read-page-cache option is what you would be looking for.
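For reference, a minimal sketch of where those two settings would go in aerospike.conf. The namespace name, device path, and values below are placeholders rather than recommendations, and the rest of the namespace configuration is omitted, so check the configuration reference for your server version before using any of it.

```
namespace test {                  # placeholder namespace name
    replication-factor 2

    storage-engine device {
        device /dev/nvme0n1       # placeholder device path
        post-write-queue 256      # recently written blocks kept in RAM, per device
        read-page-cache true      # let reads go through the OS page cache
    }
}
```

As I understand it, post-write-queue mostly helps when records are read shortly after being written, while read-page-cache is the one that helps with a frequently read "hot subset".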