If I store all data in memory only, do I need to keep each record smaller than write-block-size?


#1

If I don’t persist data to SSD or HDD and keep it only in memory, can I make a record larger than write-block-size? If so, what about the primary and secondary indexes? Do I still need 64 bytes per record for the primary index?


#2

You don’t have a write-block-size if you’re all in memory. That setting goes under the storage-device section. If you have no storage device, then that size limitation does not apply. Yes, you still have the overhead of 64 bytes per record in the primary index. Additional per-record overhead and secondary index sizing are outlined here, and apply equally to an in-memory database: http://www.aerospike.com/docs/operations/plan/capacity
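As a rough illustration of the 64-byte overhead mentioned above, here is a minimal Python sketch of the arithmetic. The function names, the replication factor of 2, and the example record size are assumptions for illustration only. It ignores the per-bin overhead and secondary index sizing covered on the capacity planning page.

```python
# Back-of-the-envelope sizing for an all-in-memory namespace.
# Assumption: only the 64-byte primary index entry and the raw record data
# are counted; other overheads from the capacity planning page are ignored.

PRIMARY_INDEX_BYTES_PER_RECORD = 64  # per-record overhead stated in the reply


def primary_index_bytes(record_count: int, replication_factor: int = 2) -> int:
    """Cluster-wide primary index memory: 64 bytes per record copy."""
    return PRIMARY_INDEX_BYTES_PER_RECORD * record_count * replication_factor


def in_memory_footprint_bytes(
    record_count: int, avg_record_bytes: int, replication_factor: int = 2
) -> int:
    """Primary index plus record data, summed over all copies in the cluster."""
    index = primary_index_bytes(record_count, replication_factor)
    data = record_count * avg_record_bytes * replication_factor
    return index + data


if __name__ == "__main__":
    # Hypothetical workload: 1,000 records of 2 MiB each, replication factor 2.
    total = in_memory_footprint_bytes(1_000, 2 * 1024 * 1024, 2)
    print(f"{total / 1024**3:.2f} GiB across the cluster")
```

With records this large the index overhead is negligible next to the data itself, so the data size dominates the RAM budget.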


#3

If so, I cannot persist the records to disk. But they will stay in memory until a cold start, right?


#4

Data in memory is stored in process RAM in both Community and Enterprise Edition. If the process is stopped, you will lose the data on that node. If you have a replica in the remaining cluster, the data will be re-replicated so that you again have a master and a replica in the remaining cluster, assuming the remaining cluster has enough RAM. When the node rejoins, data will be replicated back into it from the cluster per the new partition map, partition by partition.

However, if you have a namespace configured for single-bin records with data in memory, and the bin type is integer or float, you can additionally specify data-in-index in the namespace configuration. The bytes in the Primary Index (PI) that normally point to the data location in memory are then used to store the data itself. That means that in Enterprise Edition, where the PI is stored in Linux shared memory, you don’t lose the data when the process is stopped and restarted, provided the namespace is single-bin, data-in-memory, data-in-index, and the data type is integer or float.
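The "enough RAM in the remaining cluster" condition above can be estimated with a small Python sketch. This is not how the server decides anything. It is a simplified check that assumes data is spread evenly across nodes and ignores index overhead and any configured memory limits. The function and parameter names are hypothetical.

```python
# Rough check: does the data still fit in RAM after losing a node?
# Assumptions: even distribution across nodes, no index or per-record
# overhead, and the full per-node RAM figure is usable for data.


def fits_after_node_loss(
    total_unique_bytes: int,
    replication_factor: int,
    node_count: int,
    ram_per_node_bytes: int,
    nodes_lost: int = 1,
) -> bool:
    """Return True if the surviving nodes can hold all copies of the data."""
    remaining = node_count - nodes_lost
    if remaining < 1:
        return False
    # The cluster cannot keep more copies than it has nodes.
    copies = min(replication_factor, remaining)
    needed_per_node = total_unique_bytes * copies / remaining
    return needed_per_node <= ram_per_node_bytes


if __name__ == "__main__":
    GiB = 1024**3
    # Hypothetical cluster: 100 GiB of unique data, replication factor 2,
    # 64 GiB of RAM per node.
    print(fits_after_node_loss(100 * GiB, 2, 4, 64 * GiB))  # 4 nodes, 1 lost
    print(fits_after_node_loss(100 * GiB, 2, 5, 64 * GiB))  # 5 nodes, 1 lost
```

In the four-node case each of the three survivors would need about 66.7 GiB, which does not fit in 64 GiB. With five nodes each survivor needs 50 GiB, which does.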


#5

My requirement is to store records larger than 1 MB in RAM, remove all data every day, and store new data the next day. RAM is sufficient for this, and the usage is similar to a memory cache. The data structure is normal: for example, one record has multiple bins and one set has many records. I know the write-block-size does not apply to memory-only storage. But are there any shortcomings or key points to watch for with this solution?


#6

Are you planning to do scans and secondary index queries?


#7

Yes, I want to store all data in RAM but use it as normal, with multiple bins, sets, and secondary indexes. I want to get past the 1 MB limitation, and I would like to know whether this solution has any cons.


#8

I think the only con is that you are dealing with larger chunks of data, and of course the cost of memory…


#9

I believe Secondary Index queries and scans send their results in 1 MB buffers, which is why I asked whether you would be running those. I would explore that aspect further. Maybe the server allocates a larger buffer on the heap, instead of using the pre-allocated buffers, when a record in RAM is larger than 1 MB.


#10

Yes, I see your concern now, thanks.


#11

And if you start moving larger chunks of data, network transfer may become a latency bottleneck. It may be self-defeating.
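To put a number on the network concern, here is a minimal Python sketch of the wire time for a single record. It assumes an otherwise idle link and counts only payload bits, ignoring protocol overhead, serialization, and server processing. The function name and the link speeds are hypothetical.

```python
# Lower-bound wire time for moving one record over the network.
# Assumption: payload bits only, on an idle link with no protocol overhead.


def transfer_time_ms(record_bytes: int, link_gbps: float) -> float:
    """Milliseconds to push record_bytes through a link of link_gbps Gbit/s."""
    bits = record_bytes * 8
    return bits / (link_gbps * 1e9) * 1000


if __name__ == "__main__":
    MiB = 1024 * 1024
    for size_mib in (1, 8):
        for gbps in (1, 10):
            ms = transfer_time_ms(size_mib * MiB, gbps)
            print(f"{size_mib} MiB over {gbps} Gbps: {ms:.2f} ms")
```

Under these assumptions a 1 MiB record takes about 8.4 ms on a 1 Gbps link, and an 8 MiB record about 67 ms, which is far larger than typical in-memory read latency.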