According to the benchmarks I have seen on the web, Aerospike seems to be a good candidate for my use case, but I want to be sure before running any tests.
I want to use Aerospike to store frames (in a binary format) coming from capture equipment used in a telco environment. In addition to the frames, I need to store about 10 extra pieces of information (indexes) extracted from the captured data itself (after decoding the frames). The idea is to use the indexes to search the frames efficiently without having to re-decode all of them.
The maximum performance I need is about 500,000 insertions per second.
I also need to keep about 7 days of frames in the database, which corresponds to 10 TB of data.
Given the huge amount of data I need to store, I'm worried about using Aerospike.
Specifically, does my server need 10 TB of RAM to work correctly? When I read about how Aerospike works, I noticed that all the data goes to memory.
If that is not the case, how do I configure Aerospike to store data "directly" on disk?
Let's start with the fact that Aerospike is a distributed database that supports storage of data either in memory or on SSD. A namespace (think of it as similar to both a database and a tablespace in the RDBMS world) is configured to have all its records stored in either one or the other (you can have multiple namespaces). Customers with large data sets tend to store the majority of their data on SSD. Purely in-memory storage is much more expensive per GB and requires many more nodes because of its limited data density (only so much DRAM can fit on a single server, compared to the amount of SSD storage space that can be directly attached).
Fact number two, every record has an entry in the primary-index, which is always in memory. In the Community Edition of the server the primary-index is in the RAM used by asd (the server process). In the Enterprise Edition of the server the primary-index is in shared-memory. Each one of these entries costs 64B, which allows for straight-forward capacity planning.
In order to size your cluster you need to know how many objects you'll have, and their average size. The number of objects * 64B is your baseline DRAM usage. If you know the average size of those objects and the bins needed to hold the data and metadata you described, you can use the capacity planning guide to figure out how much per-record SSD storage you'll need.
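As a rough illustration of the "number of objects * 64B" rule, here is a minimal Python sketch. It assumes, only for the sake of an upper bound, that the peak rate from the question (500,000 insertions per second) is sustained for the whole 7-day retention window; the real average rate is not given in this thread. The function names are made up for this example, and multiplying by the replication factor is an assumption about how replicas are counted that should be checked against the capacity planning guide.

```python
PRIMARY_INDEX_ENTRY_BYTES = 64  # per-record primary-index cost stated above
SECONDS_PER_DAY = 86_400


def record_count(inserts_per_second: int, retention_days: int) -> int:
    """Records kept if the insert rate is sustained for the whole retention window."""
    return inserts_per_second * SECONDS_PER_DAY * retention_days


def primary_index_bytes(records: int, replication_factor: int = 1) -> int:
    """Cluster-wide DRAM for the primary index (assumes one entry per replica)."""
    return records * PRIMARY_INDEX_ENTRY_BYTES * replication_factor


# Upper bound using the peak rate from the question, with a single copy of the data.
records = record_count(500_000, 7)
print("records:", records)
print("primary index bytes:", primary_index_bytes(records))
```

With a lower average rate the result shrinks proportionally, so the first number to pin down is the realistic record count per day.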
Fact three, secondary indexes, which are needed to support queries, are also in-memory and cost extra in DRAM storage. The capacity planning guide explains how to size them. There are modeling techniques to allow you to implement searching using lists and batch-reads, instead of using secondary-index queries. Still, the records in those external sets also cost in primary-index entries and data storage. You'll have to balance size vs. speed and take into account operational considerations. Currently having a secondary index on a namespace precludes it from being able to use the EE fast restart feature.
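To make the "lists and batch-reads" idea more concrete, here is a small Python model of it. It deliberately uses plain dictionaries as stand-ins for Aerospike sets and does not call any Aerospike client API; the field names (`imsi`, `cell_id`) and function names are hypothetical. Each lookup record holds a list of frame keys for one indexed value, and a search is one lookup-record read followed by one batch read of the frames.

```python
from collections import defaultdict

# Stand-in for the set that holds the frame records: key -> bins.
frames = {}
# Stand-in for an external lookup set: (field name, value) -> list of frame keys.
lookup = defaultdict(list)


def insert_frame(key, raw, fields):
    """Store the frame and append its key to one lookup list per indexed field."""
    frames[key] = {"raw": raw, **fields}
    for name, value in fields.items():
        lookup[(name, value)].append(key)


def search(name, value):
    """Read one lookup record, then fetch all referenced frames in one batch."""
    keys = lookup.get((name, value), [])
    return [frames[k] for k in keys]  # models a single batch-read


insert_frame("f1", b"\x01\x02", {"imsi": "208011234567890", "cell_id": 42})
insert_frame("f2", b"\x03\x04", {"imsi": "208011234567890", "cell_id": 7})
print(len(search("imsi", "208011234567890")))
```

Note that every entry in the lookup set is itself a record, so it also consumes a primary-index entry and storage, which is the trade-off mentioned above.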
Thanks for the answer.
Based on what you said, I have one more question:
If I want to use a "standard server" without any SSD, is it possible to use HDDs instead, even if the performance will be poor? If so, how do I configure Aerospike to run a benchmark?
SSDs are pretty standard these days. The namespace storage configuration article shows you a recipe for how to set up a namespace that is data-in-memory with the persistence layer on an HDD. If you take out the data-in-memory true part, you'll have the HDD as the primary storage. This mode really should not be used, as it performs very poorly.
OK, but is there a way to "never" use data-in-memory, i.e. to also store the index on HDD?
I thought that setting "data-in-memory false" would do the job, but I see that memory is still used.
If you remove the data-in-memory line you will get your data saved to the HDD. What will be using memory is the primary index. As you read the links I gave earlier, you know that each record costs 64B in that in-memory index.
If you insist on your database running any slower, I'd suggest MySQL.
I just have a strong HW constraint (a "standard server" without SSD, 64 GB of RAM and 24 cores, but lots of HDDs). So I am just wondering what the performance could be with such specific HW.
Furthermore, keeping the primary index in memory is also a problem for me given the amount of data I need to store (30 days of history): 64 GB is not enough.
That is why I asked whether there is a way to also store the primary index on HDD.
Well, every database has its strengths. Aerospike is a distributed key-value store. Being able to access single records and batches of records with low latency at scale, on a system that can easily grow horizontally (more nodes) and vertically (make use of more powerful hardware), is its main strength. Placing the primary index on disk goes against that.
Your hardware is actually not that constrained. It has a decent amount of memory and lots of cores per node, so it would be good for namespaces that are data-in-memory, with a persistence layer on your HDD. You have a recipe for that config in the namespace storage configuration article I linked to before. Plenty of people use this mode.
A sizing calculation will show you how many nodes you would need for your use case.
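A back-of-the-envelope version of that sizing calculation for a data-in-memory namespace might look like the following Python sketch. It is only an estimate under stated assumptions: it counts 64 bytes of primary index per record plus the average record size, multiplies both by the replication factor, and ignores any per-record in-memory overhead. The `usable_pct` parameter (the share of each node's RAM you are willing to fill) and the example numbers are hypothetical, not values taken from this thread or from Aerospike defaults.

```python
def nodes_needed(records, avg_record_bytes, replication_factor,
                 node_ram_bytes, usable_pct, data_in_memory=True):
    """Estimate the node count from DRAM alone (index plus optional in-memory data)."""
    index_bytes = records * 64 * replication_factor
    data_bytes = records * avg_record_bytes * replication_factor if data_in_memory else 0
    usable_per_node = node_ram_bytes * usable_pct // 100
    total = index_bytes + data_bytes
    return -(-total // usable_per_node)  # integer ceiling division


# Hypothetical example: 1 billion records of 100 bytes, two copies,
# 64 GB nodes filled to at most 50%.
print(nodes_needed(1_000_000_000, 100, 2, 64 * 10**9, 50))
print(nodes_needed(1_000_000_000, 100, 2, 64 * 10**9, 50, data_in_memory=False))
```

The second call models a namespace whose data lives only on the persistence device, so only the primary index counts against DRAM.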
Hi rbotzer, what if we set storage-engine->device->file to a path on an SSD rather than on an HDD? Considering we want to preserve the rest of the SSD for other applications, are there any potential problems with such a deployment?
Looking forward to your reply, thanks.
There is a performance overhead for using the file system. It is also more difficult to determine what performance can be sustained over long periods of time when using a file system because of the file system's cache.
Instead you could partition the SSD and have Aerospike use one of the partitions.
If any of the other applications make heavy use of the SSD, they will also impact the performance of Aerospike (drive IO is a limited resource).
Thank U~
Partitioning the SSD sounds reasonable. One more question: what did you mean by the overhead of using the file system? Are there any details or examples?
However, I think the file system's page cache is always a good dynamic, partial in-memory cache, and to my knowledge Aerospike does not provide such a cache. Even if, in the worst case, the whole page cache is dropped, we just fall back to nearly direct device operation.
A raw partition on a UNIX system is a part of the disk where there is no file system. Although Aerospike can use UNIX files for database devices (say on file systems created on SSD0), you will still have UNIX buffering here.
Most UNIX systems use a buffer cache for disk I/O. Writes to disk are stored in the buffer and may not be written to disk immediately. If Aerospike completes a transaction and sends the result to a UNIX file, the transaction is considered complete even though the UNIX buffer cache may not have been written to disk. If the system crashes before this buffer cache is written, you lose data. In this situation, you have no way of knowing that the write to disk eventually failed. In addition, some UNIX operating systems do partial writes. In that case, if the system crashes, the underlying device may be corrupted.
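The difference between a buffered write and a write that is forced to stable storage can be illustrated at the OS level with a short Python sketch. This is not how Aerospike implements its I/O; it only shows the general UNIX behavior described above, and the function names are made up for this example. `O_DIRECT` is left out on purpose because it requires aligned buffers, and `O_DSYNC` is applied only where the platform exposes it.

```python
import os
import tempfile


def write_buffered(path, payload):
    """Ordinary write: after close() the data may still sit in the OS buffer cache."""
    with open(path, "wb") as f:
        f.write(payload)


def write_durable(path, payload):
    """Write that returns only after the OS reports the data flushed to the device."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(payload)
        while len(view) > 0:          # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                  # explicit flush, also covers platforms without O_DSYNC
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    write_buffered(os.path.join(tmp, "buffered.bin"), b"frame-1")
    write_durable(os.path.join(tmp, "durable.bin"), b"frame-2")
```

Both files read back identically while the machine is running; the difference only shows up if the system crashes before the buffer cache is flushed, which is exactly the case described above.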
As I understand it, using raw partitions for Aerospike on SSD will make the process more performant and also allow Aerospike to handle its own I/O requests, without having to go through the UNIX buffering scheme.
It is possible at the device creation stage to make sure that buffered writes are disallowed by setting a property on the device. I have not seen such a feature in Aerospike yet.
That is effectively synchronous writes and, as always, will incur a performance penalty akin to two-phase commit. I don't think (and please correct me if I am wrong) it prevents write buffering. The link provided states, and I quote:
"Wait for write to flush to disk before acknowledging the client. Only available for strong-consistency enabled namespaces."
So there is no mention of bypassing the write buffer. All it says is to wait for acknowledgement that the data is written to disk.
Going back to the point I raised: I was referring to the namespace attributes in the conf file, like below.
namespace somenamespace {
    replication-factor 2
    memory-size xG
    storage-engine device {
        file /ssd/aerospike/data/x.dat
        filesize xG
        data-in-memory true # Store data in memory in addition to file.
    }
}
In addition, I would expect a parameter to specify that direct I/O (DIRECTIO) should be enabled for this namespace, since I have created this device on an SSD partition.
Aerospike uses O_DIRECT (and by default O_DSYNC) for reads/writes to devices and direct-files. The commit-to-device option in strong-consistency prevents Aerospike from buffering writes (see max-write-cache).
Since you are using a file, I believe you are looking for the "direct-files" option.