Does write cache on RAID controller matter for SSD?


#1

When using spinning hard drives for a database system, I believe the conventional wisdom is to use a RAID controller with a (battery/flash-backed) write cache (BBWC/FBWC) in write-back mode.

Am I correct that, for SSDs on an LSI RAID controller with FastPath, the best approach is to turn the write cache off by using "write through" mode? If so, there is basically no need to buy hardware with more write cache, since it won't be used at all.

Reference: http://discuss.aerospike.com/t/are-there-any-special-settings-for-lsi-raid-controllers/141


#2

mingfai,

What you are saying is correct. Keep in mind that Aerospike was designed to be quite different from the way traditional databases work. Aerospike expects tons of random data access, so caching often does not help and may introduce more issues than it is worth. This is especially true when you look at the tiny caches available on the RAID controllers vs. the overall capacity of the SSDs.

What we at Aerospike have found is that the LSI FastPath not only helps with the performance of individual SSDs (see the LSI FastPath product brief: http://www.lsi.com/downloads/Public/Advanced%20Software/LSI%20MegaRAID%20FastPath%20Software/LSI-PB-MR-FastPath.pdf), but also with the performance of using multiple drives in parallel. This has proven true on many of the high-end RAID controllers from Dell, IBM, and HP that use LSI chipsets.

That all being said, there are new generations of "RAID" controllers that allow fast individual access to disks. These were generally intended for use with Hadoop, but Aerospike's testing has shown that they generally work well for Aerospike DB too.

Please let us know if you do any testing and have any comparison results to report. If you can confirm what we report, then that is another independent data point. If not, we would really like to find out what happened and come to an understanding of it.


#3

Thanks. I'm just getting quotations for hardware, and probably won't have the luxury of comparing or benchmarking different hardware.

Re: write cache, I think a better explanation is that the tiny cache exists to batch write requests in a queue, but SSD random writes are so fast that using the cache doesn't improve throughput.


#4

With the hardware you get, you can try both caching strategies, though that may not be worth your time. What you say is a good explanation for this as well.

Please let us know if we can help. One thing to be a bit careful about: in most cases FastPath requires a commercial license. It is normally fairly cheap, but obtaining it can take a while.


#5

By the way, maybe you could consider adding a "RAID controller" section at http://www.aerospike.com/docs/operations/plan/hardware/


#6

This is a good idea. We wanted to include one before, but the problem was that we did not have many tests on different RAID devices at the time. There is now more we can say, so we will try to add that section soon.


#7

mingfai,

I think the answer is a little different, although the result is the same.

Aerospike has a very effective system for batching writes (128K to 1M blocks, depending on configuration), and you can configure caching of those writes as well.
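For reference, that block size is controlled in the namespace's storage stanza of `aerospike.conf`. A minimal sketch (the device path and values here are illustrative, not a recommendation):

```
namespace test {
    storage-engine device {
        device /dev/sdb        # illustrative device path
        write-block-size 128K  # size of the coalesced write block
    }
}
```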

We have found some applications and customer installations where the write cache is very effective - that's why we offer it at the database level - but the working set size can be hard to predict, so we recommend not counting on cache effects when sizing.

It will always be more effective to use the database's own mechanisms for write coalescing and write-cache management rather than pushing that work down to the storage system. At the database layer, we can use transaction semantics to decide whether or not to buffer, and we can continually adapt to interesting device features (like transactional memory blocks and SSDs with write capacitors).
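The write-coalescing idea can be sketched in a few lines. This is a toy illustration of the general technique, not Aerospike's implementation; the class name, block size, and API are invented for the example:

```python
class WriteCoalescer:
    """Toy sketch: buffer small records and flush them to the device as
    one large block, the way a database coalesces writes instead of
    relying on a RAID controller's cache to do it."""

    def __init__(self, block_size=128 * 1024):
        self.block_size = block_size
        self.buf = bytearray()
        self.flushed = []  # stands in for actual device writes

    def put(self, record: bytes):
        self.buf += record
        if len(self.buf) >= self.block_size:
            self.flush()

    def flush(self):
        if self.buf:
            # one large, mostly sequential write instead of many tiny ones
            self.flushed.append(bytes(self.buf))
            self.buf = bytearray()
```

Many small `put()` calls turn into a handful of large block writes, which is exactly the batching the RAID cache would otherwise be trying to provide.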

It is common for high performance databases to desire as much direct hardware control as possible - these are old tricks.

With Aerospike, the extra work done by the RAID layer serves no purpose: at best it is neutral; at worst, there are corner cases that lead to a wider latency distribution, extra memory management, and internal bus traffic on the RAID card. On some RAID cards, we have observed significant performance bottlenecks even in JBOD mode.

Lower level (kernel and device) cache optimizations are effective when the application can’t be re-written, when multiple applications access the same data, or when you need a reliable single-server storage system. RAID optimizations have their place - just not with Aerospike, which uses distribution for HA, and is the only process managing that data (and thus the locks and consistency).

High-performance databases are the reason RAID controller vendors created features like "FastPath". These effects are more noticeable at the higher throughput of flash/SSD: at higher speeds, the overhead of the RAID cache algorithms becomes a larger percentage of overall transaction time. The need for "FastPath" is linked to the rise of SSDs.

Thus, this isn't really an issue about SSDs, but it is made worse by SSD speeds. Direct control (like O_DIRECT and O_SYNC) is common for high-performance databases - the database can cache better than the storage layer.
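As a minimal illustration of that kind of direct control, the sketch below opens a file with O_SYNC and, where the platform and filesystem support it, O_DIRECT (which requires block-aligned buffers, here provided via `mmap`). The function name and block size are assumptions for the example, not anything from Aerospike:

```python
import mmap
import os

ALIGN = 4096  # typical logical block size; O_DIRECT requires aligned I/O


def write_sync(path, payload):
    """Write one block with O_SYNC durability, bypassing the page cache
    (O_DIRECT) where the OS and filesystem allow it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    direct = getattr(os, "O_DIRECT", 0)  # not available on all platforms
    try:
        fd = os.open(path, flags | direct)
    except OSError:
        # some filesystems (e.g. tmpfs) reject O_DIRECT; fall back
        fd = os.open(path, flags)
    try:
        # mmap-backed buffers are page-aligned, satisfying O_DIRECT rules
        buf = mmap.mmap(-1, ALIGN)
        buf.write(payload[:ALIGN].ljust(ALIGN, b"\0"))
        return os.write(fd, buf)
    finally:
        os.close(fd)
```

With flags like these, the kernel and controller caches stay out of the write path, and the database's own buffering decides what is batched and what is durable.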

It is ironic that someone might buy an expensive RAID controller, then have to pay even more money to turn off the RAID features (fastpath).