Does write cache on RAID controller matter for SSD?


#1

When using spinning hard drives for a database system, I believe the conventional wisdom is to use a RAID controller with a (battery/flash-backed) write cache (BBWC/FBWC) in write-back mode.

Am I correct that, for SSDs on an LSI RAID controller with FastPath, the best approach is to turn the write cache off by using "write through" mode? If so, there is basically no need to buy hardware with more write cache, since it won't be used at all.

Reference: http://discuss.aerospike.com/t/are-there-any-special-settings-for-lsi-raid-controllers/141


#2

mingfai,

What you are saying is correct. Keep in mind that Aerospike was designed to be quite different from the way traditional databases work. Aerospike expects tons of random data access, so caching often does not help and may introduce more issues than it is worth. This is especially true when you look at the tiny caches available on the RAID controllers vs. the overall capacity of the SSDs.

What we at Aerospike have found is that the LSI FastPath not only helps with the performance of individual SSDs (see the LSI FastPath product brief: http://www.lsi.com/downloads/Public/Advanced%20Software/LSI%20MegaRAID%20FastPath%20Software/LSI-PB-MR-FastPath.pdf), but also with the performance of using multiple drives in parallel. This has proven true on many of the high-end RAID controllers from Dell, IBM, and HP that use LSI chipsets.

That all being said, there are new generations of "RAID" controllers that allow fast individual access to disks. These were generally intended for use with Hadoop, but Aerospike's testing has shown that they generally work well for Aerospike DB too.

Please let us know if you do any testing and have any comparison results to report. If you can confirm what we report, then that is another independent data point. If not, we would really like to find out what happened and come to an understanding of it.


#3

Thanks. I'm just getting quotations for hardware, and probably won't have the luxury of comparing or benchmarking different hardware.

Re: write cache, I think a better explanation is that the tiny cache exists to batch write requests in a queue, but SSD random writes are so fast that using the cache doesn't improve throughput.


#4

With the hardware you get, you can try both caching strategies, though that may not be worth your time. What you say is a good explanation for this as well.

Please let us know if we can help. One thing to be a bit careful about: in most cases FastPath requires a commercial license. It is normally fairly cheap, but obtaining it can take a while.


#5

By the way, maybe you could consider adding a "RAID controller" section at http://www.aerospike.com/docs/operations/plan/hardware/


#6

This is a good idea. We wanted to include one before, but the problem was that we did not have many tests on different RAID devices at the time. There is now more we can say, so we will try to add that section soon.


#7

mingfai,

I think the answer is a little different, although the result is the same.

Aerospike has a very effective system for batching writes (128K to 1M blocks, depending on configuration), and you can configure caching of those writes as well.
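For reference, that block size is controlled in the namespace's storage stanza of `aerospike.conf`. A minimal sketch (the device path and values here are illustrative, not a recommendation):

```
namespace test {
    storage-engine device {
        device /dev/sdb        # illustrative device path
        write-block-size 128K  # size of the coalesced write block
    }
}
```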

We have found some applications and customer installations where the write cache is very effective - that's why we offer it at the database level - but the working set size can be hard to predict, so we recommend not counting on cache effects when sizing.

It will always be more effective to use the database's own mechanisms for write coalescing and write-cache management rather than pushing that work down to the storage system. At the database layer, we can use transaction semantics to decide whether or not to buffer, and we can continually adapt to interesting device features (like transactional memory blocks and SSDs with write capacitors).
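The write-coalescing idea can be sketched in a few lines. This is a toy illustration of the general technique, not Aerospike's implementation; the class name, block size, and API are invented for the example:

```python
class WriteCoalescer:
    """Toy sketch: buffer small records and flush them to the device as
    one large block, the way a database coalesces writes instead of
    relying on a RAID controller's cache to do it."""

    def __init__(self, block_size=128 * 1024):
        self.block_size = block_size
        self.buf = bytearray()
        self.flushed = []  # stands in for actual device writes

    def put(self, record: bytes):
        self.buf += record
        if len(self.buf) >= self.block_size:
            self.flush()

    def flush(self):
        if self.buf:
            # one large, mostly sequential write instead of many tiny ones
            self.flushed.append(bytes(self.buf))
            self.buf = bytearray()
```

Many small `put()` calls turn into a handful of large block writes, which is exactly the batching the RAID cache would otherwise be trying to provide.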

It is common for high performance databases to desire as much direct hardware control as possible - these are old tricks.

With Aerospike, the extra work done by the RAID layer serves no purpose: at best it is neutral; at worst, there are corner cases that lead to a wider latency distribution, extra memory management, and internal bus traffic on the RAID card. On some RAID cards, we have observed significant performance bottlenecks even in JBOD mode.

Lower level (kernel and device) cache optimizations are effective when the application can’t be re-written, when multiple applications access the same data, or when you need a reliable single-server storage system. RAID optimizations have their place - just not with Aerospike, which uses distribution for HA, and is the only process managing that data (and thus the locks and consistency).

High-performance databases are the reason RAID controller vendors created features like "FastPath". These effects are more noticeable at the higher throughput of flash/SSD: at higher speeds, the overhead of the RAID cache algorithms becomes a larger percentage of overall transaction time. The need for "FastPath" is linked to the rise of SSDs.

Thus, this isn't really an issue about SSDs, but it is made worse by SSD speeds. Direct control (like O_DIRECT and O_SYNC) is common for high-performance databases - the database can cache better than the storage layer.
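As a minimal illustration of that kind of direct control, the sketch below opens a file with O_SYNC and, where the platform and filesystem support it, O_DIRECT (which requires block-aligned buffers, here provided via `mmap`). The function name and block size are assumptions for the example, not anything from Aerospike:

```python
import mmap
import os

ALIGN = 4096  # typical logical block size; O_DIRECT requires aligned I/O


def write_sync(path, payload):
    """Write one block with O_SYNC durability, bypassing the page cache
    (O_DIRECT) where the OS and filesystem allow it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    direct = getattr(os, "O_DIRECT", 0)  # not available on all platforms
    try:
        fd = os.open(path, flags | direct)
    except OSError:
        # some filesystems (e.g. tmpfs) reject O_DIRECT; fall back
        fd = os.open(path, flags)
    try:
        # mmap-backed buffers are page-aligned, satisfying O_DIRECT rules
        buf = mmap.mmap(-1, ALIGN)
        buf.write(payload[:ALIGN].ljust(ALIGN, b"\0"))
        return os.write(fd, buf)
    finally:
        os.close(fd)
```

With flags like these, the kernel and controller caches stay out of the write path, and the database's own buffering decides what is batched and what is durable.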

It is ironic that someone might buy an expensive RAID controller, then have to pay even more money to turn off the RAID features (fastpath).