Benchmarking SSDs with Flexible I/O Tester (fio)



Understanding the performance of server resources is important. Before benchmarking a use case or running ACT tests, the Linux command fio can be used to establish a performance baseline for the SSDs, independent of the database. The performance tests should run on the SSDs for at least 6 hours, and preferably 24 hours, to characterize the performance of the drive. Before starting the performance test, pre-warm the SSDs using the dd command.

dd if=/dev/zero of=/dev/xvdb bs=1M &

After pre-warming the SSD, the fio command can be run to test the drive. Below is an example of the fio command.

fio --filename=/dev/xvdf --direct=1 --rw=randwrite --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=128k --rate_iops=1280 --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest

This command runs a write-only workload of 128K block writes, capped at 1280 IOPS, for 24 hours. The example uses a single process with an I/O queue depth of 16. If the SSD is not overloaded, the iodepth should equal the average queue depth reported by the iostat command.
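One way to read the average queue depth out of iostat is sketched below. This is a minimal sketch under assumptions: avg_queue_depth is a hypothetical helper, not part of fio or sysstat, and it assumes the extended iostat report labels the queue-depth column avgqu-sz (older sysstat releases) or aqu-sz (newer ones).

```shell
#!/bin/sh
# avg_queue_depth: hypothetical helper that reads `iostat -x` output on stdin
# and prints the queue-depth column for one device.
# Assumption: the column is named avgqu-sz or aqu-sz, depending on the
# sysstat version installed.
avg_queue_depth() {
    dev=$1
    awk -v dev="$dev" '
        # Header row: remember which column holds the queue depth.
        $1 ~ /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++)
                if ($i == "avgqu-sz" || $i == "aqu-sz") col = i
            next
        }
        # Data row for the requested device: print that column.
        col && $1 == dev { print $col }
    '
}

# Example usage (not run here): take the last of two 5-second samples.
#   iostat -x 5 2 | avg_queue_depth xvdf | tail -n 1
```

The first iostat report covers the time since boot, so the usage line keeps only the last sample.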

The filename parameter names the device under test, which is /dev/xvdf in the example command; use the same device that was pre-warmed with dd. The benchmark sets the direct option to 1 to use direct I/O instead of buffered I/O.

The run duration can be set as a time in seconds, as in the example, or it can be bounded by the total amount of data transferred. The size option sets that amount.

You can specify the rate as throughput (bytes per second) instead of IOPS by using the rate option in place of the rate_iops option.

For example:

--rate=167772160

If neither the rate nor the rate_iops option is specified then fio will run at its maximum possible throughput for the parameters selected.
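The value 167772160 in the rate example is the byte-rate equivalent of the earlier settings: 1280 IOPS at a 128K block size. A minimal sketch of the conversion follows; iops_to_rate is a hypothetical helper for illustration, not a fio feature, and it assumes the block size is given in KiB.

```shell
#!/bin/sh
# iops_to_rate: hypothetical helper that converts an IOPS cap and a block
# size in KiB into a bytes-per-second value suitable for --rate.
iops_to_rate() {
    iops=$1
    bs_kib=$2
    echo $(( iops * bs_kib * 1024 ))
}

iops_to_rate 1280 128   # prints 167772160
```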

Fio provides options for varying the workloads applied to the SSDs. You can run random reads, random writes, or a combination of both. Sequential reads and writes can also be configured. However, for our purposes, random reads and writes provide a better measure of performance.

First select the type of operations by setting the rw option.

--rw=randread
--rw=randrw   
--rw=randwrite

If the operation type is a combination of random reads and writes (randrw), the read/write mix must also be set. The mix can be set with either of two options.

--rwmixwrite=50      Add this if randrw.  This is a 50/50 read/write mix.

or

--rwmixread=80      This is an 80/20 read/write.

Running different workload profiles against the drive will provide the baseline characteristics for reads, writes or a combination of reads and writes. Here are a few more examples of fio for different profiles.

50/50 Reads/Writes

fio --filename=/dev/xvdf --direct=1 --rw=randrw --rwmixwrite=50 --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=128k --rate_iops=1280 --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest

95/5 Reads/Writes

fio --filename=/dev/xvdf --direct=1 --rw=randrw --rwmixwrite=5 --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=128k --rate_iops=1280 --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest
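Because the profiles differ only in the rw and mix options, the command lines can be generated rather than typed by hand. The sketch below only prints the commands and does not run fio; build_fio_cmd is a hypothetical helper, and it expresses the mix with rwmixread, so a read percentage of 95 corresponds to rwmixwrite=5.

```shell
#!/bin/sh
# build_fio_cmd: hypothetical helper that prints (does not execute) the fio
# command for a device and a read percentage between 0 and 100.
# All fio options are copied from the example commands in this post.
build_fio_cmd() {
    dev=$1        # device under test, for example /dev/xvdf
    read_pct=$2   # share of reads in the mix
    case $read_pct in
        100) mode="--rw=randread" ;;
        0)   mode="--rw=randwrite" ;;
        *)   mode="--rw=randrw --rwmixread=$read_pct" ;;
    esac
    echo "fio --filename=$dev --direct=1 $mode --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=128k --rate_iops=1280 --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest"
}

# Print one command per profile: read-only, 95/5, 50/50, write-only.
for pct in 100 95 50 0; do
    build_fio_cmd /dev/xvdf "$pct"
done
```

Printing the command first makes it easy to review the target device before running a test that overwrites it.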

Sample fio Output

After running the benchmark, fio prints summary statistics for the test.

Here is a sample output, taken from a short 90-second run:

fio-2.1.3
Starting 1 process
Jobs: 1 (f=1), CR=1280/0 IOPS: [w] [8.8% done] [0KB/85888KB/0KB /s] [0/671/0 iops] [eta 01m:23s]
Jobs: 1 (f=1), CR=1280/0 IOPS: [w] [100.0% done] [0KB/85504KB/0KB /s] [0/668/0 iops] [eta 00m:00s]
benchtest: (groupid=0, jobs=1): err= 0: pid=1687: Mon Mar 28 06:41:30 2016
write: io=7510.7MB, bw=85453KB/s, iops=667, runt= 90001msec
slat (usec): min=9, max=45, avg=13.91, stdev= 1.09
clat (usec): min=370, max=92266, avg=2954.50, stdev=13764.84
 lat (usec): min=384, max=92280, avg=2968.69, stdev=13764.82
clat percentiles (usec):
 |  1.00th=[  494],  5.00th=[  510], 10.00th=[  516], 20.00th=[  516],
 | 30.00th=[  516], 40.00th=[  516], 50.00th=[  516], 60.00th=[  516],
 | 70.00th=[  524], 80.00th=[  524], 90.00th=[  564], 95.00th=[  876],
 | 99.00th=[82432], 99.50th=[82432], 99.90th=[82432], 99.95th=[91648],
 | 99.99th=[91648]
bw (KB  /s): min=73676, max=94976, per=100.00%, avg=85551.88, stdev=2720.73
lat (usec) : 500=1.78%, 750=92.80%, 1000=0.94%
lat (msec) : 2=1.37%, 4=0.08%, 10=0.03%, 50=0.01%, 100=2.99%
cpu          : usr=2.55%, sys=0.64%, ctx=60156, majf=0, minf=27
IO depths    : 1=0.1%, 2=99.9%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
 submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 issued    : total=r=0/w=60085/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
WRITE: io=7510.7MB, aggrb=85453KB/s, minb=85453KB/s, maxb=85453KB/s, mint=90001msec, maxt=90001msec

Disk stats (read/write):
xvdb: ios=0/60081, merge=0/0, ticks=0/179076, in_queue=179148, util=99.95%
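The figures in the sample output can be cross-checked against each other. The sketch below recomputes the reported IOPS two ways, from bandwidth divided by block size and from the issued write count divided by the runtime. Both helper names are hypothetical, the inputs are copied from the sample output, and shell integer arithmetic truncates the result.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking a fio summary.
# iops_from_bw: bandwidth in KB/s divided by block size in KB.
iops_from_bw() {
    echo $(( $1 / $2 ))
}
# iops_from_count: completed I/Os divided by runtime given in milliseconds.
iops_from_count() {
    echo $(( $1 * 1000 / $2 ))
}

iops_from_bw 85453 128        # bw=85453KB/s at bs=128k, prints 667
iops_from_count 60085 90001   # w=60085 over runt=90001msec, prints 667
```

Both results agree with the reported iops=667. That is well below the 1280 IOPS cap, which, together with util=99.95%, suggests the device rather than the rate limit was the bottleneck in this run.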

The clat (completion latency) is the most useful latency information in the output. Completion latency is the time between submission of the I/O to the kernel and its completion. Fio reports clat percentiles as well as the min, max, average, and standard deviation. A more detailed write-up of the output format can be found here:

FIO Output
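To compare runs, the average completion latency can be pulled out of saved fio output. The following is a minimal sketch, assuming the fio 2.x text layout shown in the sample output; clat_avg is a hypothetical helper, not part of fio, and other fio versions may format this line differently.

```shell
#!/bin/sh
# clat_avg: hypothetical helper that reads fio text output on stdin and
# prints the avg= value from each "clat (unit):" summary line.
# The unit (usec or msec) appears in parentheses on that line and is not
# converted here. The "clat percentiles" line is deliberately not matched.
clat_avg() {
    awk '
        /^ *clat \(/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^avg=/) {
                    v = $i
                    sub(/^avg=/, "", v)   # drop the key
                    sub(/,$/, "", v)      # drop the trailing comma
                    print v
                }
        }
    '
}

# Example usage (file name is illustrative):
#   clat_avg < benchtest.log
```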

Some performance optimizations that should be considered while benchmarking the drives are partitioning the SSDs and overprovisioning the drives.

The instructions for overprovisioning SSDs can be found here:

SSD Setup

Some additional information regarding fio can be found here:

FIO WIKI

FIO HOWTO