• 作成:

fioによるbcacheシステムのベンチマーク, あまりbcacheに期待しないほうが良い

  • ファイルシステム: Btrfs, マウントオプションは/dev/bcache0 on / type btrfs (rw,noatime,compress=zstd,ssd,space_cache,subvolid=5,subvol=/)
  • SSD: Samsung SM961
  • HDD: TOSHIBA DT01ACA3

bcacheが正常に効いているのか確認するため, キャッシュデバイス無効有効両方でベンチマークを取ることにします.

本当はbcacheを使わない純粋なSSDでのベンチマークもするべきなのでしょうが, 普段使いのPCなので環境が構築するのが大変面倒なのでやれません.

一応公称値としてはSM961はシーケンシャル:読み出し3,100MB/s、書き込み1,400MB/sらしいです.

hdparm -Ttを使った簡易的なSSD, HDD, bcacheデバイスのベンチマークも載せておきます.

2018-01-18T02:00:11 ncaq@strawberry/pts/0(0) ~/Documents/archive/2018-01
% sudo hdparm -Tt /dev/nvme0n1p5

/dev/nvme0n1p5:
 Timing cached reads:   18616 MB in  2.00 seconds = 9316.29 MB/sec
 Timing buffered disk reads: 7206 MB in  3.00 seconds = 2401.50 MB/sec
2018-01-18T02:02:41 ncaq@strawberry/pts/0(0) ~/Documents/archive/2018-01
% sudo hdparm -Tt /dev/sda2

/dev/sda2:
 Timing cached reads:   19080 MB in  2.00 seconds = 9549.13 MB/sec
 Timing buffered disk reads: 558 MB in  3.00 seconds = 185.75 MB/sec
2018-01-18T02:02:59 ncaq@strawberry/pts/0(0) ~/Documents/archive/2018-01
% sudo hdparm -Tt /dev/bcache0

/dev/bcache0:
 Timing cached reads:   18700 MB in  2.00 seconds = 9358.02 MB/sec
 Timing buffered disk reads: 606 MB in  3.00 seconds = 201.96 MB/sec

Linuxのディスクベンチマークはfioが支配的かつ高機能のようなので, これを使ってベンチマークを取ることにしました.

fioはジョブファイルを元にベンチマークを動かします. 私は以下のようなジョブファイルを書きました.

[global]
directory=/home/ncaq/Downloads/
size=100MB

[Read-Seq]
rw=read
stonewall

[Read-Random]
rw=randread
stonewall

[Write-Seq]
rw=write
stonewall

[Write-Random]
rw=randwrite
stonewall

本来はメモリ容量の32GBをはみ出す量のファイルサイズを指定するのが望ましいのでしょうが, 実際やってみた場合15分以上かかることがわかったので, 今回は簡易的なベンチマークということで軽量に済ますことにします.

キャッシュが有効になったベンチマークテストという都合上, 2回実行します. 2回実行するのはbcacheが有効になった時だけで良いと思うのですが, 公平にするためにbcache無効時も2回実行します.

fioベンチマークが作るファイルはそれぞれのテスト開始時に消去します.

としてbcacheからキャッシュデバイスを開放します.

2018-01-17T21:23:06 ncaq@strawberry/pts/0(0) ~
% sudo zsh -c 'echo a2be69aa-d1f5-4b93-91e3-2ea54ecec596 > /sys/block/bcache0/bcache/detach'
2018-01-17T21:23:10 ncaq@strawberry/pts/0(0) ~
% sudo zsh -c 'echo 1 /sys/fs/bcache/85f27d01-75f9-4f1e-9497-7d35f0f5c9ab/stop'
1 /sys/fs/bcache/85f27d01-75f9-4f1e-9497-7d35f0f5c9ab/stop

再起動します.

fioを実行します.

1回目

Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes
Read-Seq: Laying out IO file (1 file / 100MiB)
Read-Random: Laying out IO file (1 file / 100MiB)
Write-Seq: Laying out IO file (1 file / 100MiB)
Write-Random: Laying out IO file (1 file / 100MiB)

Read-Seq: (groupid=0, jobs=1): err= 0: pid=5644: Thu Jan 18 00:13:57 2018
   read: IOPS=44.4k, BW=174MiB/s (182MB/s)(100MiB/576msec)
    clat (nsec): min=329, max=27802k, avg=22173.03, stdev=364819.28
     lat (nsec): min=360, max=27802k, avg=22210.91, stdev=364819.69
    clat percentiles (nsec):
     |  1.00th=[    342],  5.00th=[    350], 10.00th=[    350],
     | 20.00th=[    362], 30.00th=[    362], 40.00th=[    370],
     | 50.00th=[    422], 60.00th=[    502], 70.00th=[   1976],
     | 80.00th=[   2544], 90.00th=[   3568], 95.00th=[   3696],
     | 99.00th=[   4256], 99.50th=[   5664], 99.90th=[5275648],
     | 99.95th=[5341184], 99.99th=[9502720]
  lat (nsec)   : 500=59.75%, 750=2.77%, 1000=0.03%
  lat (usec)   : 2=8.78%, 4=24.54%, 10=3.73%, 20=0.01%, 250=0.01%
  lat (msec)   : 4=0.01%, 10=0.39%, 50=0.01%
  cpu          : usr=0.00%, sys=6.61%, ctx=9494, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5645: Thu Jan 18 00:13:57 2018
   read: IOPS=340, BW=1360KiB/s (1393kB/s)(100MiB/75277msec)
    clat (usec): min=41, max=68349, avg=2938.93, stdev=3326.52
     lat (usec): min=41, max=68349, avg=2939.06, stdev=3326.53
    clat percentiles (usec):
     |  1.00th=[   46],  5.00th=[   50], 10.00th=[   52], 20.00th=[   55],
     | 30.00th=[   57], 40.00th=[   61], 50.00th=[ 1778], 60.00th=[ 3359],
     | 70.00th=[ 4883], 80.00th=[ 6456], 90.00th=[ 8094], 95.00th=[ 8848],
     | 99.00th=[ 9503], 99.50th=[ 9634], 99.90th=[10421], 99.95th=[16057],
     | 99.99th=[60556]
   bw (  KiB/s): min=   76, max= 4434, per=77.43%, avg=1053.10, stdev=462.58, samples=150
   iops        : min=   19, max= 1108, avg=262.89, stdev=115.63, samples=150
  lat (usec)   : 50=5.95%, 100=40.18%, 250=0.62%, 500=0.06%, 750=0.05%
  lat (usec)   : 1000=0.05%
  lat (msec)   : 2=4.47%, 4=12.87%, 10=35.56%, 20=0.16%, 50=0.02%
  lat (msec)   : 100=0.01%
  cpu          : usr=0.09%, sys=0.29%, ctx=25637, majf=0, minf=7
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5652: Thu Jan 18 00:13:57 2018
  write: IOPS=512k, BW=2000MiB/s (2097MB/s)(100MiB/50msec)
    clat (nsec): min=1319, max=18839, avg=1591.17, stdev=256.83
     lat (nsec): min=1360, max=19259, avg=1637.72, stdev=258.66
    clat percentiles (nsec):
     |  1.00th=[ 1368],  5.00th=[ 1432], 10.00th=[ 1448], 20.00th=[ 1464],
     | 30.00th=[ 1480], 40.00th=[ 1496], 50.00th=[ 1496], 60.00th=[ 1528],
     | 70.00th=[ 1624], 80.00th=[ 1752], 90.00th=[ 1832], 95.00th=[ 1928],
     | 99.00th=[ 2288], 99.50th=[ 2512], 99.90th=[ 4080], 99.95th=[ 4512],
     | 99.99th=[ 8096]
  lat (usec)   : 2=96.18%, 4=3.71%, 10=0.11%, 20=0.01%
  cpu          : usr=32.65%, sys=63.27%, ctx=2, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5653: Thu Jan 18 00:13:57 2018
  write: IOPS=434k, BW=1695MiB/s (1777MB/s)(100MiB/59msec)
    clat (nsec): min=1440, max=20439, avg=1870.60, stdev=286.28
     lat (nsec): min=1490, max=20679, avg=1915.74, stdev=286.88
    clat percentiles (nsec):
     |  1.00th=[ 1544],  5.00th=[ 1592], 10.00th=[ 1624], 20.00th=[ 1688],
     | 30.00th=[ 1736], 40.00th=[ 1784], 50.00th=[ 1832], 60.00th=[ 1864],
     | 70.00th=[ 1928], 80.00th=[ 2008], 90.00th=[ 2160], 95.00th=[ 2256],
     | 99.00th=[ 2672], 99.50th=[ 2896], 99.90th=[ 4640], 99.95th=[ 5344],
     | 99.99th=[ 6240]
  lat (usec)   : 2=79.58%, 4=20.19%, 10=0.22%, 50=0.01%
  cpu          : usr=18.97%, sys=79.31%, ctx=1, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=174MiB/s (182MB/s), 174MiB/s-174MiB/s (182MB/s-182MB/s), io=100MiB (105MB), run=576-576msec

Run status group 1 (all jobs):
   READ: bw=1360KiB/s (1393kB/s), 1360KiB/s-1360KiB/s (1393kB/s-1393kB/s), io=100MiB (105MB), run=75277-75277msec

Run status group 2 (all jobs):
  WRITE: bw=2000MiB/s (2097MB/s), 2000MiB/s-2000MiB/s (2097MB/s-2097MB/s), io=100MiB (105MB), run=50-50msec

Run status group 3 (all jobs):
  WRITE: bw=1695MiB/s (1777MB/s), 1695MiB/s-1695MiB/s (1777MB/s-1777MB/s), io=100MiB (105MB), run=59-59msec

2回目

Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes

Read-Seq: (groupid=0, jobs=1): err= 0: pid=5689: Thu Jan 18 00:15:46 2018
   read: IOPS=44.0k, BW=176MiB/s (184MB/s)(100MiB/569msec)
    clat (nsec): min=320, max=22764k, avg=21709.63, stdev=337119.57
     lat (nsec): min=350, max=22764k, avg=21747.85, stdev=337120.98
    clat percentiles (nsec):
     |  1.00th=[    350],  5.00th=[    370], 10.00th=[    390],
     | 20.00th=[    524], 30.00th=[   1752], 40.00th=[   1784],
     | 50.00th=[   1816], 60.00th=[   1864], 70.00th=[   2192],
     | 80.00th=[   2512], 90.00th=[   2928], 95.00th=[   3344],
     | 99.00th=[   4016], 99.50th=[   6048], 99.90th=[5079040],
     | 99.95th=[5210112], 99.99th=[8454144]
  lat (nsec)   : 500=19.20%, 750=1.74%, 1000=0.06%
  lat (usec)   : 2=46.78%, 4=31.15%, 10=0.66%, 20=0.01%, 750=0.01%
  lat (msec)   : 4=0.01%, 10=0.39%, 20=0.01%, 50=0.01%
  cpu          : usr=1.41%, sys=7.57%, ctx=20129, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5690: Thu Jan 18 00:15:46 2018
   read: IOPS=333, BW=1334KiB/s (1366kB/s)(100MiB/76758msec)
    clat (usec): min=49, max=74409, avg=2995.94, stdev=3265.48
     lat (usec): min=49, max=74410, avg=2996.14, stdev=3265.49
    clat percentiles (usec):
     |  1.00th=[   56],  5.00th=[   64], 10.00th=[   71], 20.00th=[   86],
     | 30.00th=[  221], 40.00th=[  255], 50.00th=[ 1827], 60.00th=[ 3392],
     | 70.00th=[ 4883], 80.00th=[ 6521], 90.00th=[ 8094], 95.00th=[ 8848],
     | 99.00th=[ 9503], 99.50th=[ 9634], 99.90th=[10159], 99.95th=[10421],
     | 99.99th=[42730]
   bw (  KiB/s): min=  752, max= 4468, per=83.07%, avg=1108.12, stdev=358.36, samples=153
   iops        : min=  188, max= 1117, avg=276.62, stdev=89.61, samples=153
  lat (usec)   : 50=0.04%, 100=23.15%, 250=16.02%, 500=7.59%, 750=0.07%
  lat (usec)   : 1000=0.04%
  lat (msec)   : 2=4.26%, 4=12.98%, 10=35.73%, 20=0.11%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu          : usr=0.21%, sys=0.38%, ctx=25713, majf=0, minf=7
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5691: Thu Jan 18 00:15:46 2018
  write: IOPS=474k, BW=1852MiB/s (1942MB/s)(100MiB/54msec)
    clat (nsec): min=1340, max=57846, avg=1554.51, stdev=420.67
     lat (nsec): min=1390, max=57886, avg=1597.23, stdev=420.87
    clat percentiles (nsec):
     |  1.00th=[ 1384],  5.00th=[ 1400], 10.00th=[ 1416], 20.00th=[ 1432],
     | 30.00th=[ 1448], 40.00th=[ 1448], 50.00th=[ 1464], 60.00th=[ 1480],
     | 70.00th=[ 1544], 80.00th=[ 1688], 90.00th=[ 1784], 95.00th=[ 1928],
     | 99.00th=[ 2256], 99.50th=[ 2448], 99.90th=[ 3952], 99.95th=[ 4192],
     | 99.99th=[ 8768]
  lat (usec)   : 2=96.34%, 4=3.57%, 10=0.08%, 20=0.01%, 100=0.01%
  cpu          : usr=11.32%, sys=84.91%, ctx=1, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5692: Thu Jan 18 00:15:46 2018
  write: IOPS=376k, BW=1471MiB/s (1542MB/s)(100MiB/68msec)
    clat (nsec): min=1479, max=72186, avg=1987.54, stdev=538.63
     lat (nsec): min=1529, max=72236, avg=2034.62, stdev=541.44
    clat percentiles (nsec):
     |  1.00th=[ 1608],  5.00th=[ 1688], 10.00th=[ 1736], 20.00th=[ 1784],
     | 30.00th=[ 1832], 40.00th=[ 1864], 50.00th=[ 1912], 60.00th=[ 1976],
     | 70.00th=[ 2040], 80.00th=[ 2160], 90.00th=[ 2288], 95.00th=[ 2480],
     | 99.00th=[ 3024], 99.50th=[ 3344], 99.90th=[ 5024], 99.95th=[ 5664],
     | 99.99th=[ 9664]
  lat (usec)   : 2=64.19%, 4=35.55%, 10=0.25%, 20=0.01%, 100=0.01%
  cpu          : usr=20.90%, sys=77.61%, ctx=2, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=176MiB/s (184MB/s), 176MiB/s-176MiB/s (184MB/s-184MB/s), io=100MiB (105MB), run=569-569msec

Run status group 1 (all jobs):
   READ: bw=1334KiB/s (1366kB/s), 1334KiB/s-1334KiB/s (1366kB/s-1366kB/s), io=100MiB (105MB), run=76758-76758msec

Run status group 2 (all jobs):
  WRITE: bw=1852MiB/s (1942MB/s), 1852MiB/s-1852MiB/s (1942MB/s-1942MB/s), io=100MiB (105MB), run=54-54msec

Run status group 3 (all jobs):
  WRITE: bw=1471MiB/s (1542MB/s), 1471MiB/s-1471MiB/s (1542MB/s-1542MB/s), io=100MiB (105MB), run=68-68msec

bcacheを有効化し直します.

sudo zsh -c 'echo 85f27d01-75f9-4f1e-9497-7d35f0f5c9ab > /sys/block/bcache0/bcache/attach'

ちなみに, 有効にしてからの初回起動はキャッシュヒット率が4%と極めて低く, 動作は遅かったままでした.

1回目

Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes
Read-Seq: Laying out IO file (1 file / 100MiB)
Read-Random: Laying out IO file (1 file / 100MiB)
Write-Seq: Laying out IO file (1 file / 100MiB)
Write-Random: Laying out IO file (1 file / 100MiB)

Read-Seq: (groupid=0, jobs=1): err= 0: pid=5919: Thu Jan 18 00:25:22 2018
   read: IOPS=48.4k, BW=189MiB/s (198MB/s)(100MiB/529msec)
    clat (nsec): min=330, max=9648.3k, avg=20331.72, stdev=308917.80
     lat (nsec): min=369, max=9648.3k, avg=20369.58, stdev=308917.99
    clat percentiles (nsec):
     |  1.00th=[    342],  5.00th=[    342], 10.00th=[    342],
     | 20.00th=[    350], 30.00th=[    362], 40.00th=[    370],
     | 50.00th=[    422], 60.00th=[   1880], 70.00th=[   1992],
     | 80.00th=[   2192], 90.00th=[   3024], 95.00th=[   3664],
     | 99.00th=[   4256], 99.50th=[   6112], 99.90th=[5210112],
     | 99.95th=[5341184], 99.99th=[5406720]
  lat (nsec)   : 500=54.89%, 750=1.60%, 1000=0.02%
  lat (usec)   : 2=13.96%, 4=26.95%, 10=2.17%, 20=0.01%, 250=0.01%
  lat (usec)   : 500=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.38%
  cpu          : usr=2.46%, sys=4.73%, ctx=11034, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5920: Thu Jan 18 00:25:22 2018
   read: IOPS=327, BW=1309KiB/s (1341kB/s)(100MiB/78220msec)
    clat (usec): min=50, max=27403, avg=3053.88, stdev=3402.73
     lat (usec): min=50, max=27403, avg=3054.01, stdev=3402.74
    clat percentiles (usec):
     |  1.00th=[   58],  5.00th=[   64], 10.00th=[   68], 20.00th=[   72],
     | 30.00th=[   76], 40.00th=[   83], 50.00th=[ 1909], 60.00th=[ 3523],
     | 70.00th=[ 5014], 80.00th=[ 6652], 90.00th=[ 8225], 95.00th=[ 8979],
     | 99.00th=[ 9765], 99.50th=[10159], 99.90th=[20579], 99.95th=[24511],
     | 99.99th=[26870]
   bw (  KiB/s): min=   63, max= 4027, per=79.33%, avg=1038.46, stdev=436.57, samples=156
   iops        : min=   15, max= 1006, avg=259.27, stdev=109.12, samples=156
  lat (usec)   : 100=45.19%, 250=1.37%, 500=0.16%, 750=0.03%, 1000=0.06%
  lat (msec)   : 2=3.74%, 4=12.61%, 10=36.24%, 20=0.49%, 50=0.12%
  cpu          : usr=0.13%, sys=0.48%, ctx=28693, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5930: Thu Jan 18 00:25:22 2018
  write: IOPS=522k, BW=2041MiB/s (2140MB/s)(100MiB/49msec)
    clat (nsec): min=1329, max=21979, avg=1578.04, stdev=283.14
     lat (nsec): min=1369, max=22409, avg=1624.35, stdev=284.34
    clat percentiles (nsec):
     |  1.00th=[ 1352],  5.00th=[ 1416], 10.00th=[ 1432], 20.00th=[ 1448],
     | 30.00th=[ 1464], 40.00th=[ 1480], 50.00th=[ 1496], 60.00th=[ 1512],
     | 70.00th=[ 1576], 80.00th=[ 1704], 90.00th=[ 1816], 95.00th=[ 1976],
     | 99.00th=[ 2320], 99.50th=[ 2576], 99.90th=[ 4384], 99.95th=[ 5216],
     | 99.99th=[ 8096]
  lat (usec)   : 2=95.39%, 4=4.48%, 10=0.12%, 20=0.01%, 50=0.01%
  cpu          : usr=16.67%, sys=81.25%, ctx=4, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5931: Thu Jan 18 00:25:22 2018
  write: IOPS=449k, BW=1754MiB/s (1840MB/s)(100MiB/57msec)
    clat (nsec): min=1430, max=19619, avg=1828.22, stdev=289.68
     lat (nsec): min=1470, max=19819, avg=1877.15, stdev=292.75
    clat percentiles (nsec):
     |  1.00th=[ 1528],  5.00th=[ 1576], 10.00th=[ 1608], 20.00th=[ 1656],
     | 30.00th=[ 1688], 40.00th=[ 1736], 50.00th=[ 1768], 60.00th=[ 1800],
     | 70.00th=[ 1864], 80.00th=[ 1976], 90.00th=[ 2128], 95.00th=[ 2256],
     | 99.00th=[ 2640], 99.50th=[ 2928], 99.90th=[ 4768], 99.95th=[ 5408],
     | 99.99th=[ 6240]
  lat (usec)   : 2=81.76%, 4=18.05%, 10=0.19%, 20=0.01%
  cpu          : usr=21.43%, sys=76.79%, ctx=1, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=189MiB/s (198MB/s), 189MiB/s-189MiB/s (198MB/s-198MB/s), io=100MiB (105MB), run=529-529msec

Run status group 1 (all jobs):
   READ: bw=1309KiB/s (1341kB/s), 1309KiB/s-1309KiB/s (1341kB/s-1341kB/s), io=100MiB (105MB), run=78220-78220msec

Run status group 2 (all jobs):
  WRITE: bw=2041MiB/s (2140MB/s), 2041MiB/s-2041MiB/s (2140MB/s-2140MB/s), io=100MiB (105MB), run=49-49msec

Run status group 3 (all jobs):
  WRITE: bw=1754MiB/s (1840MB/s), 1754MiB/s-1754MiB/s (1840MB/s-1840MB/s), io=100MiB (105MB), run=57-57msec

2回目

Read-Seq: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Read-Random: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Seq: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
Write-Random: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.2
Starting 4 processes

Read-Seq: (groupid=0, jobs=1): err= 0: pid=5967: Thu Jan 18 00:25:31 2018
   read: IOPS=48.4k, BW=189MiB/s (198MB/s)(100MiB/529msec)
    clat (nsec): min=300, max=10336k, avg=20169.27, stdev=300075.95
     lat (nsec): min=330, max=10336k, avg=20205.15, stdev=300076.09
    clat percentiles (nsec):
     |  1.00th=[    322],  5.00th=[    362], 10.00th=[    370],
     | 20.00th=[    390], 30.00th=[    470], 40.00th=[   1624],
     | 50.00th=[   1752], 60.00th=[   2448], 70.00th=[   2576],
     | 80.00th=[   3280], 90.00th=[   3376], 95.00th=[   3856],
     | 99.00th=[   4448], 99.50th=[   6048], 99.90th=[5079040],
     | 99.95th=[5210112], 99.99th=[5275648]
  lat (nsec)   : 500=31.52%, 750=5.82%, 1000=0.45%
  lat (usec)   : 2=16.03%, 4=43.84%, 10=1.94%, 20=0.01%, 250=0.01%
  lat (usec)   : 750=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.37%, 20=0.01%
  cpu          : usr=0.57%, sys=9.09%, ctx=15834, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Read-Random: (groupid=1, jobs=1): err= 0: pid=5968: Thu Jan 18 00:25:31 2018
   read: IOPS=38.2k, BW=149MiB/s (156MB/s)(100MiB/671msec)
    clat (usec): min=17, max=8498, avg=25.49, stdev=93.02
     lat (usec): min=17, max=8498, avg=25.54, stdev=93.02
    clat percentiles (usec):
     |  1.00th=[   18],  5.00th=[   18], 10.00th=[   19], 20.00th=[   19],
     | 30.00th=[   19], 40.00th=[   19], 50.00th=[   19], 60.00th=[   20],
     | 70.00th=[   20], 80.00th=[   20], 90.00th=[   23], 95.00th=[   64],
     | 99.00th=[  104], 99.50th=[  115], 99.90th=[  149], 99.95th=[  161],
     | 99.99th=[ 6587]
   bw (  KiB/s): min=152182, max=152182, per=99.72%, avg=152182.00, stdev= 0.00, samples=1
   iops        : min=38045, max=38045, avg=38045.00, stdev= 0.00, samples=1
  lat (usec)   : 20=81.70%, 50=8.73%, 100=8.52%, 250=1.02%
  lat (msec)   : 2=0.01%, 10=0.02%
  cpu          : usr=0.15%, sys=25.52%, ctx=25602, majf=0, minf=7
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=25600,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Seq: (groupid=2, jobs=1): err= 0: pid=5969: Thu Jan 18 00:25:31 2018
  write: IOPS=69.8k, BW=272MiB/s (286MB/s)(100MiB/367msec)
    clat (nsec): min=1220, max=310554k, avg=13629.31, stdev=1940951.49
     lat (nsec): min=1269, max=310555k, avg=13673.71, stdev=1940956.03
    clat percentiles (nsec):
     |  1.00th=[ 1256],  5.00th=[ 1288], 10.00th=[ 1288], 20.00th=[ 1304],
     | 30.00th=[ 1336], 40.00th=[ 1368], 50.00th=[ 1432], 60.00th=[ 1448],
     | 70.00th=[ 1496], 80.00th=[ 1656], 90.00th=[ 1896], 95.00th=[ 2064],
     | 99.00th=[ 2192], 99.50th=[ 2320], 99.90th=[ 3696], 99.95th=[ 4128],
     | 99.99th=[ 7328]
  lat (usec)   : 2=92.11%, 4=7.82%, 10=0.06%
  lat (msec)   : 500=0.01%
  cpu          : usr=1.91%, sys=12.84%, ctx=6, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Write-Random: (groupid=3, jobs=1): err= 0: pid=5970: Thu Jan 18 00:25:31 2018
  write: IOPS=63.4k, BW=248MiB/s (260MB/s)(100MiB/404msec)
    clat (nsec): min=1330, max=330047k, avg=14920.07, stdev=2062778.23
     lat (nsec): min=1380, max=330047k, avg=14965.43, stdev=2062780.01
    clat percentiles (nsec):
     |  1.00th=[ 1464],  5.00th=[ 1560], 10.00th=[ 1656], 20.00th=[ 1752],
     | 30.00th=[ 1816], 40.00th=[ 1880], 50.00th=[ 1944], 60.00th=[ 2008],
     | 70.00th=[ 2096], 80.00th=[ 2224], 90.00th=[ 2416], 95.00th=[ 2576],
     | 99.00th=[ 5280], 99.50th=[ 6176], 99.90th=[ 6688], 99.95th=[ 7008],
     | 99.99th=[ 9280]
  lat (usec)   : 2=59.45%, 4=39.20%, 10=1.35%, 20=0.01%
  lat (msec)   : 500=0.01%
  cpu          : usr=3.23%, sys=14.39%, ctx=6, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,25600,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=189MiB/s (198MB/s), 189MiB/s-189MiB/s (198MB/s-198MB/s), io=100MiB (105MB), run=529-529msec

Run status group 1 (all jobs):
   READ: bw=149MiB/s (156MB/s), 149MiB/s-149MiB/s (156MB/s-156MB/s), io=100MiB (105MB), run=671-671msec

Run status group 2 (all jobs):
  WRITE: bw=272MiB/s (286MB/s), 272MiB/s-272MiB/s (286MB/s-286MB/s), io=100MiB (105MB), run=367-367msec

Run status group 3 (all jobs):
  WRITE: bw=248MiB/s (260MB/s), 248MiB/s-248MiB/s (260MB/s-260MB/s), io=100MiB (105MB), run=404-404msec

それぞれのbwをテーブルにまとめました.

case Read-Seq Read-Random Write-Seq Write-Random
disable-1 174MiB 1360KiB 2000MiB 1695MiB
disable-2 176MiB 1334KiB 1852MiB 1471MiB
enable-1 189MiB 1309KiB 2041MiB 1754MiB
enable-2 189MiB 149MiB 272MiB 248MiB

enable-2のみRead-Randomの単位がKiBではなくMiBになっているのが注目するべきポイントですね. 他のテストはRead-Randomが異様に遅いため, これのせいでかなり時間がかかっていましたが, enable-2はキャッシュが効いているのか即座に終わるようになりました.

しかしWriteテストは逆に性能が落ち込んでいます. Read-Randomが単位レベルで高速化されてるので, それを無視するレベルで早くなっているのですが.

謎に思ったので数回fioを実行してみたのですが, Readのテスト値は安定しているのですが, Writeのテスト値は10倍レベルで変動して, 全く信用できないことがわかりました.

既存のテストファイルを削除してからfioを実行し直してみるとRead-Randomの値はKiBになってしまいますが, Writeの値が1961MiB, 1754MiBと再び高速化しました.

Btrfsの透過圧縮が関係しているのでしょうか? ファイルの内容を乱数か何かで決めたとすると, 内容によって値が変わってくるということでしょうか.

disable-2でもWriteの値が大幅低下していたらならば, Btrfsが上書きに異様に弱いという推定が出来たのですが, これは一体どういうことでしょうか. bcacheのwritebackの性能があまり良くないということなのでしょうか.

Testing The BCache SSD Cache For HDDs On Linux 4.8 - Phoronix を見てもそこまで極端にWriteに弱いという結果は出ていないので謎です.

fioがあまり信用できないので他のツールを探したのですが, Btrfsの圧縮機能に対して実際の値を出してくれるベンチマークツールはありませんでした.

実感としてbcacheを無効化するとアプリケーションの起動速度が目に見えて遅くなるので, bcacheの実際の効果はあるのでしょうけれど, fioベンチマークではあまり振るわない結果となりました.

実アプリケーションに対してそこそこ機能するとしても, やはりSSDを直接使うのには遠く及ばないパフォーマンスのようですね. ファイルを置く場所に悩まなくなるのはとても良いことなのですが, 何もデメリットのない夢の解決策というわけではないようです.