walter

hardware

Two disks are used for the Linux software RAID1: 2x Seagate ST3600057SS Cheetah 15K.7, 600GB, SAS 6Gb/s, 15000RPM (sdc and sdd below).

walter:~# lsscsi
[8:0:0:0]    disk    SEAGATE  ST373455SS       0004  /dev/sda
[8:0:1:0]    disk    SEAGATE  ST373455SS       0004  /dev/sdb 
[8:0:2:0]    disk    SEAGATE  ST3600057SS      0008  /dev/sdc 
[8:0:3:0]    disk    SEAGATE  ST3600057SS      000B  /dev/sdd 

They are set to passthrough mode on the onboard LSI SAS1068E PCI-Express controller (which is a piece of crap, if you really want to know my opinion).

walter:~# smartctl -i /dev/sdc
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               SEAGATE 
Product:              ST3600057SS     
Revision:             0008
User Capacity:        600,127,266,816 bytes [600 GB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c500684a6b1b
Serial number:        6SL6CB0Z0000B3450WS9
Device type:          disk
Transport protocol:   SAS
Local Time is:        Fri Nov 22 13:24:37 2013 EST
Device supports SMART and is Enabled
Temperature Warning Enabled

walter:~# smartctl -i /dev/sdd
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               SEAGATE 
Product:              ST3600057SS     
Revision:             000B
User Capacity:        600,127,266,816 bytes [600 GB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c5003ca55d23
Serial number:        6SL5ZPNT0000N33410SL
Device type:          disk
Transport protocol:   SAS
Local Time is:        Fri Nov 22 13:25:09 2013 EST
Device supports SMART and is Enabled
Temperature Warning Enabled

hdparm

walter:~# for i in $(seq 1 5); do hdparm -tT /dev/sdc; done

/dev/sdc:
 Timing cached reads:   12234 MB in  2.00 seconds = 6123.88 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.12 MB/sec

/dev/sdc:
 Timing cached reads:   13474 MB in  2.00 seconds = 6745.69 MB/sec
 Timing buffered disk reads: 586 MB in  3.01 seconds = 194.90 MB/sec

/dev/sdc:
 Timing cached reads:   13370 MB in  2.00 seconds = 6694.24 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.26 MB/sec

/dev/sdc:
 Timing cached reads:   13232 MB in  2.00 seconds = 6624.72 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.23 MB/sec

/dev/sdc:
 Timing cached reads:   11874 MB in  2.00 seconds = 5944.10 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.02 MB/sec

walter:~# for i in $(seq 1 5); do hdparm -tT /dev/sdd; done

/dev/sdd:
 Timing cached reads:   13092 MB in  2.00 seconds = 6554.64 MB/sec
 Timing buffered disk reads: 592 MB in  3.01 seconds = 196.76 MB/sec

/dev/sdd:
 Timing cached reads:   13438 MB in  2.00 seconds = 6727.67 MB/sec
 Timing buffered disk reads: 592 MB in  3.01 seconds = 196.81 MB/sec

/dev/sdd:
 Timing cached reads:   12336 MB in  2.00 seconds = 6175.49 MB/sec
 Timing buffered disk reads: 590 MB in  3.00 seconds = 196.65 MB/sec

/dev/sdd:
 Timing cached reads:   13220 MB in  2.00 seconds = 6618.50 MB/sec
 Timing buffered disk reads: 592 MB in  3.01 seconds = 196.74 MB/sec

/dev/sdd:
 Timing cached reads:   13308 MB in  2.00 seconds = 6662.68 MB/sec
 Timing buffered disk reads: 590 MB in  3.00 seconds = 196.54 MB/sec
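
The five runs per disk above are easier to compare as a single number. A minimal sketch that averages the "buffered disk reads" figures from hdparm output read on stdin; hdparm_avg_buffered is a made-up helper name, not part of hdparm, and it assumes the output format shown above:

```shell
# Average the "Timing buffered disk reads" throughput from hdparm -tT output.
# Hypothetical helper: reads hdparm output on stdin, prints the mean in MB/sec.
hdparm_avg_buffered() {
    awk '/Timing buffered disk reads/ {
             # The throughput is the second-to-last field, e.g. "195.12 MB/sec"
             sum += $(NF - 1); n++
         }
         END { if (n > 0) printf "%.2f MB/sec over %d runs\n", sum / n, n }'
}
```

It can be fed directly from the loop, e.g. for i in $(seq 1 5); do hdparm -tT /dev/sdc; done | hdparm_avg_buffered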

Linux Software RAID1

Construct the mirror from the two partitions sdc1 and sdd1:

Disk /dev/sdc: 600.1 GB, 600127266816 bytes
81 heads, 63 sectors/track, 229693 cylinders, total 1172123568 sectors

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048  1172123567   586060760   fd  Linux raid autodetect

Disk /dev/sdd: 600.1 GB, 600127266816 bytes
255 heads, 63 sectors/track, 72961 cylinders, total 1172123568 sectors

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1            2048  1172123567   586060760   fd  Linux raid autodetect

mdadm --create --verbose /dev/md2 --level 1 --raid-devices=2 /dev/sdc1 /dev/sdd1

Throughput

TEST_DEVICE=/dev/md2

for i in $(seq 10); do
  dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
done
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.93126 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.95495 s, 182 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.96159 s, 181 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.9579 s, 182 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.94575 s, 182 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.93007 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.92994 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.92594 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.94176 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.94557 s, 182 MB/s

Latency

TEST_DEVICE=/dev/md2
for i in `seq 1 10`; do
    dd if=/dev/zero of=$TEST_DEVICE bs=512 count=10000 oflag=direct
done

10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.675977 s, 7.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.770088 s, 6.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.731074 s, 7.0 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.798559 s, 6.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.746383 s, 6.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.776625 s, 6.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.74706 s, 6.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.744349 s, 6.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.80117 s, 6.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.782879 s, 6.5 MB/s

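From the DRBD doc, verbatim: "It is important to understand that throughput measurements generated by dd are completely irrelevant for this test; what is important is the time elapsed during the completion of said 1,000 writes. Dividing this time by 1,000 gives the average latency of a single sector write."

Note that the runs above use count=10000 rather than 1,000, so here the elapsed time has to be divided by 10,000. The first run gives 0.675977 s / 10,000, which is roughly 68 microseconds per 512-byte write.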
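
Since the dd runs above write 10000 sectors each, the per-write latency is the elapsed time divided by 10000. A small sketch of that arithmetic, assuming the GNU dd summary format shown above; dd_latency_us is a made-up helper name:

```shell
# Convert dd summary lines into an average latency per write, in microseconds.
# Hypothetical helper. Usage: dd_latency_us COUNT  (COUNT = the count= given to dd)
dd_latency_us() {
    awk -v count="$1" '/copied,/ {
            # Summary looks like "... copied, 0.675977 s, 7.6 MB/s":
            # the elapsed seconds are the field just before the "s," token.
            for (i = 1; i <= NF; i++)
                if ($i == "s,")
                    printf "%.1f us per write\n", $(i - 1) / count * 1e6
        }'
}
```

dd prints its summary on stderr, so the loop output has to be redirected before piping, e.g. dd if=/dev/zero of=$TEST_DEVICE bs=512 count=10000 oflag=direct 2>&1 | dd_latency_us 10000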


Filesystem

Tests for two XFS filesystems on /dev/md2: one created with the default built-in values, and the other created with explicit stripe settings (su/sw) to reflect that the device is actually built out of 2 disks.

Bonnie++ stats

mkfs.xfs -f /dev/md2

walter:~/bonnie# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b
Using uid:1120, gid:200.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...

Version      1.96   ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter         320G           184927  17 84987  10           192036  12 259.0  24
Latency                         146ms     163ms             20511us     125ms

Another run, this time with the file creation tests enabled (-n 10) and without -f, so the per-char IO tests run too. From the manpage, about the option -f size-for-char-io:

-f size-for-char-io
     fast mode control, skips per-char IO tests if no parameter, otherwise specifies the size of the tests for per-char IO tests (default 20M).

walter:~# bonnie++ -d /0 -s 70G -u malin:pet -n 10 -b

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter          70G  1392  98 194179  17 89058  10  2653  97 203437  13 378.9  29
Latency              5969us   14743us     274ms   10531us   33523us     118ms
Version  1.96       ------Sequential Create------ --------Random Create--------
walter              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 10   228   2 +++++ +++   228   2   228   2 +++++ +++   227   2
Latency             48159us     154us   40058us   35547us      10us   77927us

mkfs.xfs -d su=256k,sw=1 -l version=2,su=256k /dev/md2

walter:~# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b

Version      1.96   ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter         320G           185656  16 85433  10           197649  13 252.9  26
Latency                       40940us     262ms             20822us     117ms

walter:~# bonnie++ -d /0 -s 70G -u malin:pet -n 0 -f -b

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter          70G           193336  17 89251  10           201082  12 488.0   9
Latency                       14732us     170ms             13307us     150ms
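
To put the two mkfs.xfs variants side by side, the relative change between two bonnie++ figures can be computed directly. A minimal sketch; pct_change is a made-up helper name and takes the default-mkfs value first, then the su/sw value:

```shell
# Relative change from a baseline value to a new value, in percent.
# Hypothetical helper. Usage: pct_change BASELINE NEW
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (a == 0) { print "baseline is zero"; exit 1 }
        # Positive means NEW is higher than BASELINE
        printf "%+.2f%%\n", (b - a) / a * 100
    }'
}
```

For the 320G block write results above that would be pct_change 184927 185656, and for the 320G block read results pct_change 192036 197649.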

curtis

hardware

The system disks are old HP ProLiant GB0750C4414 750GB Hot-Plug SATA 1.5Gb/s hard drives, 7200 RPM, 3.5 inch, with a transfer rate of 150MB/s. Old stuff.
The 2 disks that will compose the RAID1 are Seagate Barracuda ES.2 1TB SATA hard disks (ST31000340NS):

curtis:~# lsscsi
[6:0:1:0]    cd/dvd  TEAC     DV-28E-V         1.AB  /dev/sr0 
[8:0:0:0]    disk    ATA      GB0750C4414      HPG3  /dev/sda 
[8:0:1:0]    disk    ATA      GB0750C4414      HPG3  /dev/sdb 
[8:0:2:0]    disk    ATA      ST31000340NS     SN04  /dev/sdc 
[8:0:3:0]    disk    ATA      ST31000340NS     SN05  /dev/sdd 

curtis:~# smartctl -i /dev/sdc
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===

Model Family:     Seagate Barracuda ES.2
Device Model:     ST31000340NS
Serial Number:    3QJ09VAA
LU WWN Device Id: 5 000c50 00a106b89
Firmware Version: SN04
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Fri Nov 22 13:10:40 2013 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207963

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

curtis:~# smartctl -i /dev/sdd
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES.2
Device Model:     ST31000340NS
Serial Number:    9QJ1L7BW
LU WWN Device Id: 5 000c50 00d902450
Firmware Version: SN05
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Fri Nov 22 13:19:29 2013 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207963

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

The disk with SN 3QJ09VAA (sdc above) is a refurbished one.

hdparm

curtis:~# for i in $(seq 1 5); do hdparm -tT /dev/sdc; done

/dev/sdc:
 Timing cached reads:   11800 MB in  2.00 seconds = 5906.57 MB/sec
 Timing buffered disk reads: 318 MB in  3.02 seconds = 105.14 MB/sec

/dev/sdc:
 Timing cached reads:   11758 MB in  2.00 seconds = 5885.43 MB/sec
 Timing buffered disk reads: 318 MB in  3.01 seconds = 105.60 MB/sec

/dev/sdc:
 Timing cached reads:   13196 MB in  2.00 seconds = 6606.76 MB/sec
 Timing buffered disk reads: 318 MB in  3.01 seconds = 105.48 MB/sec

/dev/sdc:
 Timing cached reads:   12500 MB in  2.00 seconds = 6257.93 MB/sec
 Timing buffered disk reads: 316 MB in  3.02 seconds = 104.76 MB/sec

/dev/sdc:
 Timing cached reads:   12548 MB in  2.00 seconds = 6281.66 MB/sec
 Timing buffered disk reads: 324 MB in  3.01 seconds = 107.74 MB/sec

Doing the same for sdd makes the system barf. Either the disk is bad or the backplane is misbehaving. I’m restarting the system without the Seagates and will test them one at a time. One is a refurbished disk so…

After some tests, it turns out that disk SN 9QJ1L7BW (sdd above, not the refurbished one) is bad. It was swapped for a new Seagate Barracuda ES.2:

curtis:~# smartctl -i /dev/sdd
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES.2
Device Model:     ST31000340NS
Serial Number:    9QJ2RJSV
LU WWN Device Id: 5 000c50 03cde3cd1
Firmware Version: SN06
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Fri Nov 22 14:19:39 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

curtis:~# for i in $(seq 1 5); do hdparm -tT /dev/sdd; done

/dev/sdd:
 Timing cached reads:   11764 MB in  2.00 seconds = 5888.81 MB/sec
 Timing buffered disk reads: 308 MB in  3.00 seconds = 102.66 MB/sec

/dev/sdd:
 Timing cached reads:   12594 MB in  2.00 seconds = 6304.25 MB/sec
 Timing buffered disk reads: 310 MB in  3.01 seconds = 103.15 MB/sec

/dev/sdd:
 Timing cached reads:   12634 MB in  2.00 seconds = 6325.03 MB/sec
 Timing buffered disk reads: 310 MB in  3.01 seconds = 102.89 MB/sec

/dev/sdd:
 Timing cached reads:   13398 MB in  2.00 seconds = 6707.81 MB/sec
 Timing buffered disk reads: 310 MB in  3.01 seconds = 102.91 MB/sec

/dev/sdd:
 Timing cached reads:   12730 MB in  2.00 seconds = 6372.54 MB/sec
 Timing buffered disk reads: 306 MB in  3.01 seconds = 101.70 MB/sec

Linux Software RAID1

curtis:~# sfdisk -d /dev/sdc | sfdisk /dev/sdd

curtis:~# mdadm --create --verbose /dev/md2 --level 1 --raid-devices=2 /dev/sdc1 /dev/sdd1

curtis:~# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md2 : active (auto-read-only) raid1 sdd1[1] sdc1[0]
      976630336 blocks super 1.2 [2/2] [UU]
        resync=PENDING

md1 : active raid1 sda6[0] sdb6[1]
      726192960 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      204608 blocks super 1.2 [2/2] [UU]

unused devices: <none>
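
The mdstat output above shows md2 as auto-read-only with resync=PENDING, meaning the initial sync of the mirror has not started yet. A small sketch to spot arrays in that state, assuming the /proc/mdstat layout shown above; md_pending is a made-up helper name:

```shell
# List md arrays whose resync is still pending, from /proc/mdstat text on stdin.
# Hypothetical helper: prints one array name per line, nothing if none are pending.
md_pending() {
    awk '/^md[0-9]+ :/   { dev = $1 }      # remember the array this stanza belongs to
         /resync=PENDING/ { print dev }'   # the status line follows the array header
}
```

Typical use is md_pending < /proc/mdstat. As far as I know, an auto-read-only array switches to read-write on the first write (or with mdadm --readwrite /dev/md2), and the resync starts at that point.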

Throughput

curtis:~# for i in $(seq 10); do dd if=/dev/zero of=/dev/md2 bs=512M count=1 oflag=direct; done
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.58668 s, 96.1 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.59314 s, 96.0 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.54269 s, 96.9 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.45947 s, 98.3 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.55115 s, 96.7 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.58486 s, 96.1 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.50936 s, 97.4 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.59953 s, 95.9 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.50249 s, 97.6 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.54276 s, 96.9 MB/s

Latency

TEST_DEVICE=/dev/md2
for i in `seq 1 10`; do
    dd if=/dev/zero of=$TEST_DEVICE bs=512 count=10000 oflag=direct
done

10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.30388 s, 3.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.45138 s, 3.5 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.50447 s, 3.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.35231 s, 3.8 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.36926 s, 3.7 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.41763 s, 3.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.64202 s, 3.1 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.44648 s, 3.5 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.51445 s, 3.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.59204 s, 3.2 MB/s

Filesystem

Tests for two XFS filesystems on /dev/md2: one created with the default built-in values, and the other created with explicit stripe settings (su/sw) to reflect that the device is actually built out of 2 disks.

Bonnie++ stats

mkfs.xfs -f /dev/md2

curtis:~# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis         320G           96669   8 45616   5           106159   7 117.3  13
Latency                         190ms     544ms               617ms     281ms

curtis:~# bonnie++ -d /0 -s 70G -u malin:pet -n 10 -b

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis          70G  1339  96 98427   8 47280   5  2572  96 108585   6 167.2  13
Latency              5939us   15708us     435ms   11085us     138ms     295ms
Version  1.96       ------Sequential Create------ --------Random Create--------
curtis              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 10    82   0 +++++ +++    82   0    84   0 +++++ +++    84   0
Latency               108ms     101us     102ms     108ms      10us     155ms

mkfs.xfs -d su=256k,sw=1 -l version=2,su=256k /dev/md2

curtis:~# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b
Using uid:1120, gid:200.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis         320G           97176   8 45935   5           111214   7 121.4  12
Latency                       98164us    1666ms               464ms     289ms

curtis:~# bonnie++ -d /0 -s 70G -u malin:pet -n 10 -b

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis          70G  1351  95 98449   8 47192   5  2525  95 111391   7 168.0  13
Latency              5851us   15707us     608ms   11945us     340ms     253ms
Version  1.96       ------Sequential Create------ --------Random Create--------
curtis              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 10    64   1 +++++ +++    65   1    65   1 +++++ +++    63   1
Latency             96549us      99us     169ms     120ms      10us     350ms