walter
hardware
Two disks are used for the Linux software RAID1: 2x Seagate ST3600057SS Cheetah 15K.7 (600GB, SAS 6Gb/s, 15000RPM), visible below as /dev/sdc and /dev/sdd.
[8:0:0:0]    disk    SEAGATE  ST373455SS       0004  /dev/sda
[8:0:1:0]    disk    SEAGATE  ST373455SS       0004  /dev/sdb
[8:0:2:0]    disk    SEAGATE  ST3600057SS      0008  /dev/sdc
[8:0:3:0]    disk    SEAGATE  ST3600057SS      000B  /dev/sdd
They are set to passthrough mode on the onboard LSI SAS1068E PCI-Express controller (which is a piece of crap, if you really want to know my opinion).
walter:~# smartctl -i /dev/sdc
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               SEAGATE
Product:              ST3600057SS
Revision:             0008
User Capacity:        600,127,266,816 bytes [600 GB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c500684a6b1b
Serial number:        6SL6CB0Z0000B3450WS9
Device type:          disk
Transport protocol:   SAS
Local Time is:        Fri Nov 22 13:24:37 2013 EST
Device supports SMART and is Enabled
Temperature Warning Enabled

walter:~# smartctl -i /dev/sdd
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               SEAGATE
Product:              ST3600057SS
Revision:             000B
User Capacity:        600,127,266,816 bytes [600 GB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c5003ca55d23
Serial number:        6SL5ZPNT0000N33410SL
Device type:          disk
Transport protocol:   SAS
Local Time is:        Fri Nov 22 13:25:09 2013 EST
Device supports SMART and is Enabled
Temperature Warning Enabled
hdparm
walter:~# for i in $(seq 1 5); do hdparm -tT /dev/sdc; done

/dev/sdc:
 Timing cached reads:   12234 MB in  2.00 seconds = 6123.88 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.12 MB/sec

/dev/sdc:
 Timing cached reads:   13474 MB in  2.00 seconds = 6745.69 MB/sec
 Timing buffered disk reads: 586 MB in  3.01 seconds = 194.90 MB/sec

/dev/sdc:
 Timing cached reads:   13370 MB in  2.00 seconds = 6694.24 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.26 MB/sec

/dev/sdc:
 Timing cached reads:   13232 MB in  2.00 seconds = 6624.72 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.23 MB/sec

/dev/sdc:
 Timing cached reads:   11874 MB in  2.00 seconds = 5944.10 MB/sec
 Timing buffered disk reads: 586 MB in  3.00 seconds = 195.02 MB/sec

walter:~# for i in $(seq 1 5); do hdparm -tT /dev/sdd; done

/dev/sdd:
 Timing cached reads:   13092 MB in  2.00 seconds = 6554.64 MB/sec
 Timing buffered disk reads: 592 MB in  3.01 seconds = 196.76 MB/sec

/dev/sdd:
 Timing cached reads:   13438 MB in  2.00 seconds = 6727.67 MB/sec
 Timing buffered disk reads: 592 MB in  3.01 seconds = 196.81 MB/sec

/dev/sdd:
 Timing cached reads:   12336 MB in  2.00 seconds = 6175.49 MB/sec
 Timing buffered disk reads: 590 MB in  3.00 seconds = 196.65 MB/sec

/dev/sdd:
 Timing cached reads:   13220 MB in  2.00 seconds = 6618.50 MB/sec
 Timing buffered disk reads: 592 MB in  3.01 seconds = 196.74 MB/sec

/dev/sdd:
 Timing cached reads:   13308 MB in  2.00 seconds = 6662.68 MB/sec
 Timing buffered disk reads: 590 MB in  3.00 seconds = 196.54 MB/sec
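To compare disks at a glance, the five buffered-read figures per disk can be averaged. The helper below is only a sketch: the function name mean_buffered_read is made up for this page, and it assumes the hdparm output keeps the "Timing buffered disk reads: ... = N MB/sec" layout shown above.

```shell
# Hypothetical helper (not part of hdparm): reads `hdparm -tT` output on stdin
# and prints the mean of the "Timing buffered disk reads" rates in MB/sec.
mean_buffered_read() {
    awk '
        # The rate is the next-to-last field: "... = 195.12 MB/sec"
        /Timing buffered disk reads/ { sum += $(NF - 1); n++ }
        # Print nothing when no matching line was seen
        END { if (n) printf "%.2f\n", sum / n }
    '
}
```

Usage: for i in $(seq 1 5); do hdparm -tT /dev/sdc; done | mean_buffered_read. Fed the five /dev/sdc runs above, it prints 195.11.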
Linux Software RAID1
Construct the mirror from the two partitions sdc1 and sdd1:
Disk /dev/sdc: 600.1 GB, 600127266816 bytes
81 heads, 63 sectors/track, 229693 cylinders, total 1172123568 sectors

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048  1172123567   586060760   fd  Linux raid autodetect

Disk /dev/sdd: 600.1 GB, 600127266816 bytes
255 heads, 63 sectors/track, 72961 cylinders, total 1172123568 sectors

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1            2048  1172123567   586060760   fd  Linux raid autodetect
mdadm --create --verbose /dev/md2 --level 1 --raid-devices=2 /dev/sdc1 /dev/sdd1
Throughput:
TEST_DEVICE=/dev/md2
for i in $(seq 10); do
    dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
done

1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.93126 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.95495 s, 182 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.96159 s, 181 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.9579 s, 182 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.94575 s, 182 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.93007 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.92994 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.92594 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.94176 s, 183 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 2.94557 s, 182 MB/s
Latency
TEST_DEVICE=/dev/md2
for i in `seq 1 10`; do
    dd if=/dev/zero of=$TEST_DEVICE bs=512 count=10000 oflag=direct
done

10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.675977 s, 7.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.770088 s, 6.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.731074 s, 7.0 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.798559 s, 6.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.746383 s, 6.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.776625 s, 6.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.74706 s, 6.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.744349 s, 6.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.80117 s, 6.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.782879 s, 6.5 MB/s
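From the DRBD doc, verbatim: "It is important to understand that throughput measurements generated by dd are completely irrelevant for this test; what is important is the time elapsed during the completion of said 1,000 writes. Dividing this time by 1,000 gives the average latency of a single sector write."
Note that the runs above use count=10000, not 1,000, so the elapsed time has to be divided by 10,000. The first run gives 0.675977 s / 10,000, or about 68 µs per 512-byte write.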
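The division described in the DRBD quote can be sketched as a small shell helper. The name avg_latency_us is an assumption made for this page; it takes the elapsed seconds reported by dd and the count= value used for the run.

```shell
# Hypothetical helper: average latency of one write, in microseconds.
#   $1 = elapsed time reported by dd, in seconds
#   $2 = number of writes (the count= value passed to dd)
avg_latency_us() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.1f\n", t / n * 1e6 }'
}
```

For example, avg_latency_us 0.675977 10000 prints 67.6, the per-write latency in microseconds for the first run on walter.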
Filesystem
Two XFS filesystems are tested: one created with the mkfs.xfs defaults, and one created with explicit stripe geometry to tell XFS that the device is actually built out of 2 disks.
Bonnie++ stats
mkfs.xfs -f /dev/md2
walter:~/bonnie# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b
Using uid:1120, gid:200.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter         320G           184927 17 84987  10           192036 12 259.0  24
Latency                           146ms     163ms             20511us     125ms
Another run, this time with the file creation tests enabled (-n 10) and without -f, so the per-character I/O tests run as well. From the manpage, about the option -f size-for-char-io:
-f size-for-char-io
    fast mode control, skips per-char IO tests if no parameter, otherwise specifies the size of the tests for per-char IO tests (default 20M).
walter:~# bonnie++ -d /0 -s 70G -u malin:pet -n 10 -b
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter          70G  1392  98 194179 17 89058  10  2653  97 203437 13 378.9  29
Latency                5969us   14743us     274ms   10531us   33523us     118ms
Version  1.96       ------Sequential Create------ --------Random Create--------
walter              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 10   228   2 +++++ +++   228   2   228   2 +++++ +++   227   2
Latency               48159us     154us   40058us   35547us      10us   77927us
mkfs.xfs -d su=256k,sw=1 -l version=2,su=256k /dev/md2
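As a cross-check on the geometry passed above: per the mkfs.xfs manpage, su is given in bytes and sw is a multiplier of the stripe unit, while the equivalent sunit and swidth values are expressed in 512-byte blocks. The conversion can be sketched as follows; the helper name xfs_stripe_sectors is an assumption made for this page and is not an XFS tool.

```shell
# Hypothetical helper: convert an su/sw pair into sunit/swidth values
# expressed in 512-byte blocks.
#   $1 = stripe unit (su) in KiB
#   $2 = stripe width multiplier (sw)
xfs_stripe_sectors() {
    sunit=$(( $1 * 1024 / 512 ))   # stripe unit in 512-byte blocks
    swidth=$(( sunit * $2 ))       # stripe width = stripe unit x multiplier
    echo "sunit=$sunit swidth=$swidth"
}
```

With the values used here, xfs_stripe_sectors 256 1 prints sunit=512 swidth=512.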
walter:~# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter         320G           185656 16 85433  10           197649 13 252.9  26
Latency                         40940us     262ms             20822us     117ms

walter:~# bonnie++ -d /0 -s 70G -u malin:pet -n 0 -f -b
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
walter          70G           193336 17 89251  10           201082 12 488.0   9
Latency                         14732us     170ms             13307us     150ms
curtis
hardware
The system disks are old HP ProLiant GB0750C4414 750GB Hot-Plug SATA 1.5Gb/s hard drives, 7200 RPM, 3.5 inch, with a transfer rate of 150MB/s. Old stuff.
The 2 disks that will compose the RAID1 are Seagate Barracuda ES.2 1000GB / 1TB SATA hard disks (ST31000340NS):
curtis:~# lsscsi
[6:0:1:0]    cd/dvd  TEAC     DV-28E-V         1.AB  /dev/sr0
[8:0:0:0]    disk    ATA      GB0750C4414      HPG3  /dev/sda
[8:0:1:0]    disk    ATA      GB0750C4414      HPG3  /dev/sdb
[8:0:2:0]    disk    ATA      ST31000340NS     SN04  /dev/sdc
[8:0:3:0]    disk    ATA      ST31000340NS     SN05  /dev/sdd

curtis:~# smartctl -i /dev/sdc
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES.2
Device Model:     ST31000340NS
Serial Number:    3QJ09VAA
LU WWN Device Id: 5 000c50 00a106b89
Firmware Version: SN04
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Fri Nov 22 13:10:40 2013 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207963

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

curtis:~# smartctl -i /dev/sdd
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES.2
Device Model:     ST31000340NS
Serial Number:    9QJ1L7BW
LU WWN Device Id: 5 000c50 00d902450
Firmware Version: SN05
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Fri Nov 22 13:19:29 2013 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207963

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
The disk with SN 3QJ09VAA (sdc above) is a refurbished one.
hdparm
curtis:~# for i in $(seq 1 5); do hdparm -tT /dev/sdc; done

/dev/sdc:
 Timing cached reads:   11800 MB in  2.00 seconds = 5906.57 MB/sec
 Timing buffered disk reads: 318 MB in  3.02 seconds = 105.14 MB/sec

/dev/sdc:
 Timing cached reads:   11758 MB in  2.00 seconds = 5885.43 MB/sec
 Timing buffered disk reads: 318 MB in  3.01 seconds = 105.60 MB/sec

/dev/sdc:
 Timing cached reads:   13196 MB in  2.00 seconds = 6606.76 MB/sec
 Timing buffered disk reads: 318 MB in  3.01 seconds = 105.48 MB/sec

/dev/sdc:
 Timing cached reads:   12500 MB in  2.00 seconds = 6257.93 MB/sec
 Timing buffered disk reads: 316 MB in  3.02 seconds = 104.76 MB/sec

/dev/sdc:
 Timing cached reads:   12548 MB in  2.00 seconds = 6281.66 MB/sec
 Timing buffered disk reads: 324 MB in  3.01 seconds = 107.74 MB/sec
Doing the same for sdd makes the system barf. Either the disk is bad or the backplane is misbehaving. I’m restarting the system without the Seagates and will test them one at a time. One is a refurbished disk so…
After some tests, turns out that disk SN 9QJ1L7BW is bad. Swapped for a new Seagate Barracuda ES.2:
curtis:~# smartctl -i /dev/sdd
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.13-i686-64-smp] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES.2
Device Model:     ST31000340NS
Serial Number:    9QJ2RJSV
LU WWN Device Id: 5 000c50 03cde3cd1
Firmware Version: SN06
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Fri Nov 22 14:19:39 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

curtis:~# for i in $(seq 1 5); do hdparm -tT /dev/sdd; done

/dev/sdd:
 Timing cached reads:   11764 MB in  2.00 seconds = 5888.81 MB/sec
 Timing buffered disk reads: 308 MB in  3.00 seconds = 102.66 MB/sec

/dev/sdd:
 Timing cached reads:   12594 MB in  2.00 seconds = 6304.25 MB/sec
 Timing buffered disk reads: 310 MB in  3.01 seconds = 103.15 MB/sec

/dev/sdd:
 Timing cached reads:   12634 MB in  2.00 seconds = 6325.03 MB/sec
 Timing buffered disk reads: 310 MB in  3.01 seconds = 102.89 MB/sec

/dev/sdd:
 Timing cached reads:   13398 MB in  2.00 seconds = 6707.81 MB/sec
 Timing buffered disk reads: 310 MB in  3.01 seconds = 102.91 MB/sec

/dev/sdd:
 Timing cached reads:   12730 MB in  2.00 seconds = 6372.54 MB/sec
 Timing buffered disk reads: 306 MB in  3.01 seconds = 101.70 MB/sec
Linux Software RAID1
curtis:~# sfdisk -d /dev/sdc | sfdisk /dev/sdd
curtis:~# mdadm --create --verbose /dev/md2 --level 1 --raid-devices=2 /dev/sdc1 /dev/sdd1
curtis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md2 : active (auto-read-only) raid1 sdd1[1] sdc1[0]
      976630336 blocks super 1.2 [2/2] [UU]
        resync=PENDING

md1 : active raid1 sda6[0] sdb6[1]
      726192960 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      204608 blocks super 1.2 [2/2] [UU]

unused devices: <none>
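The /proc/mdstat output above shows md2 with resync=PENDING. Numbers taken while a resync is pending or running may not reflect steady-state performance, so it can be worth checking the array state before a benchmark run. The helper below is a sketch under that assumption; the name md_sync_state and its two output words are made up for this page, and it only looks for the words "resync" or "recovery" in the stanza of the given device.

```shell
# Hypothetical helper: report whether an md device's stanza in an mdstat file
# mentions a resync or recovery (pending or in progress).
#   $1 = md device name, e.g. md2
#   $2 = mdstat file to read (defaults to /proc/mdstat)
md_sync_state() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":"        { in_block = 1; next }  # stanza header
        in_block && NF == 0           { in_block = 0 }        # blank line ends it
        in_block && /^[^ \t]/         { in_block = 0 }        # so does a new stanza
        in_block && /resync|recovery/ { busy = 1 }
        END { print (busy ? "resync" : "idle") }
    ' "${2:-/proc/mdstat}"
}
```

Against the output above, md_sync_state md2 would print resync and md_sync_state md1 would print idle. Note that "idle" only means no resync or recovery is mentioned; it says nothing about a degraded array.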
Throughput
curtis:~# for i in $(seq 10); do dd if=/dev/zero of=/dev/md2 bs=512M count=1 oflag=direct; done
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.58668 s, 96.1 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.59314 s, 96.0 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.54269 s, 96.9 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.45947 s, 98.3 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.55115 s, 96.7 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.58486 s, 96.1 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.50936 s, 97.4 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.59953 s, 95.9 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.50249 s, 97.6 MB/s
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 5.54276 s, 96.9 MB/s
Latency
TEST_DEVICE=/dev/md2
for i in `seq 1 10`; do
    dd if=/dev/zero of=$TEST_DEVICE bs=512 count=10000 oflag=direct
done

10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.30388 s, 3.9 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.45138 s, 3.5 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.50447 s, 3.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.35231 s, 3.8 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.36926 s, 3.7 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.41763 s, 3.6 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.64202 s, 3.1 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.44648 s, 3.5 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.51445 s, 3.4 MB/s
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 1.59204 s, 3.2 MB/s
Filesystem
Two XFS filesystems are tested: one created with the mkfs.xfs defaults, and one created with explicit stripe geometry to tell XFS that the device is actually built out of 2 disks.
Bonnie++ stats
mkfs.xfs -f /dev/md2
curtis:~# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis         320G           96669   8 45616   5           106159  7 117.3  13
Latency                           190ms     544ms               617ms     281ms

curtis:~# bonnie++ -d /0 -s 70G -u malin:pet -n 10 -b
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis          70G  1339  96 98427   8 47280   5  2572  96 108585  6 167.2  13
Latency                5939us   15708us     435ms   11085us     138ms     295ms
Version  1.96       ------Sequential Create------ --------Random Create--------
curtis              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 10    82   0 +++++ +++    82   0    84   0 +++++ +++    84   0
Latency                 108ms     101us     102ms     108ms      10us     155ms
mkfs.xfs -d su=256k,sw=1 -l version=2,su=256k /dev/md2
curtis:~# bonnie++ -d /0 -s 320G -u malin:pet -n 0 -f -b
Using uid:1120, gid:200.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis         320G           97176   8 45935   5           111214  7 121.4  12
Latency                         98164us    1666ms               464ms     289ms

curtis:~# bonnie++ -d /0 -s 70G -u malin:pet -n 10 -b
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
curtis          70G  1351  95 98449   8 47192   5  2525  95 111391  7 168.0  13
Latency                5851us   15707us     608ms   11945us     340ms     253ms
Version  1.96       ------Sequential Create------ --------Random Create--------
curtis              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 10    64   1 +++++ +++    65   1    65   1 +++++ +++    63   1
Latency               96549us      99us     169ms     120ms      10us     350ms