OK, what a wild ride so far... I ran SpinRite on all the drives, and sdb is toast. So I focused on the five original disks. I ran:
mdadm --assemble --run --force /dev/md127 /dev/sd[b-f]
which seemed to work. (Since I had booted up without the old sdb, all the drive letters got reassigned.) I then ran mdadm --detail /dev/md127 and got this:
/dev/md127:
        Version : 1.2
  Creation Time : Sun Jan 1 13:44:03 2006
     Raid Level : raid5
     Array Size : 7813523456 (7451.56 GiB 8001.05 GB)
  Used Dev Size : 1953380864 (1862.89 GiB 2000.26 GB)
   Raid Devices : 5
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Thu Apr 23 16:25:59 2015
          State : active, degraded
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : OMV2:OMV
           UUID : 3e952187:f4e8e08a:19b763a4:cdc912c7
         Events : 1881421

    Number   Major   Minor   RaidDevice State
       5       8       48        0      active sync   /dev/sdd
       6       8       64        1      active sync   /dev/sde
       2       8       16        2      active sync   /dev/sdb
       3       8       32        3      active sync   /dev/sdc
       4       0        0        4      removed
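By the way, since the drive letters keep shuffling between boots, I've started double-checking which physical disk is which before touching anything. Something like this (generic commands, nothing array-specific):

ls -l /dev/disk/by-id/ | grep -v part
# maps each drive's model/serial to its current sdX letter

mdadm --examine /dev/sd[b-f] | egrep '/dev/sd|Device Role|Events'
# shows which array slot each disk thinks it belongs to, plus its event count

The --examine output is also worth a look before any --force assemble, since it shows how far out of sync the event counters are.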
I shut the box down to add another drive to replace the toasted sdb, and when it booted, it hung for a while at "Checking Quotas". When that finally finished and the box came up, I got this:
/dev/md127:
        Version : 1.2
  Creation Time : Sun Jan 1 13:44:03 2006
     Raid Level : raid5
     Array Size : 7813523456 (7451.56 GiB 8001.05 GB)
  Used Dev Size : 1953380864 (1862.89 GiB 2000.26 GB)
   Raid Devices : 5
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Thu Apr 23 19:30:13 2015
          State : clean, degraded
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : OMV2:OMV
           UUID : 3e952187:f4e8e08a:19b763a4:cdc912c7
         Events : 1884280

    Number   Major   Minor   RaidDevice State
       5       8       64        0      active sync   /dev/sde
       6       8       80        1      active sync   /dev/sdf
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
       4       0        0        4      removed
So far, so good. Through the webGUI, I chose the Recover option and added the new sdb to the array. The array started the rebuild, and I went to bed.
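For what it's worth, I believe the webGUI Recover option just does the equivalent of this under the hood (I haven't confirmed that against the OMV code, so treat it as a guess):

mdadm --manage /dev/md127 --add /dev/sdb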
This morning, I woke to find the array listed as clean, FAILED, and I could not access any files on it. It looks like it ran up against more issues with drive sdd. Here is the mdadm --detail /dev/md127 output:
/dev/md127:
        Version : 1.2
  Creation Time : Sun Jan 1 13:44:03 2006
     Raid Level : raid5
     Array Size : 7813523456 (7451.56 GiB 8001.05 GB)
  Used Dev Size : 1953380864 (1862.89 GiB 2000.26 GB)
   Raid Devices : 5
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Fri Apr 24 10:48:31 2015
          State : clean, FAILED
 Active Devices : 3
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : OMV2:OMV
           UUID : 3e952187:f4e8e08a:19b763a4:cdc912c7
         Events : 1903164

    Number   Major   Minor   RaidDevice State
       5       8       64        0      active sync   /dev/sde
       6       8       80        1      active sync   /dev/sdf
       2       8       32        2      active sync   /dev/sdc
       3       0        0        3      removed
       4       0        0        4      removed

       3       8       48        -      faulty spare   /dev/sdd
       7       8       16        -      spare   /dev/sdb
I then added another drive to the array, thinking it would rebuild onto that, but no dice. At that point, I realized I was back at square one, so I rebooted the box and issued this:
mdadm --assemble --run --force /dev/md127 /dev/sd[b-f]
mdadm: forcing event count in /dev/sdd(3) from 1902723 upto 1903306
mdadm: clearing FAULTY flag for device 2 in /dev/md127 for /dev/sdd
mdadm: /dev/md127 has been started with 4 drives (out of 5) and 1 spare.
I was able to mount the array and access the files. Obviously, there is something wrong with the sdd drive, as it seems to crap out during the rebuild process. I am looking for the best option from here. I realize that I am on the edge of the cliff with my toes dangling over; if one more drive fails, I'm pooched. I am sitting here with an array that is listed as clean, degraded, recovering, but that craps out during the rebuild due to a cranky drive. I realize that there could be a very small area on sdd that is causing the problems (a SMART check might narrow that down; see the sketch after the list). Here is what I've come up with for ideas:
- Run SpinRite again on the sdd drive to see if it can recover the trouble area. At Level 3 or 4, it was estimating about two weeks to complete, if it worked at all. Then add the sdd drive back to the array and rebuild onto a spare.
- Take the sdd drive and use Clonezilla with the -rescue switch to copy it to a spare 2TB drive that I have. Then replace the old sdd with the cloned drive and rebuild with a second spare. What I don't know is whether the rebuild will handle the missing sectors differently than it did when it was getting I/O errors back from the old sdd during the prior rebuild attempts. (There's also a ddrescue fallback sketched below this list.)
- Say screw it, run the array on four out of five drives (clean, degraded), copy everything off, and go from there with either Greyhole or SnapRAID.
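For the SMART check I mentioned above: assuming smartmontools is installed, this is the kind of thing I'd look at to size up how bad sdd really is (the attribute names below are the usual ones, but they vary a bit by drive vendor):

smartctl -a /dev/sdd | egrep -i 'pending|realloc|uncorrect'
# Current_Pending_Sector / Reallocated_Sector_Ct / Offline_Uncorrectable counts

smartctl -t long /dev/sdd
# kicks off a full surface self-test; check the result later with smartctl -a

A nonzero and climbing pending-sector count would pretty much confirm the "small bad area" theory.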
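And on the ddrescue fallback for option 2: if Clonezilla's -rescue mode doesn't get it done, GNU ddrescue is built for exactly this job. A rough sketch, assuming the spare shows up as /dev/sdg (double-check the device names first, obviously):

ddrescue -f -n /dev/sdd /dev/sdg /root/sdd.map
# fast first pass: copy everything readable, skip the slow/bad patches

ddrescue -f -r3 /dev/sdd /dev/sdg /root/sdd.map
# second pass: retry just the bad patches up to 3 times, using the same map file

Whatever ddrescue can't recover is simply left unwritten on the clone, so the rebuild would see stale/garbage data in those few spots instead of I/O errors, which should at least let it run to completion.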
Opinions?