running spin-rite on each disk right now - disks 0 to 8 are okay ....
slow process though ... like 8~9 hours per disc
i'll report back as soon as i am finished - probably on sunday evening
cheers and thx for your help - ahab666
Take care that nothing fiddles around with the superblocks while testing. As long as they are untouched, the raid information (and your data) that mdadm wrote to the disks when the raid was built can still be discovered.
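For reference, the superblocks can be inspected without touching them; a minimal sketch, where the device names are only examples from this thread:
mdadm --examine /dev/sdb                                                # read the md superblock of one member, read-only
mdadm --examine /dev/sd[b-m] | grep -E 'Array UUID|Events|Device Role'  # compare event counters and roles across all members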
Yeah, hence I suggested only to check the disk which was giving problems.
The point is you need to get the raid back and make a backup of your most important data asap!
Well, that's how I would do it. I think there is an option in SpinRite to only scan, but I'm not sure.
I always prefer the test programs made by the manufacturer of the disks in the first place; they are tailored for their products.
After reading all the postings I believe that only one disk may have problems.
./edit: This thread should be moved to the /Storage/Raid subforum.
@ datadigger :
root@OMV:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : inactive sdb[0] sdl[11] sdk[10] sdj[9] sdi[12] sdh[6] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
32231490632 blocks super 1.2
how can i reactivate the inactive hdds ? and how can i add sd?[7] to the reactivated raid ???
okay - i followed your advice and just tested the probably defective one, sd?[7] - and no, nothing was found with the WD tool, nor with SpinRite, nor with HDDU ....
so any other telnet or gui commands that can help me find and reuse my data again ???
cheers - alex
Hi Alex,
try this:
logout of your gui.
open a ssh and enter these two commands:
After that provide output of:
Then log back in gui and tell me the status of your raid.
Alex, regarding disk 7, I think you really need to zero-wipe that disk with DBAN as I recommended previously, with a SMART check before and a verify after each sector write. When that is done and no errors turn up, you're lucky and the disk is fine. Why? If there is something wrong with the raid information on that disk [Super-block or whatsoever], the disk will be ignored; it doesn't matter how often you take it out and put it back, it is marked bad! A zero-wipe will let the raid believe it is a new disk and there is nothing in the way of rebuilding the raid again.
Assuming DBAN found no errors, then at least you know that the disk is fine and your Super-block was damaged.
So those are two different things, I really hope you understand this:
- a bad disk with damaged sectors or clusters, or
- a bad or damaged Super-block, which means the raid will not accept that disk!
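Not what was recommended above, but if it is really only the md metadata that is in the way, a lighter-weight sketch is to clear just the superblock instead of a full DBAN pass (device names are examples; this destroys the raid metadata on that disk, so triple-check the letter first):
mdadm --stop /dev/md126             # example: stop any stray array that still holds the disk
mdadm --zero-superblock /dev/sdi    # example device - wipes only the md superblock, so the array treats it as a new disk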
@Wabun ...
root@OMV:~# mdadm --assemble /dev/md127 /dev/sd[bcdefghijklm] --verbose --force
mdadm: looking for devices for /dev/md127
mdadm: /dev/sdb is busy - skipping
mdadm: /dev/sdc is busy - skipping
mdadm: /dev/sdd is busy - skipping
mdadm: /dev/sde is busy - skipping
mdadm: /dev/sdf is busy - skipping
mdadm: /dev/sdg is busy - skipping
mdadm: /dev/sdh is busy - skipping
mdadm: /dev/sdi is busy - skipping
mdadm: /dev/sdj is busy - skipping
mdadm: /dev/sdk is busy - skipping
mdadm: /dev/sdl is busy - skipping
mdadm: /dev/sdm is busy - skipping
root@OMV:~#
and
root@OMV:~# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-3.2.0-4-amd64
mdadm: cannot open /dev/md/OMV: No such file or directory
mdadm: cannot open /dev/md/OMV: No such file or directory
and
root@OMV:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md126 : inactive sdm[13](S)
2930135512 blocks super 1.2
md127 : inactive sdb[0] sdl[11] sdk[10] sdj[9] sdi[12] sdh[6] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
32231490632 blocks super 1.2
unused devices: <none>
and
root@OMV:~# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda1 during installation
UUID=6ac66484-42b3-48ad-8430-072852de03ab / ext4 errors=remount-ro 0 1
# swap was on /dev/sda5 during installation
UUID=6edd6d3a-11b9-4188-848c-0c2f2c9a73fa none swap sw 0 0
/dev/sdb1 /media/usb0 auto rw,user,noauto 0 0
# >>> [openmediavault]
# <<< [openmediavault]
md126 looks weird
looks like hard disks number 7 and 8 are missing from the "old" md127 array - hdd 7 was inactive and hdd 8 became a member of the md126 array - one i never built ...
cheers - alex
No need to run initramfs before the raid is complete. When mdadm sees all the disks and the raid is complete, it automatically starts a rebuild. The update-initramfs command just adds the array to the boot image.
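Once the rebuild is running you can watch it without touching anything; a minimal sketch using the array name from this thread:
watch cat /proc/mdstat       # refreshes the rebuild progress every 2 seconds, Ctrl-C to quit
mdadm --detail /dev/md127    # shows the per-disk state and the recovery status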
@ahab: Stop that raid 126 with mdadm --stop /dev/md126 - that kills the raid 126 and frees disk 8. Then add it manually to the raid 127 with mdadm --manage /dev/md127 --add /dev/sdi (the correct /dev/sd<letter> is important!).
Then do the same with disk no. 7: mdadm --manage /dev/md127 --add /dev/sdj - if I read your posts right, this should be sdj.
If mdadm can read the disk correctly and it is ok, then mdadm will start rebuilding; have a look at the web-ui.
Then you can run the update-initramfs command as wabun suggested. If it finds an error it will tell you.
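Put together, the whole sequence looks roughly like this; just a sketch, the device letters are guesses from the posts above and must be double-checked on your box:
mdadm --stop /dev/md126                     # free the disk that got grabbed by the stray array
mdadm --manage /dev/md127 --add /dev/sdi    # re-add disk 8 (verify the letter first!)
mdadm --manage /dev/md127 --add /dev/sdj    # re-add disk 7 (verify the letter first!)
cat /proc/mdstat                            # md127 should now show a rebuild running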
added the update-initramfs -u in the previous message - sorry, oversaw that ....
rebooted the server with shutdown -r now and opened the gui - :-\ the raid window is still empty
root@OMV:~# mdadm --stop /dev/md126
mdadm: stopped /dev/md126
root@OMV:~# mdadm --manage /dev/md127 --add /dev/sdj
mdadm: Cannot open /dev/sdj: Device or resource busy
root@OMV:~# mdadm --manage /dev/md127 --add /dev/sdi
mdadm: Cannot open /dev/sdi: Device or resource busy
well ....
Do you think he might have swapped disks 7 and 8 from their physical locations in the raid?
Just wondering why both are not recognised and have been assigned to /dev/md126.
As long as these two disks are not part of the raid you can restart the box as often as you want, that won't bring it back.
These two disks have "lost the race" while the raid was assembled; udev can prevent the assembly from completing. I would start over building the raid from scratch.
First check if these two disks respond:
smartctl -a /dev/sdi and smartctl -a /dev/sdj
to make sure that they are well-connected.
Then start over:
mdadm --stop /dev/md127 (This raid definition should now be removed from mdadm.conf)
udevadm control --stop-exec-queue
mdadm --assemble /dev/md127 /dev/sd[bcdefghijklm] --verbose --force
(If these two disks are still missing try to add them manually as stated above.
mdadm --manage /dev/md127 --add /dev/sdi
mdadm --manage /dev/md127 --add /dev/sdj)
If the raid is complete start udev:
udevadm control --start-exec-queue
Now check if the raid was built correctly:
cat /proc/mdstat
mdadm --detail --scan
If mdadm starts to rebuild, run update-initramfs and look for errors. If the raid was named correctly in mdadm.conf it shouldn't spit out any errors.
Just fought the same battle last weekend when I moved a raid from an old machine to a new installation, udevadm did the trick.
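For convenience, the same procedure as one small script; only a sketch, assuming the device letters from this thread and that the old md127 line has already been taken out of mdadm.conf:
#!/bin/sh
smartctl -H /dev/sdi && smartctl -H /dev/sdj    # quick overall-health verdict for the two stragglers
mdadm --stop /dev/md127                         # tear down the half-assembled array
udevadm control --stop-exec-queue               # keep udev from grabbing members during assembly
mdadm --assemble /dev/md127 /dev/sd[bcdefghijklm] --verbose --force
udevadm control --start-exec-queue              # let udev run again once the array is up
cat /proc/mdstat                                # check the array state and any rebuild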
@Wabun :
root@OMV:~# mdadm --manage /dev/md127 --add /dev/sdi
mdadm: Cannot open /dev/sdi: Device or resource busy
root@OMV:~# mdadm --manage /dev/md127 --add /dev/sdj
mdadm: Cannot open /dev/sdj: Device or resource busy
root@OMV:~#
sorry - same as before ;-/
i will reinstall OMV again and check if there is any difference ...
Yeah, to get the drives out of their busy status you need to stop the service.
That may lead to the same situation. Now we have to check why these two disks cannot be added to the raid.
Give the result of blkid. After all these actions to get the raid back they possibly belong to another raid definition (Like disk 8 to md126...). blkid will tell if this is the case.
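A minimal sketch of that check (the device letters are only examples):
blkid /dev/sdi /dev/sdj    # md members report TYPE="linux_raid_member" and the array UUID they belong to
blkid | grep raid          # or list every raid signature the box can see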
He needs to stop the service and assign the drives back; mdadm just assigns a random number starting at 127 and counting downwards, so the drives don't belong to anything yet. In the worst-case scenario the Superblock is damaged. I think he really should try to stop the service and then assign the drives - what do you think?
@datadigger et al
root@OMV:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md126 : inactive sdm[13](S)
2930135512 blocks super 1.2
md127 : inactive sdb[0] sdl[11] sdk[10] sdj[9] sdi[12] sdh[6] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
32231490632 blocks super 1.2
unused devices: <none>
root@OMV:~# mdadm --stop /dev/md127
mdadm: stopped /dev/md127
root@OMV:~# mdadm --stop /dev/md126
mdadm: stopped /dev/md126
root@OMV:~# udevadm control --stop-exec-queue
root@OMV:~# mdadm --assemble /dev/md127 /dev/sd[bcdefghijklm] --verbose --force
mdadm: looking for devices for /dev/md127
mdadm: /dev/sdb is identified as a member of /dev/md127, slot 0.
mdadm: /dev/sdc is identified as a member of /dev/md127, slot 1.
mdadm: /dev/sdd is identified as a member of /dev/md127, slot 2.
mdadm: /dev/sde is identified as a member of /dev/md127, slot 3.
mdadm: /dev/sdf is identified as a member of /dev/md127, slot 4.
mdadm: /dev/sdg is identified as a member of /dev/md127, slot 5.
mdadm: /dev/sdh is identified as a member of /dev/md127, slot 6.
mdadm: /dev/sdi is identified as a member of /dev/md127, slot 8.
mdadm: /dev/sdj is identified as a member of /dev/md127, slot 9.
mdadm: /dev/sdk is identified as a member of /dev/md127, slot 10.
mdadm: /dev/sdl is identified as a member of /dev/md127, slot 11.
mdadm: /dev/sdm is identified as a member of /dev/md127, slot 7.
mdadm: Marking array /dev/md127 as 'clean'
mdadm: added /dev/sdc to /dev/md127 as 1
mdadm: added /dev/sdd to /dev/md127 as 2
mdadm: added /dev/sde to /dev/md127 as 3
mdadm: added /dev/sdf to /dev/md127 as 4
mdadm: added /dev/sdg to /dev/md127 as 5
mdadm: added /dev/sdh to /dev/md127 as 6
mdadm: added /dev/sdm to /dev/md127 as 7 (possibly out of date)
mdadm: added /dev/sdi to /dev/md127 as 8
mdadm: added /dev/sdj to /dev/md127 as 9
mdadm: added /dev/sdk to /dev/md127 as 10
mdadm: added /dev/sdl to /dev/md127 as 11
mdadm: added /dev/sdb to /dev/md127 as 0
mdadm: /dev/md127 has been started with 11 drives (out of 12).
root@OMV:~# mdadm --manage /dev/md127 --add /dev/sdm
mdadm: added /dev/sdm
root@OMV:~# udevadm control --start-exec-queue
root@OMV:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdm[13] sdb[0] sdl[11] sdk[10] sdj[9] sdi[12] sdh[6] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
29301350400 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/11] [UUUUUUU_UUUU]
[>....................] recovery = 0.0% (2027164/2930135040) finish=747.5min speed=65278K/sec
unused devices: <none>
root@OMV:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdm[13] sdb[0] sdl[11] sdk[10] sdj[9] sdi[12] sdh[6] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
29301350400 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/11] [UUUUUUU_UUUU]
[>....................] recovery = 0.1% (3801924/2930135040) finish=695.9min speed=70080K/sec
unused devices: <none>
root@OMV:~# mdadm --detail --scan
ARRAY /dev/md127 metadata=1.2 spares=1 name=OMV:OMV UUID=6230a09b:2bd2f0af:b6f72e19:46e3b8b3
root@OMV:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdm[13] sdb[0] sdl[11] sdk[10] sdj[9] sdi[12] sdh[6] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
29301350400 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/11] [UUUUUUU_UUUU]
[>....................] recovery = 0.1% (3912888/2930135040) finish=26375.9min speed=1848K/sec
unused devices: <none>
root@OMV:~# initramfs
-bash: initramfs: command not found.
root@OMV:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdm[13] sdb[0] sdl[11] sdk[10] sdj[9] sdi[12] sdh[6] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
29301350400 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/11] [UUUUUUU_UUUU]
[>....................] recovery = 0.1% (5077276/2930135040) finish=2324.7min speed=20970K/sec
unused devices: <none>
and
from the GUI
Raid Devices : 12
Total Devices : 12
Persistence : Superblock is persistent
Update Time : Mon Aug 24 12:55:27 2015
State : clean, degraded
Active Devices : 11
Working Devices : 11
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : OMV:OMV (local to host OMV)
UUID : 6230a09b:2bd2f0af:b6f72e19:46e3b8b3
Events : 41082
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
4 8 80 4 active sync /dev/sdf
5 8 96 5 active sync /dev/sdg
6 8 112 6 active sync /dev/sdh
7 0 0 7 removed
12 8 128 8 active sync /dev/sdi
9 8 144 9 active sync /dev/sdj
10 8 160 10 active sync /dev/sdk
11 8 176 11 active sync /dev/sdl
13 8 192 - faulty spare /dev/sdm
will take some time - i guess - let's wait and see - cheers
Alex, the command was: update-initramfs -u
root@OMV:~# initramfs
-bash: initramfs: command not found.
Edit: Let the raid do the work, don't touch it
I noticed it is the same disk again which failed; I hope the rebuild will fix it.
When the raid is rebuilt, but before you do a reboot, you have to run the command: update-initramfs -u
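A sketch of that last step, assuming the Debian default config path that OMV uses:
mdadm --detail --scan >> /etc/mdadm/mdadm.conf    # record the assembled array; check afterwards for duplicate ARRAY lines
update-initramfs -u                               # rebuild the boot image with the updated array definition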