raid5 - after changing disk clean/FAILED

  • Oops, I degraded a clean and working raid5 to replace a disk (sdb). Now, after installing the new disk, I get the error clean/FAILED. If I look at the description I see this:


    So what can I do? I reinstalled the old sdb, but cannot get md127 back to work.

    I managed to change two disks today, but the third tries to make me unhappy.


    Thanks a lot for your help.


    cat /proc/mdstat

    Code
    root@omv:~# cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
    md127 : active raid5 sdc[4](F) sdd[5](F) sda[1](F)
          5860148736 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/0] [____]
    
    unused devices: <none>
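    In that mdstat line, [4/0] means four devices are configured but zero are active, and [____] shows all four member slots down. A minimal sketch of reading such a line, using the quoted output as sample input (on the live system you would read /proc/mdstat directly):

```shell
# Count members marked (F) in an mdstat device line.
# Sample input is the md127 line quoted above; on a live system
# substitute: grep '^md127' /proc/mdstat
mdstat_line='md127 : active raid5 sdc[4](F) sdd[5](F) sda[1](F)'
failed=$(printf '%s\n' "$mdstat_line" | grep -o '(F)' | wc -l)
echo "failed members: $failed"
```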


    blkid

    Code
    root@omv:~# blkid
    /dev/sde1: UUID="e10f8611-e537-471c-a7a2-93b2774b2e2d" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="b84e6387-01"
    /dev/sde5: UUID="0fc5a379-2025-4b9d-bb58-c18e876d19f5" TYPE="swap" PARTUUID="b84e6387-05"
    /dev/sda: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="b9681117-d365-a4c2-17ca-57d0196c0cdc" LABEL="omv:Datengrab" TYPE="linux_raid_member"
    /dev/md127: LABEL="Datengrab" UUID="428b3002-db0b-4935-a396-5dc9043e595d" BLOCK_SIZE="4096" TYPE="ext4"
    /dev/sdd: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="70b96ef1-a538-1392-d319-b0c142dec417" LABEL="omv:Datengrab" TYPE="linux_raid_member"
    /dev/sdc: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="757004c4-c94a-de1f-5a8b-59ce64206abb" LABEL="omv:Datengrab" TYPE="linux_raid_member"
    /dev/sdb: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="cdc95198-a4f0-d1c9-6e2a-18127e7b5510" LABEL="omv:Datengrab" TYPE="linux_raid_member"
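    One thing worth confirming in that blkid output is that all four disks report the same array UUID, i.e. they all still claim membership of the same array. A sketch of that check, with the quoted lines as sample input (on the real system, pipe blkid itself through the same filter):

```shell
# Extract the array UUID (not UUID_SUB) from each linux_raid_member
# line and count distinct values. Sample input is the blkid output
# quoted above; on the live system use: blkid | grep linux_raid_member
blkid_out='/dev/sda: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="b9681117-d365-a4c2-17ca-57d0196c0cdc" LABEL="omv:Datengrab" TYPE="linux_raid_member"
/dev/sdd: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="70b96ef1-a538-1392-d319-b0c142dec417" LABEL="omv:Datengrab" TYPE="linux_raid_member"
/dev/sdc: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="757004c4-c94a-de1f-5a8b-59ce64206abb" LABEL="omv:Datengrab" TYPE="linux_raid_member"
/dev/sdb: UUID="813b7562-8521-b311-f237-8373d31cba65" UUID_SUB="cdc95198-a4f0-d1c9-6e2a-18127e7b5510" LABEL="omv:Datengrab" TYPE="linux_raid_member"'
distinct=$(printf '%s\n' "$blkid_out" | grep linux_raid_member \
           | grep -o 'UUID="[^"]*"' | sort -u | wc -l)
echo "distinct array UUIDs: $distinct"  # a count of 1 means all four disks agree
```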

    tbc

    This too shall pass ...


  • fdisk -l | grep "Disk "


    cat /etc/mdadm.conf


    mdadm --detail --scan --verbose

    Code
    root@omv:~# mdadm --detail --scan --verbose
    ARRAY /dev/md/Datengrab level=raid5 num-devices=4 metadata=1.2
    devices=/dev/sda,/dev/sdc,/dev/sdd


    mdadm --assemble --force --verbose /dev/md127 /dev/sd[a1b1c1d1]

    Code
    root@omv:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[a1b1c1d1]
    mdadm: looking for devices for /dev/md127
    mdadm: /dev/sda is busy - skipping
    mdadm: /dev/sdc is busy - skipping
    mdadm: /dev/sdd is busy - skipping
    mdadm: Found some drive for an array that is already active: /dev/md/Datengrab
    mdadm: giving up.


    I hope you can help me...

    I found out, data is still there but only read-only



  • The whole of the above makes little or no sense; changing a failing drive using the GUI is very simple and straightforward.


    This -> I managed to change two disks today, but the third tries to make me unhappy. makes no sense. Raid5 can only survive one drive failure; remove/replace two at the same time and the array is toast.


    The output from cat /proc/mdstat shows three drives, /dev/sd[acd], all with (F), so mdadm believes those drives have failed; but blkid shows /dev/sd[abcd] as being linux raid members. Confusing, isn't it?


    This -> I found out, data is still there but only read-only how?? If you have a working crystal ball, then some of us would like to know where you got it from.


    Apologies for being sarcastic, but the whole thing is very hard to understand, so:


    mdadm --stop /dev/md127


    mdadm --assemble --force --verbose /dev/md127 /dev/sd[abcd]


    Post any errors, and do not reboot!!

    Raid is not a backup! Would you go skydiving without a parachute?


    OMV 6x amd64 running on an HP N54L Microserver

  • I know I'm wearing the red nose today - even though I really did wait until the raid was clean again before taking the next step, i.e. the next disk: one good, two good, three - and here we are...


    1. Raid md127 remove sdd-old, save and wait until omv6 wants to save the change as well

    2. remove sdd-old

    3. plugin sdd-new, and wait a little bit -> drive is shown at disk-section

    4. enable SMART-monitoring

    5. swraid recover md127; add sdd-new; save and wait until omv6 wants to save the change as well

    6. wait until recovery is finished (took about 5h)


    Go back to 1. and take sdc-old/sdc-new...


    Next should be sdb-old/sdb-new, and last - surprise - sda-old/sda-new.
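    For reference, those six GUI steps map onto a standard mdadm cycle per disk (fail, remove, physically swap, add, wait for resync). As a precaution the sketch below only prints the commands for review rather than running them; the device order follows the plan above, and whether to execute each line is the operator's call:

```shell
# Dry run: print one replacement cycle per disk instead of executing it.
# Order (sdd, sdc, sdb, sda) follows the plan described above.
for disk in sdd sdc sdb sda; do
    echo "mdadm /dev/md127 --fail /dev/$disk --remove /dev/$disk"
    echo "# power down / swap the physical drive, then:"
    echo "mdadm /dev/md127 --add /dev/$disk"
    echo "# wait until 'cat /proc/mdstat' shows recovery finished before the next disk"
done
```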


    mdadm --stop /dev/md127

    Code
    root@omv:~# mdadm --stop /dev/md127
    mdadm: Cannot get exclusive access to /dev/md127:Perhaps a running process, mounted filesystem or active volume group?


    mdadm --assemble --force --verbose /dev/md127 /dev/sd[abcd]

    Code
    root@omv:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[abcd]
    mdadm: looking for devices for /dev/md127
    mdadm: /dev/sda is busy - skipping
    mdadm: /dev/sdc is busy - skipping
    mdadm: /dev/sdd is busy - skipping
    mdadm: Found some drive for an array that is already active: /dev/md/Datengrab
    mdadm: giving up.


    The last output I had already seen in post #2 - ok, there with a1b1c1d1 instead of abcd.



  • 1. Raid md127 remove sdd-old,

    How? I am using OMV6 but I'm not using Raid; I could spin up a VM, but that would have to wait until tomorrow. From memory, in V5 there was a delete/remove option on the Raid menu, and that is how a drive must be removed from an array. When done that way the array is left in a clean/degraded state, so the shares and files remain accessible.

    a1b1c1d1 instead of abcd

    No, a1, b1, etc. are drive partitions; OMV does not use partitions when creating an array, it uses the full drive.


    -------------------------------------------------------------------------------------------------------------------------------------------


    This output -> mdadm: Cannot get exclusive access to /dev/md127:Perhaps a running process, mounted filesystem or active volume group

    suggests that something/someone still has access to the array; that's why it cannot be stopped.
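    A mounted filesystem is the usual holder. As a sketch, this checks mount output for the array device before attempting mdadm --stop; the sample line and its mount path are made up for illustration (run the real mount command on the affected box):

```shell
# If the array device appears in 'mount' output, unmount it (and stop
# services using it, e.g. nfs/docker) before 'mdadm --stop /dev/md127'.
# Sample line below is hypothetical; on the live system use: mount
mount_out='/dev/md127 on /srv/datengrab type ext4 (rw,relatime)'
if printf '%s\n' "$mount_out" | grep -q '^/dev/md127 '; then
    echo "md127 is still mounted - unmount it before mdadm --stop"
fi
```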


    If you have the kernel plugin installed you could try the SystemRescue option; this installs systemrescuecd and boots into it once. You can then run CLI commands to assemble the array without OMV running; exit and reboot will get you back to OMV. But you will need to check things like blkid again.


  • OK, I stopped nfs & docker and afterwards ran umount -f /dev/mdstat.



    Now md127 is online again - but not mounted yet. In the morning I'll try to swap in and add the new disk to md127; hopefully the rebuild will work.
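    Once the new disk is added, /proc/mdstat reports rebuild progress. A small sketch of extracting the percentage; the sample line below is a typical recovery line for illustration, not output from this system:

```shell
# Pull the completion percentage out of an mdstat recovery line.
# Sample line is hypothetical; on the live system substitute:
# grep recovery /proc/mdstat
line='[=>...................]  recovery =  7.4% (145345536/1953382912) finish=302.5min speed=99582K/sec'
pct=$(printf '%s\n' "$line" | sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p')
echo "rebuild at ${pct}%"  # prints: rebuild at 7.4%
```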


    Thanks for your efforts today; I'll be up early tomorrow.


