Raid gone

  • Hello, first of all big love to the OMV devs and community! Been using it since I got into home servers (since OMV v2).


    Short story: for storage I have been running a raid 5 with 4x4TB WD REDs for years. Then I had a power outage and one disk died. When I replaced the broken disk, the raid disappeared from the web UI. I am currently in full panic that years of backups have been lost.


    From before I swapped disks and the raid disappeared:


    cat /proc/mdstat:


    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

    md127 : inactive sda[1](S) sdc[3](S) sdb[2](S)

    11720662536 blocks super 1.2

    unused devices: <none>


    blkid:


    /dev/sdb: UUID="478fbf22-daf1-9758-1264-80b1d486c52c" UUID_SUB="39909b78-25bf-765f-8683-2df664608779" LABEL="Jocke-Microserver:raid5" TYPE="linux_raid_member"

    /dev/sda: UUID="478fbf22-daf1-9758-1264-80b1d486c52c" UUID_SUB="7c8f1a0c-d9fb-f4be-5108-0bd2d4fd7bef" LABEL="Jocke-Microserver:raid5" TYPE="linux_raid_member"

    /dev/sdc: UUID="478fbf22-daf1-9758-1264-80b1d486c52c" UUID_SUB="cb425cbd-23b5-0344-7980-615016972c0d" LABEL="Jocke-Microserver:raid5" TYPE="linux_raid_member"

    /dev/sdd1: UUID="8a18c2c0-fecd-4924-a9a4-0790640a774e" TYPE="ext4" PARTUUID="57716ba9-01"

    /dev/sdd5: UUID="2bbaa483-2349-4d9f-ad57-cdae88500dce" TYPE="swap" PARTUUID="57716ba9-05"


    fdisk -l | grep "Disk ":


    Disk /dev/sdb: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    Disk /dev/sdc: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    Disk /dev/sdd: 111.8 GiB, 120034123776 bytes, 234441648 sectors

    Disk model: Samsung SSD 840

    Disk identifier: 0x57716ba9

    Disk /dev/sde: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFAX-68J


    cat /etc/mdadm/mdadm.conf:


    # This file is auto-generated by openmediavault (https://www.openmediavault.org)

    # WARNING: Do not edit this file, your changes will get lost.

    # mdadm.conf

    #

    # Please refer to mdadm.conf(5) for information about this file.

    #

    # by default, scan all partitions (/proc/partitions) for MD superblocks.

    # alternatively, specify devices to scan, using wildcards if desired.

    # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.

    # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is

    # used if no RAID devices are configured.

    DEVICE partitions

    # auto-create devices with Debian standard permissions

    CREATE owner=root group=disk mode=0660 auto=yes

    # automatically tag new arrays as belonging to the local system

    HOMEHOST <system>

    # definitions of existing MD arrays


    mdadm --detail --scan --verbose:


    INACTIVE-ARRAY /dev/md127 num-devices=3 metadata=1.2 name=Jocke-Microserver:raid5 UUID=478fbf22:daf19758:126480b1:d486c52c

    devices=/dev/sda,/dev/sdb,/dev/sdc


    Please be precise if you need me to post or run anything else, since my knowledge of terminal commands and usage is basically copy & paste.


    Thanks in advance

    • Official Post

    SSH into OMV as root and execute the following:


    mdadm --stop /dev/md127 and wait for the output, then


    mdadm --assemble --force --verbose /dev/md127 /dev/sd[abc]

    This should get the raid started in a clean/degraded state.
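
    Once the assemble command returns, a quick way to confirm it worked (nothing OMV-specific, just standard mdadm/proc output):

    cat /proc/mdstat

    mdadm --detail /dev/md127

    mdstat should list md127 as active with 3 of 4 drives, and --detail should report the state as clean, degraded.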


    The output from fdisk would suggest that /dev/sde is the new drive; if so, in the GUI:


    Storage -> Disks: select the new drive and click Wipe on the menu; a short wipe should be enough. Wait until it has completed before proceeding.


    Raid Management -> select the raid and click Recover on the menu; a dialog should display with the new drive. Select it and click OK, and the array should now rebuild.
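
    If you prefer to do the same from the shell instead of the GUI, something along these lines should be equivalent (a sketch only; it assumes /dev/sde really is the new, empty drive, so double-check with blkid before running anything destructive):

    wipefs -a /dev/sde

    mdadm --manage /dev/md127 --add /dev/sde

    wipefs clears any old signatures from the replacement disk, and --add hands it to the degraded array so the rebuild can start.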

  • Thanks for the quick reply! However, something didn't work.

    mdadm --stop /dev/md127:
    mdadm: stopped /dev/md127


    mdadm --assemble --force --verbose /dev/md127 /dev/sd[abc]:
    mdadm: looking for devices for /dev/md127

    mdadm: No super block found on /dev/sdb (Expected magic a92b4efc, got 00000000)

    mdadm: no RAID superblock on /dev/sdb

    mdadm: /dev/sdb has no superblock - assembly aborted


    EDIT:
    I think the "labels" might have changed because I rebooted the system after I wrote the post.


    NEW fdisk -l | grep "Disk ":

    Disk /dev/sdc: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    Disk /dev/sdd: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    Disk /dev/sde: 111.8 GiB, 120034123776 bytes, 234441648 sectors

    Disk model: Samsung SSD 840

    Disk identifier: 0x57716ba9

    Disk /dev/sdb: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFAX-68J

    Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    • Official Post

    Thanks for the quick reply! However, something didn't work.

    Then I'm afraid it's dead. You only have 3 drives out of a possible 4 in the raid 5 showing as a linux_raid_member, and your raid was inactive, so the only option was to reassemble; as this failed due to an error on another drive, the raid is lost.


    I'm not sure if running fsck on that drive (fsck /dev/sdb) or trying TestDisk might fix it, but at present it's toast.
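
    Before writing the disk off completely, it may be worth checking what /dev/sdb actually contains, since device letters can shuffle after a reboot (a quick check with standard mdadm, nothing destructive):

    mdadm --examine /dev/sdb

    On a genuine raid member this prints the superblock details (array UUID, device role and so on); on a brand-new blank disk it simply reports that no md superblock was found, which would mean the assemble just grabbed the replacement drive under a different letter.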

  • Then I'm afraid it's dead

    Are you sure? Because /dev/sdb, where it says there is no superblock, is the new disk.


    Disk /dev/sdc: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    Disk /dev/sdd: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    Disk /dev/sde: 111.8 GiB, 120034123776 bytes, 234441648 sectors

    Disk model: Samsung SSD 840

    Disk identifier: 0x57716ba9

    Disk /dev/sdb: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFAX-68J

    Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors

    Disk model: WDC WD40EFRX-68W

    WDC WD40EFAX-68J = /dev/sdb should be the new one, since its model name differs from the 3 others.
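
    If you want to confirm that by serial number rather than by model string, smartctl can read the drive identity (assuming smartmontools is installed, which it normally is on an OMV box):

    smartctl -i /dev/sdb

    The output includes the model and serial number, which you can match against the sticker on the disk you just installed.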

    • Official Post

    I think the "labels" might have changed because I rebooted the system after I wrote the post.


    Why edit a post, and why reboot? I can't see your edit whilst I am writing a reply, and never, ever, ever reboot unless the person who is trying to help you tells you to.


    Now we are back at square one!!!!


    cat /proc/mdstat

    blkid

    mdadm --detail /dev/md127
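
    If the array still refuses to appear, running mdadm --examine against the data disks will show which of them actually carry the raid superblock, regardless of how the letters have shuffled (adjust the device list to whatever fdisk reports on your system):

    mdadm --examine /dev/sd[abcd]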

  • Sorry!


    cat /proc/mdstat:

    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

    unused devices: <none>

    Empty because I did mdadm --stop /dev/md127 earlier?

    blkid:

    /dev/sdc: UUID="478fbf22-daf1-9758-1264-80b1d486c52c" UUID_SUB="39909b78-25bf-765f-8683-2df664608779" LABEL="Jocke-Microserver:raid5" TYPE="linux_raid_member"

    /dev/sdd: UUID="478fbf22-daf1-9758-1264-80b1d486c52c" UUID_SUB="cb425cbd-23b5-0344-7980-615016972c0d" LABEL="Jocke-Microserver:raid5" TYPE="linux_raid_member"

    /dev/sde1: UUID="8a18c2c0-fecd-4924-a9a4-0790640a774e" TYPE="ext4" PARTUUID="57716ba9-01"

    /dev/sde5: UUID="2bbaa483-2349-4d9f-ad57-cdae88500dce" TYPE="swap" PARTUUID="57716ba9-05"

    /dev/sda: UUID="478fbf22-daf1-9758-1264-80b1d486c52c" UUID_SUB="7c8f1a0c-d9fb-f4be-5108-0bd2d4fd7bef" LABEL="Jocke-Microserver:raid5" TYPE="linux_raid_member"


    mdadm --detail /dev/md127:

    mdadm: cannot open /dev/md127: No such file or directory
    Same reason as cat /proc/mdstat?

  • mdadm --assemble --force --verbose /dev/md127 /dev/sd[acd]:


    mdadm: looking for devices for /dev/md127

    mdadm: /dev/sda is identified as a member of /dev/md127, slot 1.

    mdadm: /dev/sdc is identified as a member of /dev/md127, slot 2.

    mdadm: /dev/sdd is identified as a member of /dev/md127, slot 3.

    mdadm: no uptodate device for slot 0 of /dev/md127

    mdadm: added /dev/sdc to /dev/md127 as 2

    mdadm: added /dev/sdd to /dev/md127 as 3

    mdadm: added /dev/sda to /dev/md127 as 1

    mdadm: /dev/md127 has been started with 3 drives (out of 4).


    The raid shows in the web UI again!


    Is the next step for me now what you wrote in the first post? Wipe the new disk and then proceed to recover the raid?

  • Yes, do not reboot, do not pass go until the raid has rebuilt :) Oh, and make sure you select the correct drive to wipe, so the raid fully recovers :)
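
    To see when it has finished, the rebuild progress and an estimated completion time show up in the standard mdstat output (same command as before, nothing new needed):

    cat /proc/mdstat

    The Raid Management page in the web UI shows the same progress.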

    Awesome! The raid seems to be rebuilding correctly now; hopefully it works like it should.


    Can't thank you enough for your time and effort in helping me! Expect a donation to the OMV Project from me very soon; it's the least I can do!


    EDIT: Raid recovered successfully!


  • Mychomizer

    Added the label "solved".
