OMV Raid missing after a reboot and RAM upgrade

  • Hello OMV team!


    First of all, thank you very much for all your efforts to make this awesome system better every day. I've been using it since version 2.x and I'm really happy with the UX and comfort it provides.


    We currently use OMV in our small company to manage our data server. It runs 24/7 and we have never had issues. Yesterday we shut the server down to upgrade the RAM from 16GB to 32GB to improve performance for our use case. After powering it back on, one of the RAID arrays is marked as "Missing" and can't be mounted.


    In the following points I'll try to explain our configuration:


    OS Version: OMV 4.X

    Number of disks: 1x 250GB SSD for the OS and 8x 6TB HDDs used to build 2x RAID5

    These two RAIDs are:

    • 1) Label: "RAID5" type: RAID5 disks: sda sdb sdc sdd
    • 2) Label: "RAID5B" type: RAID5 disks: sde sdf sdg sdh



    The issue is with the RAID labeled "RAID5". After the reboot it is no longer listed in the RAID section of the web interface, and in the File Systems tab it appears as "Missing".


    On the S.M.A.R.T. tab you can see that the "sdb" disk is marked as faulty. After running some checks and tests, we concluded that the disk must be completely dead, because it doesn't appear in the output of some of the debug commands.


    H4ni1D.png


    Following the commands listed in this post: Degraded or missing raid array questions, here is the information we gathered.


    --------------------------------------


    $ cat /proc/mdstat


    Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]

    md0 : inactive sda[4](S) sdd[3](S) sdc[2](S)

    17581171464 blocks super 1.2


    md1 : active raid5 sde[0] sdf[1] sdg[2] sdh[3]

    17581171200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

    bitmap: 0/44 pages [0KB], 65536KB chunk


    unused devices: <none>


    --------------------------------------


    $ blkid


    /dev/sdh: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="93afa8c8-5c7f-d9e3-bcf4-303b61acea49" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sdd: UUID="1dddc47a-5264-43ca-c826-044f2ec26555" UUID_SUB="40ec61b7-7036-5895-0337-43168ec9158d" LABEL="ken:RAID5" TYPE="linux_raid_member"

    /dev/sdg: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="3928383f-7392-07d7-ee8a-262bc47ca70c" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sda: UUID="1dddc47a-5264-43ca-c826-044f2ec26555" UUID_SUB="28cf8950-2cc0-60b7-06a4-c1a44c555039" LABEL="ken:RAID5" TYPE="linux_raid_member"

    /dev/sdc: UUID="1dddc47a-5264-43ca-c826-044f2ec26555" UUID_SUB="a7d856ad-ac8f-64b4-010a-c8a7fb7cdf8a" LABEL="ken:RAID5" TYPE="linux_raid_member"

    /dev/sde: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="95d7874a-d5fe-5000-b76e-030779c6a37b" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sdf: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="b5bffb90-28ce-3736-a218-23abbce50a7d" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sdi1: UUID="a682f238-3c95-4f4d-a097-ab2b39bd86a2" TYPE="ext4" PARTUUID="56ab785c-01"

    /dev/sdi5: UUID="38631c0a-fb19-488c-a928-16b2370efa0b" TYPE="swap" PARTUUID="56ab785c-05"

    /dev/md1: LABEL="RAID5B" UUID="d1a508ab-7776-4b96-a7d6-a54fe8de1f09" TYPE="ext4"


    -----------------------------


    $ fdisk -l | grep "Disk "


    Disk /dev/sdh: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdd: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdg: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdb: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sda: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68J

    Disk /dev/sdc: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sde: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdf: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdi: 232,9 GiB, 250059350016 bytes, 488397168 sectors

    Disk model: Samsung SSD 850

    Disklabel type: dos

    Disk identifier: 0x56ab785c

    Disk /dev/md1: 16,4 TiB, 18003119308800 bytes, 35162342400 sectors


    ----------------------


    $ cat /etc/mdadm/mdadm.conf


    # This file is auto-generated by openmediavault (https://www.openmediavault.org)

    # WARNING: Do not edit this file, your changes will get lost.


    # mdadm.conf

    #

    # Please refer to mdadm.conf(5) for information about this file.

    #


    # by default, scan all partitions (/proc/partitions) for MD superblocks.

    # alternatively, specify devices to scan, using wildcards if desired.

    # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.

    # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is

    # used if no RAID devices are configured.

    DEVICE partitions


    # auto-create devices with Debian standard permissions

    CREATE owner=root group=disk mode=0660 auto=yes


    # automatically tag new arrays as belonging to the local system

    HOMEHOST <system>

    # instruct the monitoring daemon where to send mail alerts

    MAILADDR XXX@XXX.com

    MAILFROM root


    # definitions of existing MD arrays

    ARRAY /dev/md0 metadata=1.2 name=ken:RAID5 UUID=1dddc47a:526443ca:c826044f:2ec26555

    ARRAY /dev/md1 metadata=1.2 name=ken:RAID5B UUID=401f4331:dab49eea:f7afc201:7f49fd53


    --------------------------


    $ mdadm --detail --scan --verbose


    ARRAY /dev/md1 level=raid5 num-devices=4 metadata=1.2 name=ken:RAID5B UUID=401f4331:dab49eea:f7afc201:7f49fd53

    devices=/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh

    INACTIVE-ARRAY /dev/md0 num-devices=3 metadata=1.2 name=ken:RAID5 UUID=1dddc47a:526443ca:c826044f:2ec26555

    devices=/dev/sda,/dev/sdc,/dev/sdd


    -------------------------------



    Searching your forums I found another post with a similar problem: OMV Raid lost after a power hit. The difference is that in our case a disk has failed.


    Knowing that the disk had failed, we replaced it with a new one. Everything now appears green in the S.M.A.R.T. tab, but the RAID is still not mounted and is marked as missing.


    In the next reply to this thread I'll post the output of the same commands after replacing the faulty disk.


    In the related post that we found, one of the moderators answered this:


    """

    Try stopping and reassembling;


    mdadm --stop /dev/md0


    mdadm --assemble --force --verbose /dev/md0 /dev/sd[bcde]


    no guarantee, but that usually works

    """


    We haven't tried this yet because we are not sure whether the same command applies in our case with the faulty disk replaced, or whether it's better to run it with only the 3 remaining disks to bring the RAID up in a "degraded" state and then add the new disk, as we have done on other occasions.


    What do you think? Any advice? We have a backup on another server, but restoring it would be very time-consuming, so if there is a way to recover the current array, that would be great!


    Thank you so much in advance for your support.


    Best regards,


    Carlos

  • The commands after the disk change.


    --------------------------------------


    $ cat /proc/mdstat


    Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]

    md0 : inactive sda[4](S) sdd[3](S) sdc[2](S)

    17581171464 blocks super 1.2


    md1 : active raid5 sdf[1] sdh[3] sde[0] sdg[2]

    17581171200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

    bitmap: 0/44 pages [0KB], 65536KB chunk


    unused devices: <none>


    --------------------------------------


    $ blkid


    /dev/sdg: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="3928383f-7392-07d7-ee8a-262bc47ca70c" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sdc: UUID="1dddc47a-5264-43ca-c826-044f2ec26555" UUID_SUB="a7d856ad-ac8f-64b4-010a-c8a7fb7cdf8a" LABEL="ken:RAID5" TYPE="linux_raid_member"

    /dev/sdd: UUID="1dddc47a-5264-43ca-c826-044f2ec26555" UUID_SUB="40ec61b7-7036-5895-0337-43168ec9158d" LABEL="ken:RAID5" TYPE="linux_raid_member"

    /dev/sda: UUID="1dddc47a-5264-43ca-c826-044f2ec26555" UUID_SUB="28cf8950-2cc0-60b7-06a4-c1a44c555039" LABEL="ken:RAID5" TYPE="linux_raid_member"

    /dev/sdh: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="93afa8c8-5c7f-d9e3-bcf4-303b61acea49" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sde: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="95d7874a-d5fe-5000-b76e-030779c6a37b" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sdf: UUID="401f4331-dab4-9eea-f7af-c2017f49fd53" UUID_SUB="b5bffb90-28ce-3736-a218-23abbce50a7d" LABEL="ken:RAID5B" TYPE="linux_raid_member"

    /dev/sdi1: UUID="a682f238-3c95-4f4d-a097-ab2b39bd86a2" TYPE="ext4" PARTUUID="56ab785c-01"

    /dev/sdi5: UUID="38631c0a-fb19-488c-a928-16b2370efa0b" TYPE="swap" PARTUUID="56ab785c-05"

    /dev/md1: LABEL="RAID5B" UUID="d1a508ab-7776-4b96-a7d6-a54fe8de1f09" TYPE="ext4"


    --------------------------------------


    $ fdisk -l | grep "Disk "


    Disk /dev/sdb: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68J

    Disk /dev/sdg: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdc: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdd: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sda: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68J

    Disk /dev/sdh: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sde: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdf: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors

    Disk model: WDC WD60EFAX-68S

    Disk /dev/sdi: 232,9 GiB, 250059350016 bytes, 488397168 sectors

    Disk model: Samsung SSD 850

    Disk identifier: 0x56ab785c

    Disk /dev/md1: 16,4 TiB, 18003119308800 bytes, 35162342400 sectors


    --------------------------------------


    $ cat /etc/mdadm/mdadm.conf


    # This file is auto-generated by openmediavault (https://www.openmediavault.org)

    # WARNING: Do not edit this file, your changes will get lost.


    # mdadm.conf

    #

    # Please refer to mdadm.conf(5) for information about this file.

    #


    # by default, scan all partitions (/proc/partitions) for MD superblocks.

    # alternatively, specify devices to scan, using wildcards if desired.

    # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.

    # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is

    # used if no RAID devices are configured.

    DEVICE partitions


    # auto-create devices with Debian standard permissions

    CREATE owner=root group=disk mode=0660 auto=yes


    # automatically tag new arrays as belonging to the local system

    HOMEHOST <system>

    # instruct the monitoring daemon where to send mail alerts

    MAILADDR XXX@XXX.com

    MAILFROM root


    # definitions of existing MD arrays

    ARRAY /dev/md0 metadata=1.2 name=ken:RAID5 UUID=1dddc47a:526443ca:c826044f:2ec26555

    ARRAY /dev/md1 metadata=1.2 name=ken:RAID5B UUID=401f4331:dab49eea:f7afc201:7f49fd53


    --------------------------------------


    $ mdadm --detail --scan --verbose


    ARRAY /dev/md1 level=raid5 num-devices=4 metadata=1.2 name=ken:RAID5B UUID=401f4331:dab49eea:f7afc201:7f49fd53

    devices=/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh

    INACTIVE-ARRAY /dev/md0 num-devices=3 metadata=1.2 name=ken:RAID5 UUID=1dddc47a:526443ca:c826044f:2ec26555

    devices=/dev/sda,/dev/sdc,/dev/sdd

  • KM0201

    Approved the thread.
  • The commands after the disk change.

    There's a lot of information in both posts :)


    Fortunately fdisk is seeing the new drive, which appears to be /dev/sdb, so;


    Storage -> Disks, select the new drive and click wipe on the menu, select short, wait until completed.


    Adding the new drive should be done from the GUI, but first the array has to become active, so;


    mdadm --stop /dev/md0


    mdadm --assemble --force --verbose /dev/md0 /dev/sd[adc]


    This should get the array running in a clean/degraded state. If the GUI shows it as rebuilding (it shouldn't), do nothing until the rebuild has completed.


    Then go to Raid Management, select the array and click Recover on the menu. A dialog box should display the new drive; select it and click OK. The array will now rebuild.
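
    Before forcing the assembly it can be worth confirming which array is inactive and how far apart the event counters of the surviving members are, since --force tells mdadm to assemble even when the metadata on some members looks out of date. What follows is a rough sketch, not an official OMV procedure. The helper names inactive_arrays and events_per_device are made up for illustration, and the device names are the ones from this thread.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of mdadm or OMV.

# Print the names of arrays that /proc/mdstat reports as inactive.
# Takes an optional file path so it can be tried on saved output.
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}

# Read `mdadm --examine` output on stdin and print "device events" pairs,
# one per member, so the counters can be compared at a glance.
events_per_device() {
    awk '
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        $1 == "Events" && $2 == ":" { print dev, $3 }
    '
}

# Example (run as root on the affected machine):
#   inactive_arrays
#   mdadm --examine /dev/sda /dev/sdc /dev/sdd | events_per_device
```

    If the counters are identical or very close, a forced assembly of the remaining members is usually low risk. A large gap is a reason to stop and ask before going further.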

    Raid is not a backup! Would you go skydiving without a parachute?

  • Hello geaves!


    Thanks for the quick response :D


    We followed the steps as you described and the RAID5 is now rebuilding, as you said it would. It's going to take about 12 hours, so tomorrow morning, once we have mounted it again and checked that everything is in place, we'll update you here.
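
    While a rebuild like this runs, its progress can also be read from /proc/mdstat on the command line. Here is a minimal sketch, assuming the usual "recovery = N%" progress line. The helper name rebuild_progress is made up for illustration.

```shell
#!/bin/sh
# Sketch only: rebuild_progress is a hypothetical helper name.
# Prints "<action> <percent>" for every array that is currently
# rebuilding or resyncing. Reads /proc/mdstat unless a file is given.
rebuild_progress() {
    awk '{
        for (i = 1; i + 2 <= NF; i++)
            if (($i == "recovery" || $i == "resync") && $(i + 1) == "=")
                print $i, $(i + 2)
    }' "${1:-/proc/mdstat}"
}

# Example: print the progress once a minute until interrupted.
#   while sleep 60; do rebuild_progress; done
```

    When nothing is rebuilding, the function prints nothing.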


    Thanks for your support.


    Best regards,


    Carlos

    A thought. Looking at the serial numbers in the image in post 1, it looks like the disks were bought in two batches: one purchase of identical disks for the first RAID and another purchase of identical disks for the second. If one of the disks has failed, chances are that the others purchased at the same time (same model, same batch, same hours of use) will start to fail one after another. I'd monitor the SMART values on those drives more often than usual.


    One way around this is to mix disks from different purchases within each RAID. Or, if you have a backup, swap some disks between the arrays if they are the same capacity. That way they are less likely to fail at the same time.
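
    For the monitoring part, something along these lines could be run from cron. It is only a sketch: it assumes smartmontools is installed and that the drives report the common ATA attribute names, and the helper name worrying_attributes is made up for illustration.

```shell
#!/bin/sh
# Sketch only: worrying_attributes is a hypothetical helper name.
# Reads `smartctl -A` output on stdin and prints the sector-related
# attributes whose raw value (10th column) is greater than zero.
worrying_attributes() {
    awk '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" {
            if ($10 + 0 > 0) print $2, $10
        }
    '
}

# Example (run as root; device names are the ones from this thread):
#   for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
#       echo "== $d"
#       smartctl -A "$d" | worrying_attributes
#   done
```

    A drive that prints nothing has zero in all three counters. Any output is a reason to look at that drive more closely.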

    The best thanks to the help provided is to report what your solution was. The next one will thank you :thumbup:

  • Hi to everyone!


    The whole process went well and we have recovered all our data and functionality.


    And "chente", thanks for the info. We didn't know this and will keep it in mind in the future ;D.


    Thank you so much for all your support and the quick response guys! You're awesome!! :-)


    Best regards,


    Carlos

  • carloscg

    Added the Label resolved
