RAID 5 failure. All drives missing

  • Hi guys, I'm on OMV 4 and suddenly my RAID set disappeared.
    I'm not a Linux expert, just a happy OMV user for a few years.


    There is a disk with a red flag on SMART. In the Disks section I can see all the drives, but the RAID is not recognised. In the File Systems section I can see the name of my RAID set, but it is not mounted (and I cannot mount it).
    Can you please advise where to start to try and recover the data?


    Thanks, M ;(

  • here we go:


    1. root@omv2:~# cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md127 : inactive sdb[3](S) sdc[0](S) sdd[1](S) sde[2](S)
    7813534048 blocks super 1.2


    unused devices: <none>


    2. root@omv2:~# blkid
    /dev/sda1: UUID="E179-5A61" TYPE="vfat" PARTUUID="6ffe10e0-0162-47d6-aebb-675329a11e86"
    /dev/sda2: UUID="6b814003-02eb-4ff4-8908-144e2ce1cf0e" TYPE="ext4" PARTUUID="54c1a535-e448-412c-8d36-79fc9b9decb0"
    /dev/sda3: UUID="30a77b0b-9afb-45ce-9499-0240f1fd7b0d" TYPE="swap" PARTUUID="9cca0d43-23f9-4a77-9868-3f1ed55590b7"
    /dev/sdc: UUID="e9d64950-32ce-bf3e-b3bd-b9a116e1feeb" UUID_SUB="692cc133-bbf1-fdf5-6f64-b94b90d57589" LABEL="omv2:raid5" TYPE="linux_raid_member"
    /dev/sdb: UUID="e9d64950-32ce-bf3e-b3bd-b9a116e1feeb" UUID_SUB="2824a33f-68bd-50d5-9551-b1136ad0b57b" LABEL="omv2:raid5" TYPE="linux_raid_member"
    /dev/sdd: UUID="e9d64950-32ce-bf3e-b3bd-b9a116e1feeb" UUID_SUB="8e294b42-ceca-0e72-b916-4b0178f5badf" LABEL="omv2:raid5" TYPE="linux_raid_member"
    /dev/sde: UUID="e9d64950-32ce-bf3e-b3bd-b9a116e1feeb" UUID_SUB="476e9ffa-8561-14c1-3709-b708bf9c1129" LABEL="omv2:raid5" TYPE="linux_raid_member"



    3. root@omv2:~# fdisk -l | grep "Disk "
    Disk /dev/sda: 111.8 GiB, 120040980480 bytes, 234455040 sectors
    Disk identifier: 8F69B466-3E4D-4DEA-A5A5-00FB29BEA2DD
    Disk /dev/sdc: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
    Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
    Disk /dev/sdd: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
    Disk /dev/sde: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors



    4. root@omv2:~# cat /etc/mdadm/mdadm.conf
    # mdadm.conf
    #
    # Please refer to mdadm.conf(5) for information about this file.
    #


    # by default, scan all partitions (/proc/partitions) for MD superblocks.
    # alternatively, specify devices to scan, using wildcards if desired.
    # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.
    # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is
    # used if no RAID devices are configured.
    DEVICE partitions


    # auto-create devices with Debian standard permissions
    CREATE owner=root group=disk mode=0660 auto=yes


    # automatically tag new arrays as belonging to the local system
    HOMEHOST <system>


    # definitions of existing MD arrays
    ARRAY /dev/md127 metadata=1.2 name=omv2:raid5 UUID=e9d64950:32cebf3e:b3bdb9a1:16e1feeb



    5. root@omv2:~# mdadm --detail --scan --verbose
    INACTIVE-ARRAY /dev/md127 num-devices=4 metadata=1.2 name=omv2:raid5 UUID=e9d64950:32cebf3e:b3bdb9a1:16e1feeb
    devices=/dev/sdb,/dev/sdc,/dev/sdd,/dev/sde




    OMV is on a 120 GB SSD.
    The RAID 5 set is built on 4 Seagate drives of 2 TB each.


    The system was stuck and I didn't know how to fix it, so I reinstalled the whole system.
    After that the RAID was still working.
    Today I attached a USB drive to back up the data via rsync, and during the copy the system stopped working (no files copied, the NAS was beeping every second).
    I switched it off with the power button, and when I turned it on again the RAID was not recognised anymore.
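
Editor's note: the `inactive` state and the member list in the `/proc/mdstat` output of this post can be picked out with a few lines of shell. This is only a minimal sketch; the function name `md_inactive` is made up for illustration, and the file to read is a parameter so that it can be tried on a saved copy of the output.

```shell
#!/bin/sh
# List md arrays that the kernel reports as inactive.
# Usage: md_inactive [FILE]   (FILE defaults to /proc/mdstat)
# `md_inactive` is a hypothetical helper, not part of mdadm or OMV.
md_inactive() {
    awk '$1 ~ /^md/ && $3 == "inactive" {
        # Fields: name, ":", state, then one field per listed member
        printf "%s inactive, %d member(s) listed\n", $1, NF - 3
    }' "${1:-/proc/mdstat}"
}
```

Calling `md_inactive` with no argument reads the live `/proc/mdstat`. A `(S)` suffix after a member, as seen above, means md is currently treating that device as a spare, which is what an array that has not been started typically looks like.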

  • the NAS was beeping every second).

    Do you know what could cause the NAS to beep? Is there maybe a hardware problem besides the RAID, e.g. the power supply? Maybe this was the reason the NAS got stuck in the first place.


    A RAID does not like losing power unexpectedly. But your RAID is INACTIVE, not DEGRADED. That's good.


    Maybe @geaves can chime in?


    In the meantime you may have a look at this article: How to fix linux mdadm inactive array

    OMV 3.0.99 (Gray style)
    ASRock Rack C2550D4I C0-stepping - 16GB ECC - 6x WD RED 3TB (ZFS 2x3 Striped RaidZ1)- Fractal Design Node 304

  • I tried with:
    root@omv2:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[bcde]
    mdadm: looking for devices for /dev/md127
    mdadm: No super block found on /dev/sde (Expected magic a92b4efc, got 00000000)
    mdadm: no RAID superblock on /dev/sde
    mdadm: /dev/sde has no superblock - assembly aborted
    root@omv2:~#


    I'm starting to get worried about my data... :-(
    What shall I do?
    Unfortunately I'm not an expert in Linux or the shell...
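
Editor's note: before forcing an assembly it usually helps to look at what each member disk's superblock says, in particular the event counter. The sketch below assumes the field labels that `mdadm --examine` prints for 1.2 metadata (`Events`, `Update Time`, `Device Role`, `Array State`); `examine_summary` is a hypothetical helper name, and it only filters text on stdin.

```shell
#!/bin/sh
# Reduce `mdadm --examine` output to the fields that matter when judging
# whether members are in step with each other. Reads from stdin.
# `examine_summary` is a hypothetical helper name.
examine_summary() {
    grep -E '^[[:space:]]*(Events|Update Time|Device Role|Array State)[[:space:]]*:'
}

# Intended use (as root, device names as in this thread):
#   for d in /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
#       echo "== $d"
#       mdadm --examine "$d" | examine_summary
#   done
```

A member whose event count lags behind the others was dropped from the array earlier; a member that prints no superblock at all, like `/dev/sde` above, cannot take part in the assembly.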

  • I then switched off and on again. I needed to disconnect the faulty disk otherwise the system didn't start. At the second trial this is the result:


    mdadm: stopped /dev/md127
    root@omv2:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[bcd]
    mdadm: looking for devices for /dev/md127
    mdadm: /dev/sdb is identified as a member of /dev/md127, slot 3.
    mdadm: /dev/sdc is identified as a member of /dev/md127, slot 0.
    mdadm: /dev/sdd is identified as a member of /dev/md127, slot 1.
    mdadm: forcing event count in /dev/sdc(0) from 5494 upto 5851
    mdadm: added /dev/sdd to /dev/md127 as 1
    mdadm: no uptodate device for slot 2 of /dev/md127
    mdadm: added /dev/sdb to /dev/md127 as 3
    mdadm: added /dev/sdc to /dev/md127 as 0
    mdadm: /dev/md127 assembled from 3 drives - not enough to start the array.
    root@omv2:~#



    root@omv2:~# mdadm --examine /dev/sd*3
    mdadm: No md superblock detected on /dev/sda3.
    root@omv2:~#

  • It does, but when I start copying the data out the system gets stuck and the raid disappears

    Did I say anything about copying the data off? One of the four drives is missing its superblock; if that's the failing one then it needs replacing, and there is a procedure for doing that.


    I then switched off and on again. I needed to disconnect the faulty disk otherwise the system didn't start. At the second trial this is the

    Then you have probably lost your data!!!!


    Output of mdadm --detail /dev/md127

  • Thank you very much for your help.


    The disk has been reconnected and after a while the system actually started.
    Now I mounted the 3 disks as you suggested.


    The output of the last command is:
    root@omv2:~# mdadm --detail /dev/md127
    /dev/md127:
    Version : 1.2
    Creation Time : Tue Feb 4 09:31:04 2020
    Raid Level : raid5
    Array Size : 5860150272 (5588.67 GiB 6000.79 GB)
    Used Dev Size : 1953383424 (1862.89 GiB 2000.26 GB)
    Raid Devices : 4
    Total Devices : 3
    Persistence : Superblock is persistent


    Intent Bitmap : Internal


    Update Time : Sun Feb 16 06:47:34 2020
    State : clean, degraded
    Active Devices : 3
    Working Devices : 3
    Failed Devices : 0
    Spare Devices : 0


    Layout : left-symmetric
    Chunk Size : 512K


    Name : omv2:raid5 (local to host omv2)
    UUID : e9d64950:32cebf3e:b3bdb9a1:16e1feeb
    Events : 5853


    Number   Major   Minor   RaidDevice State
       0       8       32        0      active sync   /dev/sdc
       1       8       48        1      active sync   /dev/sdd
       -       0        0        2      removed
       3       8       16        3      active sync   /dev/sdb
    root@omv2:~#
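
Editor's note: the `Array Size` reported above is consistent with RAID 5 arithmetic, where one member's worth of space goes to parity. A small check using the two figures from the output (both in KiB):

```shell
#!/bin/sh
# RAID 5 usable capacity = (number of raid devices - 1) * per-member size.
# Both inputs are copied from the `mdadm --detail` output above (KiB).
raid_devices=4
used_dev_size_kib=1953383424
array_size_kib=$(( (raid_devices - 1) * used_dev_size_kib ))
echo "$array_size_kib"
```

The result matches the `Array Size` line, so the degraded array still exposes its full capacity; with one member missing it simply has no redundancy left.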

  • Hi, now the 4 drives are all physically connected (only 3 mounted as raid) but one of them has the red flag on SMART. I have a similar one that I can use to replace it.
    I'm not sure if it is relevant, but the faulty one is connected to a PCI SATA controller, while the other drives are connected directly to the motherboard. If I need to add one more drive, it has to go on the PCI controller though.

  • Hi, now the 4 drives are all physically connected (only 3 mounted as raid) but one of them has the red flag on SMART

    If you're going to give me information it has to be detailed and contain references to the specific drives. In post 6 the raid could not be assembled with 4 drives due to a missing superblock on /dev/sde; this is confirmed by the output from post 11.


    So what's the output of lsblk and the output of mdadm --detail /dev/sde?

  • here we go:


    1. root@omv2:~# lsblk
    NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
    sda         8:0    0 111.8G  0 disk
    |-sda1      8:1    0   512M  0 part  /boot/efi
    |-sda2      8:2    0 109.6G  0 part  /
    `-sda3      8:3    0   1.7G  0 part  [SWAP]
    sdb         8:16   0   1.8T  0 disk
    `-md127     9:127  0   5.5T  0 raid5
    sdc         8:32   0   1.8T  0 disk
    `-md127     9:127  0   5.5T  0 raid5
    sdd         8:48   0   1.8T  0 disk
    `-md127     9:127  0   5.5T  0 raid5
    sde         8:64   0   1.8T  0 disk


    2. root@omv2:~# mdadm --detail /dev/sde
    mdadm: /dev/sde does not appear to be an md device
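
Editor's note: `mdadm --detail` expects an assembled array such as `/dev/md127`, so the message above is what it prints for any plain disk, whether or not that disk carries a RAID superblock. The query for a member disk is `mdadm --examine`. A minimal sketch of the distinction follows; `md_query` is a hypothetical helper that only prints the command it would choose.

```shell
#!/bin/sh
# Print the mdadm query that fits a device: --detail for an assembled array
# (/dev/mdN), --examine for a member disk or partition.
# `md_query` is a hypothetical helper; it prints the command, it does not run it.
md_query() {
    case "$1" in
        /dev/md*) echo "mdadm --detail $1" ;;
        *)        echo "mdadm --examine $1" ;;
    esac
}

md_query /dev/md127
md_query /dev/sde
```

Here the earlier forced assembly had already reported that `/dev/sde` has no superblock, so `--examine` would be expected to confirm that rather than show member details.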

  • mdadm: /dev/sde does not appear to be an md device

    That's interesting; it seems that because you removed the drive and rebooted, it has removed itself from the array ?(


    OK. In Storage -> Disks select /dev/sde, click Wipe, and in the dialog click Short; this will wipe the drive. Once that has completed, go to Raid Management, click on the raid and select Recover from the menu. A dialog will come up which should show /dev/sde; select it and click OK. The array should now sync/rebuild with the added disk.


    If the above works, come back when it's finished. Do not use/access the array whilst the sync is running.
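
Editor's note: for readers without the web interface, the same two steps can be approximated on the command line. This is a sketch under assumptions: that a quick wipe amounts to clearing old signatures with `wipefs`, and that Recover amounts to adding the disk with `mdadm --manage --add`; neither is taken from OMV's source. The device names are the ones in this thread and can change between boots, so check them with `lsblk` first. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Command-line sketch of "wipe the disk, then recover the array".
# ARRAY and NEW_DISK mirror the device names in this thread; wiping the
# wrong disk would destroy the array, so verify them before a real run.
ARRAY=/dev/md127
NEW_DISK=/dev/sde
DRY_RUN=${DRY_RUN:-1}          # default: only print the commands

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

readd_disk() {
    run wipefs --all "$NEW_DISK"                   # clear old signatures
    run mdadm --manage "$ARRAY" --add "$NEW_DISK"  # add disk; md starts the rebuild
    run cat /proc/mdstat                           # rebuild progress shows up here
}

readd_disk
```

With `DRY_RUN=0` the commands need root. The rebuild progress can then be followed by re-reading `/proc/mdstat` until the recovery line disappears.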

  • Finished.



    The SMART is now Green for all disks



    Raid Management says that the raid is Clean and I can't see any warning, but when I try to mount the RAID5 in File Systems there is an error: "Failed to execute command 'export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin; export LANG=C.UTF-8; mount -v --source '/dev/disk/by-label/RAID5' 2>&1' with exit code '32': mount: mount /dev/md127 on /srv/dev-disk-by-label-RAID5 failed: Structure needs cleaning".




    Thanks, M
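
Editor's note: "Structure needs cleaning" is the kernel's way of saying that the filesystem on the array is reporting corruption, so the array itself can be clean while the filesystem on top of it is not. The thread does not say which filesystem is on `/dev/md127`, so the sketch below first asks for the type and then suggests a check that does not modify anything. `check_cmd` is a hypothetical helper that only prints the suggested command.

```shell
#!/bin/sh
# Suggest a read-only filesystem check based on the filesystem type.
# `check_cmd` is a hypothetical helper; it prints a command, it does not run it.
check_cmd() {
    fstype=$1
    dev=$2
    case "$fstype" in
        xfs)            echo "xfs_repair -n $dev" ;;  # -n: no-modify mode
        ext2|ext3|ext4) echo "e2fsck -n $dev" ;;      # -n: read-only, answer "no"
        *)              echo "unknown filesystem '$fstype' on $dev" ;;
    esac
}

# Intended use (as root, array assembled but not mounted):
#   check_cmd "$(blkid -o value -s TYPE /dev/md127)" /dev/md127
```

Both suggested commands only report problems. Whether to run an actual repair afterwards is a separate decision, ideally made after the data has been imaged or backed up.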
