Degraded Array in raid1 after shutdown

  • Hello,


    I have this config in my NAS:


    • OMV 5.6.22 on 32GB USB stick
    • 2 SSD of ~ 230GB in RAID0
    • One RAID1 array with a 3rd SSD of ~ 470GB and the RAID0 array


    There was a power loss, but the UPS kept the NAS running. I triggered the shutdown with the power button, and heard the beep confirming that the soft shutdown had been triggered.

    Once it had started again, I received an email with the subject "DegradedArray event on /dev/md0:grange"




    /dev/md0 is still mounted and readable

    It seems I didn't lose any data, though I'm using only ~ 40GB.


    IIUC, from mdstat:

    • md127 is missing from md0
    • sdb is "disabled" from md127


    I don't know what to do to restore a clean state.



    SMART tests look OK:





    blkid

    Code
    # blkid
    /dev/sr0: UUID="2007-09-30-21-03-00-0" LABEL="Photos_2006_2007" TYPE="iso9660"
    /dev/sdc: UUID="bb6dc4fa-6340-95e9-e456-8765c5bcf9ab" UUID_SUB="57e1c067-8015-5683-724f-8dc116859fcf" LABEL="grange:Deux230" TYPE="linux_raid_member"
    /dev/sdb: UUID="bb6dc4fa-6340-95e9-e456-8765c5bcf9ab" UUID_SUB="52c5b38d-1598-419c-e5d7-84dcdc2e5dd9" LABEL="grange:Deux230" TYPE="linux_raid_member"
    /dev/md127: UUID="b55fa23a-352e-6aa8-d591-105992535c4a" UUID_SUB="4df40837-4ae0-4b0d-d7d6-b363bb2554aa" LABEL="grange:0" TYPE="linux_raid_member"
    /dev/sda: UUID="b55fa23a-352e-6aa8-d591-105992535c4a" UUID_SUB="a647c19e-4599-f2d1-9c89-695f0addfdd6" LABEL="grange:0" TYPE="linux_raid_member"
    /dev/md0: LABEL="data" UUID="b207ff0e-7941-4359-89a4-3415d0928de3" TYPE="ext4"
    /dev/sdd1: UUID="5a787299-6e09-4939-9b4a-7765bcd5c689" TYPE="ext4" PARTUUID="a2b1c244-01"
    /dev/sdd5: UUID="7eae7069-fbbc-4a85-a085-a530abcdada9" TYPE="swap" PARTUUID="a2b1c244-05"


    fdisk -l | grep "Disk "



    /proc/mdstat

    Code
    # cat /proc/mdstat
    Personalities : [raid0] [raid1] [linear] [multipath] [raid6] [raid5] [raid4] [raid10] 
    md0 : active raid1 sda[1]
          488254464 blocks super 1.2 [2/1] [_U]
          bitmap: 2/4 pages [8KB], 65536KB chunk
    
    md127 : active raid0 sdc[1] sdb[0]
          493992960 blocks super 1.2 512k chunks


    /etc/mdadm/mdadm.conf


    mdadm --detail --scan --verbose

    Code
    # mdadm --detail --scan --verbose
    ARRAY /dev/md/grange:Deux230 level=raid0 num-devices=2 metadata=1.2 name=grange:Deux230 UUID=bb6dc4fa:634095e9:e4568765:c5bcf9ab
       devices=/dev/sdb,/dev/sdc
    ARRAY /dev/md0 level=raid1 num-devices=2 metadata=1.2 name=grange:0 UUID=b55fa23a:352e6aa8:d5911059:92535c4a
       devices=/dev/sda
    • Official Post

    I'm about to shut down, but looking through the output I can't see anything wrong. Added to that, I've never seen anyone with this setup before, so I'm going to have to test this in a VM tomorrow.

    md127 is missing from md0

    It would be: if one of the drives in that RAID 0 died or went offline, the RAID 0 would be toast, so you would have to recreate it and then re-add it to the RAID 1 (md0). But the output, as I said, does not suggest that.

    Raid is not a backup! Would you go skydiving without a parachute?


    OMV 6x amd64 running on an HP N54L Microserver

  • Hello geaves

    , but the output, as I said does not suggest that.



    Did you manage to test it as you wished?

    Coming back from holidays, I booted it up, and the issue arose again.



    and examining both volumes led to:



    md127 does not know that it is no longer active in the RAID1.

    mdadm --stop and --assemble seem dangerous and overkill to me.
    Maybe I can try --re-add?


    Backup is done, ready to test :)
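A minimal sketch of that attempt, assuming a backup exists and the commands run as root: try `--re-add` first and fall back to `--add` if it is refused. The `readd_member` function and the `MDADM` override variable are hypothetical names used only for this illustration; setting `MDADM=echo` prints the commands instead of running them.

```shell
#!/bin/sh
# MDADM can be overridden (e.g. MDADM=echo) to preview the commands without
# touching any array. This override is an assumption of the sketch.
MDADM="${MDADM:-mdadm}"

# readd_member ARRAY MEMBER (hypothetical helper)
# Try --re-add first; if mdadm refuses, fall back to a plain --add.
readd_member() {
    array=$1
    member=$2
    if "$MDADM" "$array" --re-add "$member"; then
        echo "re-add accepted for $member"
    else
        echo "re-add refused, falling back to --add" >&2
        "$MDADM" "$array" --add "$member"
    fi
}

# Usage (as root, after a backup):
#   readd_member /dev/md0 /dev/md127
```

Because md0 has a write-intent bitmap (the `bitmap:` line in /proc/mdstat), a successful re-add should only need to resync the blocks that changed while md127 was missing, rather than the whole array.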

    • Official Post

    Did you manage to test it as you wish

    I didn't get the opportunity. As I said after your first post, each cat /proc/mdstat output shows both arrays as active, but the --detail output for md0 in the post above shows 2 RAID devices with a total of only 1. That means md0 still cannot locate md127.
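The comparison described here can be sketched as a small filter over `mdadm --detail` output. It assumes the usual `Raid Devices : N` and `Total Devices : N` lines are present; the function name `missing_members` is hypothetical.

```shell
#!/bin/sh
# missing_members (hypothetical helper): read `mdadm --detail` output on stdin
# and print Raid Devices minus Total Devices, i.e. how many slots are unfilled.
missing_members() {
    awk -F' *: *' '
        # Strip the leading padding mdadm uses to right-align the field names.
        { key = $1; sub(/^[ \t]+/, "", key) }
        key == "Raid Devices"  { raid = $2 }
        key == "Total Devices" { total = $2 }
        END { print raid - total }
    '
}

# Usage (as root):
#   mdadm --detail /dev/md0 | missing_members
```

A result above zero suggests that the array is running without one of its members, which matches the `[2/1]` shown for md0 in /proc/mdstat.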


    TBH, this is the first time I have ever come across this, and to create it you would have had to do it from the CLI. The setup is just weird! Using mergerfs along with a backup would have been a better option.


    But to emphasise what I said previously: if a single drive within a RAID 0 fails, is disabled, or loses its connection, the array is toast. You can't bring it back; the only way forward is to recreate it and then re-add it to the RAID 1. You can try --re-add if you want, or --assemble; if you've backed up your data, you've nothing to lose.

  • That was it.


    Code
    # mdadm /dev/md0 --add /dev/md127
    mdadm: re-added /dev/md127

    That looked good



    And now State is clean.


    Back to business, now :)


    geaves: your insights helped, thank you!

  • mat-m

    Added the Label resolved
