Cannot recover 3 disk RAID5 array with two working disks

  • Hello,


    I had a 3-disk RAID5 array. A few weeks ago one of the disks failed, and the array was happily working in a degraded state with the two remaining disks.


    Today I received a replacement disk and powered off the system. On turning the system back on, the RAID array was missing from the UI. From the command line I can see the following:


    Bash
    $ cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : inactive sdc[1](S) sdb[0](S)
    7813772976 blocks super 1.2
    
    
    unused devices: <none>


    The RAID level is being reported as 0 when it should be 5.


    Both disks seem okay, but for some reason sdc is marked as a spare.



    I have tried to reassemble the array but have had no joy:

    Code
    $ sudo /sbin/mdadm /dev/md0 --assemble /dev/sd[bc]
    mdadm: /dev/sdb is busy - skipping
    mdadm: /dev/sdc is busy - skipping
    $ sudo /sbin/mdadm --stop /dev/md0
    mdadm: stopped /dev/md0
    $ sudo /sbin/mdadm /dev/md0 --assemble --force /dev/sd[bc]
    mdadm: /dev/md0 assembled from 1 drive and 1 spare - not enough to start the array.


    I am assuming the issue is either the array level being incorrectly reported as 0, or sdc being marked as a spare rather than active (a spare does not count towards the two working members a 3-disk RAID5 needs to start) - and I cannot find any way to resolve either.


    I'm not sure how best to proceed. I'm at the point of making a fresh array and recovering from backups, but since mdadm --examine /dev/sd[cb] seems to report that both disks are fine, I feel I should be able to recover the array in a degraded state and then add the replacement drive (sdd).
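
    For reference, the sort of --examine check I mean is roughly this (just pulling out the fields that matter rather than the full output; the field names are those printed for v1.2 metadata):

    Bash
    # Inspect the superblock on each member and pick out the key fields:
    # the reported level, each disk's role, the array state and the event counts.
    $ sudo /sbin/mdadm --examine /dev/sd[bc] | \
        grep -E 'Raid Level|Raid Devices|Device Role|Array State|Events'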


    Any guidance would be much appreciated, thank you!

  • I was able to bring back the array with the following commands:

    Bash
    $ sudo /sbin/mdadm --create --level=5 --layout=ls --chunk=512 --raid-devices=3 --assume-clean /dev/md0 /dev/sdb /dev/sdc missing
    mdadm: /dev/sdb appears to be part of a raid array:
           level=raid5 devices=3 ctime=Thu Sep 30 21:10:09 2021
    mdadm: /dev/sdc appears to be part of a raid array:
           level=raid5 devices=3 ctime=Thu Sep 30 21:10:09 2021
    Continue creating array? y
    mdadm: Defaulting to version 1.2 metadata
    mdadm: array /dev/md0 started.

    Data was also recovered. I got some direction from the following post; while its accepted answer did not work for me, the second answer gave details on how to create a new array while trying to keep the data, and the output from mdadm --examine /dev/sd[cb] gave me the details I needed for the --create command above (level, layout, chunk size and device order).
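
    A sanity check before writing anything to a re-created array would look roughly like this (a sketch, assuming the filesystem sits directly on /dev/md0 rather than on LVM on top of it):

    Bash
    # Confirm the array is up and reports as raid5 again.
    $ cat /proc/mdstat
    $ sudo /sbin/mdadm --detail /dev/md0
    # Check the filesystem read-only (-n makes no changes to it).
    $ sudo fsck -n /dev/md0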


    I then re-added the new drive sdd via the UI and started recovering the array from its degraded state.
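
    (The command-line equivalent of re-adding the disk would be something like the following; I did it via the UI, so treat this as a sketch.)

    Bash
    # Add the replacement disk as a new member and watch the rebuild progress.
    $ sudo /sbin/mdadm --add /dev/md0 /dev/sdd
    $ watch -n 5 cat /proc/mdstat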


    I think the issue was due to sdc being marked as a spare. I don't think my actions were the best, and there must be a less risky option, so I will leave this unresolved to see if anyone has better suggestions for next time.

    • Official Post

    I think the issue was due to sdc being marked as a spare

    Yes it was. This has come up on here before with no clear reason why, and I've had no real-world experience of it myself.


    I don't think my actions were the best

    You were very, very lucky; the --assume-clean switch is an absolute last resort, it either works or destroys your data.

    Raid is not a backup! Would you go skydiving without a parachute?


    OMV 6.x amd64 running on an HP N54L Microserver

  • geaves yes it was; the next step was to restore from backup, so I was happy to try.


    Are you aware of any way to change the drive from spare to active?


    I believe the trigger for the issue was a power disruption to this drive. When I added the new drive it was not detected, and I found that some loose power cables to both of these drives were causing the issue.

    • Official Post

    Are you aware of any way to change the drive from spare to active

    No, none that I have found.

    I believe the trigger for the issue was a power disruption to this drive.

    A lot of mdadm RAID issues appear to stem from power problems, either an actual power loss or a hardware power fault; something must trigger mdadm into marking a drive as a spare when in actual fact the drive was active before the power failure/disruption.


    What you didn't post, and what I forgot to ask for, was the mdadm conf file (cat /etc/mdadm/mdadm.conf); that file should contain information regarding the array's setup. Did the information within that file, under -> definitions of arrays, contain spares=1? If it did, simply editing that file (which is not recommended) might have corrected the error.
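
    For reference, the array definition in that file is a single ARRAY line, something along these lines (the name and UUID here are placeholders, not values from your system), and once the array is healthy again it can be regenerated with mdadm --detail --scan:

    Code
    # /etc/mdadm/mdadm.conf -> definitions of existing MD arrays
    # (name and UUID below are placeholders)
    ARRAY /dev/md0 metadata=1.2 spares=1 name=openmediavault:0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx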


    I've never experienced this in a real-world scenario; in commercial use we had a UPS + backups, and at home I would run backups/rsync every night, so in a worst-case situation I would restore from backup or even redeploy. OMV can be set up and running in a day; it's the restore process that takes the time.


    BTW you don't have to use /sbin to run mdadm commands :)

  • Quote

    Did the information within that file, under -> definitions of arrays, contain spares=1? If it did, simply editing that file (which is not recommended) might have corrected the error.

    I did check this file and remember it all seemed fine, though I cannot recall whether it noted there was a spare or not. Thanks for this pointer.


    Quote

    in commercial use we had a UPS + backups, and at home I would run backups/rsync every night, so in a worst-case situation I would restore from backup or even redeploy

    My only trouble is that, to save on backup costs, I did not back up my media (TV, film) collection, as it was not important. Backups are nightly and I run on a UPS; my mistake was poking around inside while the server was powered on! :whistling:


    Quote

    BTW you don't have to use /sbin to run mdadm commands :)

    For some reason I have to reference the executable directly on my machine; no idea why, and it is easier to do that than to find out why I can't call the program directly in my session ;)
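
    In case it is just a PATH issue, the sort of check I should probably do is something like this (a guess at the cause, I haven't dug into it yet):

    Bash
    # See whether the shell can find mdadm at all, and what PATH it searches.
    $ command -v mdadm || echo "mdadm not found in PATH"
    $ echo $PATH
    # On Debian-based systems /sbin and /usr/sbin are often not in a normal
    # user's PATH, while sudo uses its own secure_path from /etc/sudoers.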


    Thanks for the feedback; it's a shame mdadm has no way to update the device role.
