Degraded Array (RAID 10)

  • Hi All:


    Newbie to OMV. I set up an OMV server for a friend about 4 months ago.


    - 4 x 2 TB drives (brand new IronWolfs)

    - RAID 10


    OMV started reporting "A DegradedArray event had been detected on md device /dev/md0" last month, md0 being the array, of course.


    All the disks are visible in Storage/Disks. All of the disks are reporting green in Storage/Smart/Devices. I ran a SMART test and did not see any noticeable errors.


    Storage/RAID Management shows the array as "clean, degraded" and only shows three of the four disks, with /dev/sde missing.


    When I select the array and hit "Recover", /dev/sde is not offered. /dev/sde is available if I attempt to add a file system.


    What can I do to get this drive back into the array? I've read through the forums, and checking the SATA cable was suggested as part of troubleshooting.

    I also read about some of the commands to run, but where can I read up on which ones to use and when?

    Any help is appreciated!

  • Has the server been shut down by a power loss and then just switched back on? Have you tried rebooting the server? Have you looked at the output from each of these, along with the output from mdadm --detail /dev/md?, where the ? is the raid reference (e.g. 0, 127, etc.)
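    For anyone reading along, here is a minimal sketch of how a degraded array can be spotted in /proc/mdstat, which is one common place to look besides mdadm --detail. It assumes the usual "[expected/active] [UUU_]" status format; the sample text is illustrative only and is not output from the server in this thread, and the function name is made up for the example.

```python
import re

# Illustrative sample, NOT output from the server discussed in this thread.
SAMPLE_MDSTAT = """\
Personalities : [raid10]
md0 : active raid10 sdb[0] sdc[1] sdd[2]
      3906764800 blocks super 1.2 512K chunks 2 near-copies [4/3] [UUU_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: (expected, active)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line such as "md0 : active raid10 ..."
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[expected/active] [UUU_]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                result[current] = (expected, active)
    return result


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))  # prints {'md0': (4, 3)}
```

    On a real system you would pass in the contents of /proc/mdstat instead of the sample. An underscore in the [UUU_] field marks the slot of the missing member.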

  • Thanks geaves - Answers:


    1) There's a possibility the server was shut down from power loss but I have it on a small UPS so it should have shut down "gracefully" - NUT is installed and I have tested it.

    2) I have rebooted it since - no joy

    3) I have NOT looked at the items in the linked post - I will have to get over to the server and run those including the mdadm output.

    Will submit details soon.


    When I get hands-on with the server, does it make sense to swap out the SATA cable or move it to another SATA port? Seems like a long shot to me since the drive can be seen and assigned to a filesystem.


    Much appreciated.

  • I'm somewhat confused by your post above. Re-reading your first post, I assume that the server is elsewhere, but do you have a remote connection to it? If you do, you can run those commands from where you are.


    To give you guidance:

    When I select the array and hit "recover", /dev/sde is not visible. /dev/sde is available if I attempt to add a file system

    That's because the drive already has a raid signature on it; to run Recover you would need to wipe the drive first.

    I have rebooted it since - no joy

    Well, that was worth a try, but there has to be a reason why the drive was displaced, and power loss is usually the cause.

    does it make sense to swap out the SATA cable or move it to another SATA port? Seems like a long shot to me since the drive can be seen and assigned to a filesystem.

    At present this would not be necessary; the output with the relevant information would help with a way forward.


    As a footnote, running a raid in a degraded state is not a good idea, as it places further stress on the remaining drives. If the drive that /dev/sde is mirrored to fails, then the whole Raid 10 is toast.

    If it's a true Raid 10 you can afford to lose one drive in each mirror, but if you lose one whole mirror (2 drives) you lose the Raid 10, and you're going to tell me you don't have a backup :)
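    To make the "one drive per mirror" point concrete, here is a small sketch that enumerates every two-drive failure in a 4-drive Raid 10. The mirror pairing below is hypothetical, since the thread never says which drive /dev/sde is mirrored to, and the function names are made up for the example.

```python
from itertools import combinations

# Hypothetical pairing: the thread never says which drive /dev/sde mirrors.
MIRRORS = [("sdb", "sdc"), ("sdd", "sde")]


def array_survives(failed, mirrors=MIRRORS):
    """The stripe survives while every mirror keeps at least one working drive."""
    failed = set(failed)
    return all(any(drive not in failed for drive in pair) for pair in mirrors)


def two_drive_outcomes(mirrors=MIRRORS):
    """Map each possible pair of failed drives to True (survives) or False (lost)."""
    drives = [drive for pair in mirrors for drive in pair]
    return {combo: array_survives(combo, mirrors) for combo in combinations(drives, 2)}


if __name__ == "__main__":
    for combo, ok in two_drive_outcomes().items():
        print(combo, "survives" if ok else "array lost")
```

    With this pairing, four of the six possible two-drive failures are survivable, and the two that are not are exactly the ones that take out both halves of the same mirror. While the array is already degraded, the surviving partner of the missing drive is the single drive whose failure would lose the array.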

  • Hi geaves:

    Was able to run all the commands.


    I do not have remote access to the drive.


    Here is the output... I do not see anything out of the ordinary other than /dev/sde not being part of the array.


    As I said earlier, /dev/sde appears to be able to be added to other filesystems so the system sees it. But it is not recognized when I try and recover the array.


    All storage disks are Seagate IronWolf NAS drives, 2 TB each.


    I believe the array stopped working after a power outage, and it's my strong belief this caused the drive to drop out. I do have the server on a UPS, but apparently the user failed to tell me the battery had failed. I'm ordering a new AGM battery tonight. I even have NUT running.


    I am aware of the details of RAID 10 and yes, I have a backup :)


    Can I assume I'll just need to wipe the orphaned drive, and then it will be available to re-add to the array?


    Thanks!




  • Yes, that's all you should need to do, use Recover on the raid menu to add the drive back.
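    For readers who prefer a shell over the web UI, here is a dry-run sketch that only prints a rough command-line equivalent of "wipe, then recover". The commands are my assumption of an equivalent, not what OMV runs internally, and the default device names simply mirror the ones in this thread. wipefs --all is destructive, so double-check the disk name before running anything like this by hand.

```python
import shlex


def readd_commands(array="/dev/md0", disk="/dev/sde"):
    """Build (but do not run) commands to wipe a dropped disk and re-add it.

    Assumed equivalent of the Wipe and Recover buttons, not OMV's own code.
    """
    return [
        ["wipefs", "--all", disk],                   # clear old raid/filesystem signatures
        ["mdadm", "--manage", array, "--add", disk],  # add the disk so the array rebuilds
        ["cat", "/proc/mdstat"],                     # check rebuild progress
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in readd_commands():
        print(shlex.join(cmd))
```

    Nothing is executed here on purpose; the point is only to show the order of the steps.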

  • Just a quick update: wiped and recovered, but the first attempt failed. The drive fell back out of the array.


    Tried a secure wipe and it started dumping DMA errors, etc.

    Shut down and moved the drive to a new SATA port (the last one available).


    After reconfiguring the boot drive (!!), booted back up, wiped the drive and recovered in RAID.


    Sitting at 30%! Booya.


    I appreciate the guidance - had I known to simply wipe the drive and re-add it, I would not have looked like such a noob.


    Still, much appreciated.

    Bill.

  • If this is not a drive issue then it's related to m'board.

    Agreed. The MoBo has 6 SATA ports and with the one failure, they're now all used. Should more fail, I'll either have to get a SATA card or swap out the MoBo.

    ---------


    On another note, after the array recovered, I started receiving a "SparesMissing event".


    I found the fix in the following thread - Remove the "SparesMissing event" - and will make the change when I get into the machine again.


    I take it this remains a bug with a fix still forthcoming?

  • On another note, after the array recovered, I started receiving a "SparesMissing event".

    Interesting, never seen that before.

    I take it this remains a bug with a fix still forthcoming

    This is not related to OMV; it has something to do with mdadm. For whatever reason, it believes there is a spare available within the array.
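    As background, the mdadm man page describes SparesMissing as being raised when the config file says an array should have a certain number of spares and mdadm finds fewer. Below is a minimal sketch for spotting such an entry, assuming the expectation comes from a spares= word on an ARRAY line in mdadm.conf. The sample line is illustrative with a placeholder name and UUID, not copied from this server, and the function name is made up.

```python
import re

# Illustrative only: placeholder name and UUID, not from the server in this thread.
SAMPLE_CONF = """\
# mdadm.conf excerpt
ARRAY /dev/md0 metadata=1.2 spares=1 name=example:0 UUID=00000000:00000000:00000000:00000000
"""


def arrays_expecting_spares(conf_text):
    """Return {array_device: spare_count} for ARRAY lines with spares=N, N > 0."""
    found = {}
    for line in conf_text.splitlines():
        fields = line.split()
        # Skip comments, blank lines and anything that is not an ARRAY line
        if len(fields) < 2 or fields[0].upper() != "ARRAY":
            continue
        for field in fields[2:]:
            match = re.fullmatch(r"spares=(\d+)", field)
            if match and int(match.group(1)) > 0:
                found[fields[1]] = int(match.group(1))
    return found


if __name__ == "__main__":
    print(arrays_expecting_spares(SAMPLE_CONF))  # prints {'/dev/md0': 1}
```

    If an array shows up here but was never meant to have a spare, that stale expectation would be consistent with the event described above.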
