RAID gone out of nowhere

  • Hi guys,


    I was using my NAS today to download a few Steam games when all of a sudden Steam reported disk errors. I found that all network drives were gone, so I checked OMV via the console and my RAID was gone. I didn't do any update or change at all and can't figure out the reason.


    The file system section is showing this:



    I have no idea why my whole RAID is gone out of the blue. Can anybody help me recover it?


    Thanks

    Courtyard

  • chente


    Here are my values:


    • Intel i5-35705 @3.10 GHz
    • 8 GB RAM
    • HDD Rack with 4/5 slots used
    • 4 HDD drives @5,5TB


    I have seen the thread with the commands. I am not familiar with PuTTY; is there a way to execute CLI commands via the GUI? I remember there being a plugin for that.


    OK, I tried with a screenshot:


  • chente geaves Thank you for helping out, I managed to access the NAS via PuTTY. Here are the results:


    • Official post

    geaves will help you with this when he reads it. His last name is Dr_Raid. :)

    • Official post

    mdadm --stop /dev/md127


    mdadm --assemble --force --verbose /dev/md127 /dev/sd[bcde]


    Please post any errors in full, but it should be OK.


    That should reassemble the array. You may have to mount it in File Systems once the rebuild has completed, or just reboot, but it may come back up without intervention.
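
Before forcing a reassembly, it can help to record which arrays the kernel currently lists as inactive. This is only a minimal sketch, assuming the usual /proc/mdstat layout (a line such as "md127 : inactive sdb[0](S) ..."); the function name inactive_arrays is made up for illustration and is not part of mdadm or OMV.

```shell
# Hypothetical helper: list md arrays that the kernel reports as inactive.
# Reads /proc/mdstat by default; pass another file to inspect a saved copy.
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}

# Example (run as root on the NAS; md127 is the array name used in this thread):
#   inactive_arrays
#   mdadm --detail /dev/md127
```

Keeping a copy of this output together with the mdadm --detail result makes it easier to compare the state before and after the forced assemble.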

  • geaves Here is the output:


    • Official post

    Looking at that output, it's not recoverable. Even though it's marked as clean, it's only clean because 2 drives are registered as part of the array and 2 as spares, and I'm guessing you don't have a backup.


    Run mdadm --examine /dev/sd? on each drive, replacing the ? with the drive's reference, e.g. /dev/sdb. Post each result in a separate code box; I need to see the event count.
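
To compare event counts across drives at a glance, the --examine output can be condensed to one line per device. This is a rough sketch, assuming the output contains "Events :" and "Device Role :" lines as version 1.x superblocks print them; summarize_examine is a hypothetical helper name, and the drive letters in the example are the ones from this thread.

```shell
# Hypothetical helper: condense `mdadm --examine` output (read from stdin)
# into "device<TAB>events<TAB>role". $1 is only a label for the output line.
summarize_examine() {
    awk -v dev="$1" '
        /^[ \t]*Events[ \t]*:/      { sub(/^[^:]*:[ \t]*/, ""); events = $0 }
        /^[ \t]*Device Role[ \t]*:/ { sub(/^[^:]*:[ \t]*/, ""); role = $0 }
        END {
            if (events == "") events = "none"
            if (role == "")   role = "unknown"
            printf "%s\t%s\t%s\n", dev, events, role
        }
    '
}

# Example (run as root; adjust the drive letters to your system):
#   for d in /dev/sd[bcde]; do mdadm --examine "$d" | summarize_examine "$d"; done
```

A drive without an md superblock shows up as "none" and "unknown", which matches the "No md superblock detected" case. The full output is still what should be posted in the thread.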

  • geaves : I have a backup. It's not very recent, but the most important data is safe, so the loss would not be dramatic. Actually it's more of an organizational matter, since I have downloaded many indie games, a bit of music and so on. I'll lose a few photos; well, I can't change that.


    But I don't get it... Why?


    Anyway, I'm going to post the data...

  • geaves : Here are the results:


    Code
    root@nas13:~# mdadm --examine /dev/sda1
    /dev/sda1:
       MBR Magic : aa55
    Code
    root@nas13:~# mdadm --examine /dev/sda2
    mdadm: No md superblock detected on /dev/sda2.
    Code
    root@nas13:~# mdadm --examine /dev/sda3
    mdadm: No md superblock detected on /dev/sda3.
    Code
    root@nas13:~# mdadm --examine /dev/sdb
    /dev/sdb:
       MBR Magic : aa55
    Partition[0] :   3907029166 sectors at            1 (type ee)
    Code
    root@nas13:~# mdadm --examine /dev/sdb1
    mdadm: No md superblock detected on /dev/sdb1.
    Code
    root@nas13:~# mdadm --examine /dev/sda
    /dev/sda:
       MBR Magic : aa55
    Partition[0] :    937703087 sectors at            1 (type ee)
    root@nas13:~# mdadm --examine /dev/sda1
    /dev/sda1:
       MBR Magic : aa55
    Code
    root@nas13:~# mdadm --examine /dev/sdg
    mdadm: cannot open /dev/sdg: No medium found
    • Official post

    OK, you've restarted the server, as the drive references have changed.


    The drives in the array are currently [cdef]. sdd and sdf are showing as spares, which is one of the issues. The second is the error in #9 -> no uptodate device for slot 1, and the same for slot 3.


    Going back to your initial post, where you stated Steam reported disk errors: these could be I/O errors related to a hardware issue, where the data being copied hits intermittent write problems on the drive/s.

    The cause could be either the drive/s themselves or the connectivity of the drive: SATA cable, power cable, or backplane (the board in a drive bay that the drives plug into; a 4-port drive bay, for example, typically uses one).


    Can you give some info on how the drives are connected? This might be a hardware issue.

    But I don't get it... Why

    If this is hardware related, it would explain why it 'just happened'. If it were drive related, one would have expected SMART warnings of a possible problem.
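
One way to look for the kind of I/O errors described above is to count them per device in the kernel log. This is only a sketch, assuming the kernel logs them with the common wording "I/O error, dev sdX"; io_errors_per_device is a made-up helper name.

```shell
# Hypothetical helper: count kernel "I/O error, dev sdX" messages per device
# in log text read from stdin. Prints "device count", sorted by device name.
io_errors_per_device() {
    grep -o 'I/O error, dev [a-z0-9]*' |
        awk '{ count[$4]++ } END { for (d in count) print d, count[d] }' |
        sort
}

# Example (run as root on the NAS):
#   dmesg | io_errors_per_device
#   smartctl -H /dev/sdb    # behind a USB bridge this may need: -d sat
```

Errors spread across several drives at once would point more towards the enclosure, cabling or controller than towards a single failing disk.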

  • geaves : Yes, I had shut down the NAS because it was not operating properly so I decided to turn it off in order to prevent any more issues.


    OK, can I somehow change the drive order so it can locate the correct drives again?


    The 4 HDDs are stored in a Sharkoon 5 drive bay box (this one) and connected to the PC (HP Mini desktop, Elite 8300 Mini PC) via USB. SMART values for all drives used to be good.

    • Official post

    The 4 HDDs are stored in a Sharkoon 5 drive bay box (this one)

    That's why I asked my first question in this thread. That Sharkoon box uses this chip. https://www.jmicron.com/products/list/16

    That chip has a port multiplier. There is a lot of information on the Internet about the problems caused by configuring a Raid on disks connected through a port multiplier.

    • Official post

    My fault

    I don't think it's your fault. I think this should be noted somewhere in the documentation and I don't remember ever seeing it, although maybe there is a reference somewhere.

    So the Sharkoon box was probably not a good idea for storing my drives

    No, it was not.

    I've just been lucky all these years, until now

    Yes I think so.

  • The 4 HDDs are stored in a Sharkoon 5 drive bay box (this one) and connected to the PC (HP Mini desktop, Elite 8300 Mini PC) via USB.

    That chip has a port multiplier. There is a lot of information on the Internet about the problems caused by configuring a Raid on disks connected through a port multiplier.

    It all comes down to how the RAID was created:
    by the box's hardware or by the OMV GUI.


    If it was created by the hardware and that has gone bad, only a similar hardware RAID will be able to recover it.


    If the box was set to JBOD, then all drives were seen as individual disks by OMV and a software RAID was used.


    Either way, the RAID was on a USB bus and this is BAD.
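
To confirm which disks actually sit behind USB, the transport column of lsblk can be filtered. This is a small sketch, assuming lsblk is available and reports the transport in its TRAN column; usb_disks is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of whole disks whose transport is USB.
# Expects "NAME TRAN" pairs on stdin, as produced by: lsblk -dn -o NAME,TRAN
usb_disks() {
    awk '$2 == "usb" { print $1 }'
}

# Example:
#   lsblk -dn -o NAME,TRAN | usb_disks
```

Any array member listed here is attached through the USB bridge rather than directly via SATA.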

  • chente : Is there any chance to at least revive the filesystem in a read-only state to save the data? The loss would be no disaster for me, but sad. Most stuff is backed up, the data I didn't back up recently can be recovered anyway (indie and Steam games, a bit of music), and the rest is lost (photos). If there is a slight chance, I'd take it, but it sounds more like a complete loss.


    Soma : The RAID was created in software via the OMV GUI.


    Either way, I will switch to a more secure solution, and the Sharkoon box and the HP PC will be retired.

    • Official post

    It all comes down to how the RAID was created:
    by the box's hardware or by the OMV GUI.

    The port multiplier chip causes problems in any Raid regardless of how it was created. This link is old, but illustrates it. https://www.zdnet.com/article/…ta-port-multipliers-safe/

    There is more information on the internet if you search a little. The Unraid and Truenas forums are full of information about it. In this forum we just tell people not to use Raid, etc, etc...

    I am increasingly convinced that many of the problems we see in this forum regarding Raid failures are related to a port multiplier chip.
