Help! Catastrophic failure

  • Hello everyone.


    Been running OMV for a while and loving it. Unfortunately, I've had a disaster today.


I woke up this morning and the NAS was completely inaccessible via the web interface, SSH, or Samba. I shut it down ungracefully via the power button, as I saw no other recourse.


Upon rebooting, the web GUI was accessible (after a very long time), as well as SSH and Samba. From the web GUI I could see many drives missing and two of my three RAID6 arrays degraded. I gracefully shut down, removed all drives, reseated them, and put them back into my Norco 4224 case. Powered back on, and different drives are missing now. One of the RAID6 arrays failed, one is degraded, and one is still okay. The failed array is simply an rsync backup of the degraded one, so no data is lost yet, but I'm quite worried. Every time I reboot, different drives are missing.


My NAS is running in a Norco 4224 case with a Supermicro motherboard and a Xeon processor. I have an LSI controller with an HP SAS expander hooked into the Norco backplanes. I have my OMV NAS sending its logs to an external syslog server and have attached the OMV logs for the past two days.


The array that is completely fine (for now) unfortunately only contains unimportant stuff that I don't care much about. The degraded array has my media library and other important stuff, and the backup has failed. So I'm sitting precariously right now, worried that I will lose the rest.


    I'd be very grateful for any assistance and am more than happy to provide any extra information that may be deemed necessary.


    Thank you.

  • I've attached a pic of the drives screen. Only showing 21 drives out of 24. Last three boots had 8 out of 24, 23 out of 24, and now 21 out of 24. The activity lights for three of the drives are permanently on. I suspect those are the missing drives.


    My degraded array:


    Version : 1.2
    Creation Time : Sat May 17 21:21:59 2014
    Raid Level : raid6
    Array Size : 23442120704 (22356.15 GiB 24004.73 GB)
    Used Dev Size : 2930265088 (2794.52 GiB 3000.59 GB)
    Raid Devices : 10
    Total Devices : 9
    Persistence : Superblock is persistent


    Update Time : Sun Aug 31 16:21:41 2014
    State : clean, degraded
    Active Devices : 9
    Working Devices : 9
    Failed Devices : 0
    Spare Devices : 0


    Layout : left-symmetric
    Chunk Size : 512K


    Name : NAS:3TBDrivesRAID6 (local to host NAS)
    UUID : 8bee9f4b:f0329b66:686256ca:06ca7c9f
    Events : 121456


    Number Major Minor RaidDevice State
    0 8 144 0 active sync /dev/sdj
    13 8 64 1 active sync /dev/sde
    14 8 48 2 active sync /dev/sdd
    3 8 32 3 active sync /dev/sdc
    15 8 16 4 active sync /dev/sdb
    5 8 0 5 active sync /dev/sda
    6 8 96 6 active sync /dev/sdg
    7 0 0 7 removed
    11 8 80 8 active sync /dev/sdf
    12 8 160 9 active sync /dev/sdk


    The healthy array:


    Version : 1.2
    Creation Time : Fri Jun 20 12:19:55 2014
    Raid Level : raid6
    Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
    Used Dev Size : 1953512960 (1863.02 GiB 2000.40 GB)
    Raid Devices : 5
    Total Devices : 5
    Persistence : Superblock is persistent


    Update Time : Sun Aug 31 16:23:22 2014
    State : clean
    Active Devices : 5
    Working Devices : 5
    Failed Devices : 0
    Spare Devices : 0


    Layout : left-symmetric
    Chunk Size : 512K


    Name : NAS:2TBDrivesRAID6 (local to host NAS)
    UUID : 895160c1:067ae348:30aea7e8:e1ea57de
    Events : 52476


    Number Major Minor RaidDevice State
    5 8 176 0 active sync /dev/sdl
    1 65 0 1 active sync /dev/sdq
    2 65 16 2 active sync /dev/sdr
    3 65 32 3 active sync /dev/sds
    4 65 48 4 active sync /dev/sdt


    The failed backup array is outright gone from the RAID menu now.

Do you happen to have another Linux system around? Shut your OpenMediaVault down gracefully, move one array with missing drives to a test system, and check it there. If different disks are missing each time, I think it has more to do with your underlying hardware than with your hard disks.


    Greetings
    David

    "Well... lately this forum has become support for everything except omv" [...] "And is like someone is banning Google from their browsers"


    Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.

    Upload Logfile via WebGUI/CLI
    #openmediavault on freenode IRC | German & English | GMT+1
    Absolutely no Support via PM!

  • I don't have any spare machines with enough SATA or SAS plugs to hold the 8 drive failed array or the 10 drive degraded one.


I have a spare Zotac nettop machine, and several Mediasonic 4-bay USB 3.0 enclosures. Would I be able to cobble something together for testing purposes with that? Does OpenMediaVault support USB 3.0? Running the enclosures under USB 2.0 with ten drives would be too slow for any kind of rebuild or repair, I think.


    I'm out of work at the moment -- not in the position to be buying new things, unless I absolutely have to (but will if I must).


    I've copied most of the absolutely irreplaceable files to my desktop machine, just in case.


    Thanks for the help, I appreciate it. I hope I can get back to stable.

I have a spare Zotac nettop machine, and several Mediasonic 4-bay USB 3.0 enclosures. Would I be able to cobble something together for testing purposes with that?


    Yes.


    Does Openmediavault support USB3.0?


OMV 0.5 probably not; 1.0 should support it. But I would use something like SystemRescueCd instead. Also, it's not just about the cable and the device behind it being USB 3.0 but about the port on your computer, so plug it into a USB 2.0 port just to be sure.


    Running the enclosures under USB2.0 with ten drives would be too slow for any kind of rebuild or repair, I think.


    I doubt that you even have to rebuild something.


    Greetings
    David


  • Check mdadm data with it ;)


    Greetings
    David


Okay, I have the nettop up and running SystemRescueCd with two Mediasonic enclosures containing the "failed" RAID6 array of eight 4TB drives.


    I'm nervous about doing any more damage. What shall I put on the command line to check the mdadm data?


    I'm sorry for the trouble.

  • Code
mdadm --detail /dev/mdX
    cat /proc/mdstat
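For reference, the status brackets in /proc/mdstat make degraded members easy to spot: each `U` is an active member and each `_` a missing one. A small sketch with an illustrative status string (not taken from this system):

```shell
# Count active ("U") versus missing ("_") members in an mdstat status
# string. RAID6 tolerates up to two missing members before data loss.
status="[UUUUUUU_UU]"   # illustrative 10-device RAID6 with one member missing
active=$(printf '%s' "$status" | tr -cd 'U' | wc -c)
missing=$(printf '%s' "$status" | tr -cd '_' | wc -c)
echo "active=$active missing=$missing"
```

With one `_` out of ten slots, the array is degraded but still readable; a third missing member would fail it.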


    Greetings
    David


  • And cat /proc/mdstat ?


    Greetings
    David


I'm stepping back from any further suggestions. Please wait a bit; @ryecoaaron is way more experienced with mdadm than I am and will probably show up later...


    Greetings
    David


XFS seems to have some problem in your system.


I have XFS too, and what I do is boot from the OMV install CD (to avoid mounting my XFS filesystem) and use xfs_repair as described here: [SOLVED] XFS Filesystem corrupt


You need to check and identify whether you have any damaged drives before doing that, and repair the RAID first.

Thanks for the reply. The post you linked to looks like it could help later on. However, isn't XFS on top of the mdadm RAID? In which case, doesn't the RAID need to be repaired first? And the RAID itself needs all its drives present -- and for whatever reason my NAS is not recognizing some drives at boot. Or is that logic wrong?
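That layering logic is right for a typical setup: the filesystem sits on top of the md device, so repair proceeds bottom-up. A hypothetical order-of-operations sketch (device names like /dev/md127 are placeholders, and `xfs_repair -n` is a read-only dry run):

```shell
# Bottom-up repair order for a filesystem-on-mdadm stack (sketch only,
# not commands to run blindly):
#   1. Check member disk health:      smartctl -a /dev/sdX
#   2. Assemble the array:            mdadm --assemble --scan
#   3. Only then dry-run the XFS fix: xfs_repair -n /dev/md127
steps="disks,mdadm,xfs"
echo "repair order: $steps"
```

Until the md device assembles, xfs_repair has nothing to operate on, so the filesystem step necessarily comes last.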

• Official post

    Try:


    mdadm --stop /dev/md127
    mdadm --assemble --force /dev/md127 /dev/sd[abcdefgh]


Your logs show bad things. I have heard of the Norco backplanes going bad before, though.
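Before a forced assembly it can be worth comparing the per-disk Events counters from `mdadm --examine`: a member whose count lags far behind was dropped earlier and will be brought back in sync by the resync. A sketch that sorts saved examine output by event count (the sample values below are made up, not from this array):

```shell
# Sort member disks by their mdadm superblock Events counter (lowest first).
# The saved output is faked here; in practice you would collect it with
# something like: for d in /dev/sd[a-h]; do mdadm --examine "$d"; done
examine_log='/dev/sda Events : 121456
/dev/sdb Events : 121456
/dev/sdh Events : 119980'
echo "$examine_log" | awk '{print $1, $4}' | sort -k2,2n
```

The disk at the top of the sorted list has the stalest metadata, which is useful to know before `--assemble --force` stitches it back in.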

    omv 7.1.0-2 sandworm | 64 bit | 6.8 proxmox kernel

    plugins :: omvextrasorg 7.0 | kvm 7.0.13 | compose 7.2 | k8s 7.1.0-3 | cputemp 7.0.1 | mergerfs 7.0.5 | scripts 7.0.1


    omv-extras.org plugins source code and issue tracker - github - changelogs


    Please try ctrl-shift-R and read this before posting a question.

    Please put your OMV system details in your signature.
    Please don't PM for support... Too many PMs!
