Hard Drive Failure and Data Recovery

  • Thanks so much crashtest, greaves, and gderf for your attention to this. I will be returning to the server on Wednesday and I am hopeful that things will improve. And I will keep you informed.


    Thanks again!!

  • To get it to work correctly (as in picking up the same device name), the new drive will need to be in the slot where the bad drive was.


    I would remove the bad drive, install the new one in the same slot, and boot up. Format the new drive with the same format as the old one, and try to mount all of them in the File Systems window. If they all mount, hopefully the SNAPRAID drive restore process will work from there.

    Well, I'm back and trying to implement your suggestions.


    1. Installed a fresh drive into the bad drive slot and rebooted.
    2. In File Systems, created the new drive with BTRFS & no label
    3. Took some doing, but Mounted it.
    4. None of the other drives will let me mount them, delete them, or do anything
    filesystems snapshot.png
    5. I have a monitor running at the command level, and the system continues to go berserk with ata errors.
    6. Ran 'snapraid -D fix' from the command line, as it won't run in the WebGUI, and got the following message:
    Self test...
    Disks '/srv/dev-disk-by-label-2TBData3/' and '/srv/dev-disk-by-label-sdcdisk3/' are on the same device.
    You can 'fix' anyway, using 'snapraid --force-device fix'.

    7. When I run 'snapraid -D fix' from the command line, I get the following result:
    Self test ...
    DANGER! Ignoring that disks 'Data2' and 'Data3' are on the same device
    DANGER! Ignoring that disks 'Data2' and 'Data3' are on the same device
    Disk '/srv/dev-disk-by-label-2TBData3/' and Parity '/srv/dev-disk-by-label-sdadisk1/snapraid.parity' are on the same device.

    8. So far, Snapraid isn't restoring itself ... is there something I should do?


    In need of further guidance ...


    Thanks

  • First, since you didn't label your drives and the file system picked device names based on working ports, getting all of your good drives mounted before formatting and mounting the new drive would be a prerequisite.


    Second, the drives won't mount. In that case, nothing useful in SNAPRAID is going to work in your situation, because access to the Parity drive and at least one good copy of the content file is required to do anything related to restoration.
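    A likely reason for the "are on the same device" messages: when a data drive isn't mounted, its directory under /srv is just a folder on the boot drive, so several SnapRAID paths end up on one filesystem. A minimal sketch for checking that, assuming GNU stat (as on Debian/OMV). The function names are made up for illustration:

```shell
#!/bin/sh
# Sketch: check whether two paths live on the same filesystem.
# Assumes GNU coreutils "stat"; same_device and report_pair are illustrative names.

same_device() {
    # Succeeds when both paths report the same device number.
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

report_pair() {
    # Print a one-line verdict for a pair of paths.
    if same_device "$1" "$2"; then
        echo "SAME DEVICE: $1 and $2"
    else
        echo "separate devices: $1 and $2"
    fi
}

# Example, using two paths from the error message quoted earlier:
#   report_pair /srv/dev-disk-by-label-2TBData3 /srv/dev-disk-by-label-sdcdisk3
```

    If two data paths report the same device, at least one of them is probably not mounted, and SnapRAID has nothing real to work with there.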
    ____________________________________________________________


    Note that we're not in beginner territory anymore and all I have left is speculation:


    - I don't think all drives went bad, simultaneously. (Statistically, that would be a long shot.)
    - You may have some kind of hardware issue, but the fact that you successfully formatted and mounted a drive "tends" to discount that somewhat. (Or, at the very least, one SATA port is working.)
    - Your OMV build, on the flash drive, may be hosed up. (This would be the good-case scenario because at least it's fixable.)



    1. You could try using the following on the command line:
    mount /dev/sd?
    (Where ? is the last letter in the device name.)
    And see if one or more of those drives will mount.
    2. You could run the following on each drive to see what its BTRFS status is:
    btrfs check /dev/sd?
    This works on unmounted drives. An exit code of "0" is good. Anything else indicates a failure.
    3. Set your current boot thumbdrive aside and build a new one. This is what I'd do. It wouldn't take long to build a new boot drive and, with a clean build in hand, a few essential questions could be answered. After the build is complete, see if you can mount your drives. If you can, you might find that your original drive is good. (I have no idea, one way or the other. Again, speculation)
    4. You could try booting up on a Knoppix live cd as a test. From there, see if your unmountable disks are actually mountable and if your data is still on them.
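
    The BTRFS check in suggestion 2 can be run over all the drives in one pass. A rough sketch, assuming the btrfs tools are installed and the drives are unmounted. describe_exit and check_device are illustrative names, and the device list in the usage comment is a guess at what your system shows:

```shell
#!/bin/sh
# Sketch: run a read-only BTRFS check on a device and report the verdict.
# describe_exit and check_device are illustrative helper names.

describe_exit() {
    # Exit code 0 is good; anything else indicates a failure.
    if [ "$1" -eq 0 ]; then
        echo "OK"
    else
        echo "FAILED (exit code $1)"
    fi
}

check_device() {
    # $1 = block device holding an unmounted BTRFS filesystem.
    rc=0
    btrfs check --readonly "$1" >/dev/null 2>&1 || rc=$?
    echo "$1: $(describe_exit "$rc")"
}

# Example run (as root); adjust the device names to your system:
#   for dev in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do check_device "$dev"; done
```

    The --readonly flag keeps the check from writing anything to the disk, which matters while the drives are still suspect.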


    ____________________________________________________________


    Anyone else with better ideas, please feel free to chime in.

  • If your boot drive is corrupted (I'm not saying it is) the damage is done. Generally speaking, fixing it would not be a good idea because the repair may not be *complete*. Some other issue may be hidden, to rear its ugly head later on. It would be best to rebuild a second clean drive, if for nothing else but to test.
    (You could leave your SSD intact, by simply unplugging it.)
    ____________________________________


    In any case, this -> Guide is an OMV build walk through that covers building on flash media. It also covers installing PuTTY on a Windows Client, for getting on the command line by SSH, and a few things regarding WinSCP. (But most important of all, if you want to avoid an issue like this in the future, how to back up the OS and set up a backup for your data.) Take a look. It may seem long but most of it is screen captures.


    Once you have OMV4 running on a USB stick and log in, change your password and extend the time out, but stop there. At that point, you could try mounting one of those BTRFS drives. If it mounts, use WinSCP to look over your files. Either way, post again with the results.

  • Thanks. I will definitely let you know what's happening.

  • Once you have OMV4 running on a USB stick and log in

    Just a quick update.


    I had some trouble getting the OMV4 bootable on a stick, but finally got there. Without drives attached, had Array errors and then it sent me to
    "mdadm: No arrays found in config file or automatically, then
    Gave up waiting for root file system device. Common problems:
    - Boot args (cat /proc/cmdline)
    - Check rootdelay= (did the system wait long enough?)
    - Missing modules (cat /proc/modules; ls /dev)
    ALERT! /dev/sdb1 does not exist. Dropping to a shell!"


    and then goes to "(initramfs)" prompt.


    If I attach a drive and then boot, I get a series of ATA errors and then the following, which just keeps repeating:
    "mei_me 0000:00:16.0 less data available then length=00000000.
    mei_me 0000:00:16.0 less data available then length=00000001."


    I'm still working on getting the webGUI talking to the server with the boot coming from the USB stick. Pretty sure I can get that going (just a network thing I need to sort out).

  • Make sure your regular boot drive (the SSD) is not connected.
    If you built the USB stick without the data drives attached, boot into OMV with the data drives still removed and, on the command line, run:


    update-grub


    This should fix the initramfs issue. (If you need more info, or can't get to the command line, see this -> post.)


    Running update-grub, then getting into the GUI, is the first order of business.
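
    For background, the "ALERT! /dev/sdb1 does not exist" line means the kernel was told to find its root filesystem under a device name that isn't there. Names like /dev/sdb1 can shift when drives are added or removed. A small sketch for seeing which root device a boot was given, based on the "cat /proc/cmdline" hint in the error text (root_arg is an illustrative helper, not an OMV tool):

```shell
#!/bin/sh
# Sketch: pull the root= argument out of a kernel command line string.
# root_arg is an illustrative helper name.

root_arg() {
    # $1 = kernel command line, e.g. the contents of /proc/cmdline.
    for word in $1; do
        case "$word" in
            root=*) echo "${word#root=}" ;;
        esac
    done
}

# On a running system:
#   root_arg "$(cat /proc/cmdline)"
```

    If that prints a plain /dev/sdX1 name rather than a UUID, the boot depends on the drive order staying the same.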
    _________________________________________________


    Once you get into the GUI, shut down and reinsert the data drives.


    The first thing I'd do, with the new install on the USB boot drive, is enable SMART in the GUI and look at disk attributes.


    Second, take a look at Diagnostics, System Logs, and Syslog. See if those ATA error messages are there.
    One (1) disk with a malfunctioning SATA interface can cause serious and inexplicable problems.
    If so, you may have to go through a process of elimination, inserting one disk at a time and checking the log to see if one disk is responsible.
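
    For the process of elimination, it helps to tally which ATA port the errors come from. A rough sketch that counts error-looking lines per port in kernel log text. The matched wording (exception, failed command, error) is an assumption about how the messages look on your system, and ata_errors_by_port is an illustrative name:

```shell
#!/bin/sh
# Sketch: count ATA error-looking log lines per port.
# The matched phrases are assumptions; adjust them to what your log shows.

ata_errors_by_port() {
    # Reads log text on stdin; prints "<count> <port>" for each ATA port.
    grep -Eo 'ata[0-9]+(\.[0-9]+)?: (exception|failed command|error)' |
        sed -E 's/^(ata[0-9]+).*/\1/' |
        sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage:
#   dmesg | ata_errors_by_port
#   ata_errors_by_port < /var/log/syslog
```

    If one port collects nearly all of the errors, the disk (or cable) on that port is the first thing to pull.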

  • You guys are funny! Because I DEFINITELY feel like I'm stuck in that roundabout. Or like the one Chevy Chase was stuck in in National Lampoon's European Vacation. At any rate ...

    I have the GUI running. I shut it down and reinserted all of the data drives, including the parity drive, because I am not sure which of the four drives it was. I enabled SMART in the GUI on all drives it can see. I looked at the Diagnostics/System Logs and did not see any ATA error messages in any of the logs. But only two data drives are showing:
    Disk Recognition After USB boot.png


    And the same thing shows up in SMART:
    Disk SMART After USB boot.png


    The other two drives do not show up, unless the sdc or sdd drive is the parity drive. But their serial numbers do not show up.


    What's next??

  • Boot into the BIOS and see if all the disks show there. If not, find out why.

  • Even with the two disks showing, something "appears" to be going on with them. I'm not sure which attributes trigger the red light, but it's not good. You'd need to go into SMART, the DEVICES tab, click on a disk, then the INFORMATION button, and look at the ATTRIBUTES tab. But I'm not sure that I would trust it.
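
    For a second opinion from the command line, smartctl (from smartmontools) prints the same kind of attribute table. A sketch that pulls out a few attributes that are often watched on failing disks. Picking IDs 5, 197, and 198 is my assumption, not necessarily what drives the red light in the GUI, and watch_attrs is an illustrative name:

```shell
#!/bin/sh
# Sketch: list selected SMART attributes whose raw value is above zero.
# The choice of IDs 5, 197 and 198 is an assumption, not OMV's own rule.

watch_attrs() {
    # Reads "smartctl -A" output on stdin; prints "<attribute name> <raw value>".
    awk '($1 == 5 || $1 == 197 || $1 == 198) && ($10 + 0) > 0 { print $2, $10 }'
}

# Usage (as root), with a device name from your own system:
#   smartctl -A /dev/sda | watch_attrs
```

    No output means none of those three attributes has a nonzero raw value, which is reassuring but not proof the disk is healthy.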


    @gderf has given good advice. If the disks are no-show in BIOS, there's a good chance there's a hardware problem of some kind. (The MOBO's SATA interface, maybe the power supply, etc.) That is speculation, but so many disks dying at the same time is statistically improbable.
