Hard Drive Failure and Data Recovery

  • This forum seems to be much more useful than the Snapraid forum, where I can't seem to find what I need (the navigation there is horrible compared to this forum).


    I have Snapraid with 4 drives. One 1TB parity and three 1TB data disks as follows:


    ParityDrive1
    Data2
    Data3
    Data4


    Recently the drive containing Data2 failed, and while I can see the Shared Folders, I can't get to anything in them. I removed the bad drive (Data2), installed a new 2TB drive in its place (named 2TBData2sdb), and tried running Fix in the GUI on that drive (as instructed). I have tried to follow the instructions in the "Snapraid in OMV User Guide" and the Snapraid Manual, but neither has produced results that allow me to see files. I have tried resetting permissions, with no effect. There are references to copying data to an external USB drive, but I haven't been able to find guidance on that, and I can't find how to even list files from the console. I ran Fix on the new drive, which the guide tells me to do in the GUI, but it comes back with:


    Self test...
    Disks '/srv/dev-disk-by-label-sdcdisk3/' and '/srv/dev-disk-by-label-sdddisk4/' are on the same device.
    You can 'fix' anyway, using 'snapraid --force-device fix'.


    So I went to the console (I finally figured out that it couldn't be done from the GUI) and ran "snapraid -D fix", which returns the following:
    Self test...
    DANGER! Ignoring that disks 'Data3' and 'Data4' are on the same device
    Disk '/srv/dev-disk-by-label-sdcdisk3/' and Parity '/srv/dev-disk-by-label-sdadisk1/snapraid.parity' are on the same device


    I have no idea how THAT happened because I originally dedicated disk1 to parity.


    When I run "snapraid status" from the console, I get the following:
    Self test...
    WARNING! Content files on the same disk: '/srv/dev-disk-by-label-sdadisk1/snapraid.content' and '/srv/dev-disk-by-label-sdcdisk3/snapraid.content'
    WARNING! Content files on the same disk: '/srv/dev-disk-by-label-sdcdisk3/snapraid.content' and '/srv/dev-disk-by-label-sdddisk4/snapraid.content'
    Loading state from /srv/dev-disk-by-label-sdadisk1/snapraid.content...
    WARNING! Content file '/srv/dev-disk-by-label-sdadisk1/snapraid.content' not found, trying with another copy...


    Loading state from /srv/dev-disk-by-label-2TBData3/snapraid.content...


    WARNING! Content file '/srv/dev-disk-by-label-2TBData3/snapraid.content' not found, trying with another copy...


    Loading state from /srv/dev-disk-by-label-sdcdisk3/snapraid.content...
    WARNING! Content file '/srv/dev-disk-by-label-sdcdisk3/snapraid.content' not found, trying with another copy...


    Loading state from /srv/dev-disk-by-label-sdddisk4/snapraid.content...


    No content file found. Assuming empty.
    Using 0 MiB of memory for the FileSystem.
    SnapRAID status report:


    Files  Fragmented  Excess     Wasted  Used  Free  Use  Name
           Files       Fragments  GB      GB    GB
    0      0           0          0.0     0     0     -    2TBData2sdb
    0      0           0          0.0     0     0     -    Data3
    0      0           0          0.0     0     0     -    Data4


    WARNING! Free space info will be valid after the first sync.
    The array is empty.


    I have no idea how my 2TBData2sdb ended up with a label of 2TBData3. While I am hopeful that my files are still intact somewhere, and that someone might be able to help me find them, I am a bit dismayed at the thought that I have lost everything. I don't know where to turn.


    Assuming my stuff is still in my OMV somewhere, can someone tell me how to:
    1. Find the Folders from a console command?
    2. Attempt to list the files within those folders?
    3. Copy files to an external USB drive?
    or
    4. How to fix this mess?


    I am lost ...

  • There is a guide in the guides section which should be useful. I also found this on sourceforge, which helped me understand it more.

    Thanks geaves. I have followed all the instructions in the guides section without success. Have been perusing sourceforge, but cannot find an answer. The OMV GUI apparently can't help me. I need to know how to state command line queries, but haven't yet developed linux command line skills. From the command line in root, I have been able to list directories whereby I found the sharedfolders directory, which I can then list and see all of my Shared Folders. But I don't know how to "see" the files within those shared folders. Don't know how to look at the snapraid config file. The guides and manual seem to be for experienced Linux users, of which I am not one ... learning, but ...

    • Official Post

    I have followed all the instructions in the guides section without success.

    TBH that is very straightforward and self-explanatory, particularly when having to replace a failed drive, and the sourceforge link says the same. Provided a sync or scrub is not run during a drive change, recovery is straightforward. The guide PDF completes everything within OMV's GUI.
    I had an issue myself where snapraid scrub returned 52 file errors; the sync prior to that had not run (these run as scheduled jobs). I ran a fix from the GUI, which didn't fix the errors, but I was able to locate them. They were related to UrBackup. I fixed the errors and, with the help of @crashtest, was able to run a repair on UrBackup's db and reconfigure UrBackup; now everything is working again.


    But I don't know how to "see" the files within those shared folders.

    You can do this from the command line using cd. From root, cd /sharedfolders puts you in the sharedfolders directory, and ls -l will list the content. Let's say you have a Movies folder: cd /sharedfolders/Movies followed by ls -l will list its content. Or install Midnight Commander, then run mc from the command line to start it; this is a text-based file manager/explorer. It's something I personally don't like, so I use Cloud Commander in docker. If you have a Windows machine, install WinSCP; alternatively use FTP, such as FileZilla, if you're on Linux or Mac.
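
    As a rough sketch of the cd / ls -l steps above, the helper below lists one folder and counts the regular files beneath it. The function name list_share is made up for illustration (it is not an OMV command), and /sharedfolders is only the default taken from this thread; pass any other path as the first argument. Folder names are case sensitive.

```shell
#!/bin/sh
# list_share is a hypothetical helper name, not part of OMV or SnapRAID.
# Usage: list_share [DIR]      (DIR defaults to /sharedfolders)
# Example: list_share /sharedfolders/Movies
list_share() {
    dir="${1:-/sharedfolders}"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # -l = long listing, -A = also show hidden entries (except . and ..)
    ls -lA "$dir"
    # Count regular files at any depth below DIR
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    echo "regular files under $dir: $count"
}
```

    A count of 0 means the folder really is empty at that path, which is worth knowing before assuming a permissions problem.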

  • You do this either from the command line by simply using cd so from root cd /sharedfolders this puts you in the sharedfolders directory, ls -l will list the content, lets say you have a Movie folder, so cd /sharedfolders/Movies ls -l will list the content.

    Thanks geaves. So I can cd sharedfolders and then list the folders. Of the several that I have, I'll refer to Documents or Photos. I see that changing the directory to Documents requires the name to be typed in the exact upper and lower case. But I have yet to find any files. The 'ls -l' command keeps returning 0 files.


    sigh ...

    • Official Post

    Of the several that I have I'll refer to Documents or Photos. I can't cd to a folder

    So you can do cd /sharedfolders then ls -l. OK, this is the output of mine,


    so now I cd into one to view;


    Code
    root@homenas:/sharedfolders# cd /sharedfolders/AppData
    root@homenas:/sharedfolders/AppData# ls -l
    total 160
    drwxrwsrwx+ 11 Emby   users   4096 Apr 16  2019 Emby
    drwxrwsr-x+  2 xxxxx  users   4096 Oct 18 21:13 Glances
    drwxrwsr-x+  7 admin  users   4096 Nov 28  2018 Heimdall
    drwxrwsr-x+  2 nobody users 139264 Nov  2  2018 Music-1
    drwxrwsr-x+  5 xxxxx  users   4096 Aug 23 14:12 Portainer
    drwxrwsr-x+  2 xxxxx  users   4096 May  2 12:22 WifiView

    Any command is case sensitive. I couldn't get the second option to work, then discovered I was typing Appdata, not AppData.


    BTW are you using MergerFS + Snapraid, because if you are and your shares are on the fuse mergerfs mount point then the /sharedfolders will be empty, AppData is for docker configs and is on a separate drive.
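
    One way to check this from the console, as a sketch only: the mounts table lists every mounted filesystem with its type, so any line of type fuse.mergerfs gives the pooled mount point. The function name mergerfs_mounts is made up for illustration; it reads /proc/mounts by default, where field 2 is the mount point and field 3 is the filesystem type.

```shell
#!/bin/sh
# mergerfs_mounts is a hypothetical helper name.
# Usage: mergerfs_mounts [MOUNTS_FILE]   (defaults to /proc/mounts)
# Prints the mount point of every fuse.mergerfs filesystem, one per line.
mergerfs_mounts() {
    awk '$3 == "fuse.mergerfs" { print $2 }' "${1:-/proc/mounts}"
}
```

    If this prints nothing, no mergerfs pool is currently mounted. If it prints a path, that path (rather than /sharedfolders) is where to run ls -l.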

  • BTW are you using MergerFS + Snapraid, because if you are and your shares are on the fuse mergerfs mount point then the /sharedfolders will be empty, AppData is for docker configs and is on a separate drive.

    I did discover that I can cd into /sharedfolders/Documents as long as I pay attention to the upper/lower case. But it still gives me 0 files. I am using a union under snapraid. Is that using MergerFS? If sharedfolders would then be empty, where would the files be?


    When I cd into sharedfolders, I can see the folders under ls -l (here are some but not all):
    drwxr-xr-x 2 root root 4096 Jul 5 2018 Documents
    drwxr-xr-x 2 root root 4096 Jul 5 2018 Movies
    drwxr-xr-x 2 root root 4096 Jul 5 2018 Music
    drwxr-xr-x 2 root root 4096 Jul 5 2018 Photos


    But as I think you are saying, these folders turn up zero files.


    What to do, what to do?

    Edited once, last by curious1 (). Reason: additional details

    • Official Post

    @curious1
    WinSCP, what it is, a link to install it, etc., is in this guide. It has a Windows like file explorer interface that will let you poke around under the hood. Note the warnings in the guide. If files are deleted under root "/" there may be serious consequences. (And there's good info and notes on backup, both for the OS and data.)


    Here's what I'm hoping didn't happen in your case:
    A problem developed and, thereafter, a SYNC operation occurred. While SNAPRAID is a pseudo form of backup and it will let you restore a full disk, it's only as good as of the content of all drives, during the last SYNC. If drive issues were there and files were lost, then a SYNC occurred and the drive failed thereafter, the files lost before the last SYNC took place can't be recovered. That's a job for some form of internal or external backup, and just one reason why backup (as in a 2nd full copy of data) is a very good idea.

  • Thanks geaves and crashtest.


    I have installed WinSCP on my PC laptop and run it, which allows me to look at my OMV. I have perused everything under 'root' and found no content files for any of my sharedfolders. When I look at the snapraid.content log, this is what I get:


    # openmediavault Arrakis 4.1.8.2-1
    # and 'openmediavault-snapraid' 3.7.3


    block_size 256
    autosave 0
    #####################################################################
    # OMV-Name: ParityDrive1 Drive Label: sdadisk1
    content /srv/dev-disk-by-label-sdadisk1/snapraid.content
    parity /srv/dev-disk-by-label-sdadisk1/snapraid.parity


    #####################################################################
    # OMV-Name: Data2 Drive Label: 2TBData3
    content /srv/dev-disk-by-label-2TBData3/snapraid.content
    disk Data2 /srv/dev-disk-by-label-2TBData3


    #####################################################################
    # OMV-Name: Data3 Drive Label: sdcdisk3
    content /srv/dev-disk-by-label-sdcdisk3/snapraid.content
    disk Data3 /srv/dev-disk-by-label-sdcdisk3


    #####################################################################
    # OMV-Name: Data4 Drive Label: sdddisk4
    content /srv/dev-disk-by-label-sdddisk4/snapraid.content
    disk Data4 /srv/dev-disk-by-label-sdddisk4


    exclude *.bak
    exclude *.unrecoverable
    exclude /tmp/
    exclude lost+found/
    exclude .content
    exclude aquota.group
    exclude aquota.user
    exclude snapraid.conf*


    I don't know how to scope out a mount point. I cannot find a fuse mount point under /srv. I must be doing something wrong.
    Is '/srv/dev-disk-by-label-sdddisk4/snapraid.content' a mount point? Is WinSCP letting me look into the disks?



    I can't believe that I have lost ALL my files.


    Any other ideas or guidance? (thanks for your help so far)

    • Official Post

    @geaves


    Please show @curious1 what your MergerFS mount point looks like under Storage, Filesystems and an example path for looking at shared folders / files with WinSCP. I'd do it but I'm using my collection of 2TB disks for BTRFS testing.
    Thanks.

    • Official Post

    Following image is from sometime ago before doing some upgrades;

    Note the Filesystem Type -> fuse.mergerfs and the Mount Point, the following is from WinSCP;

    You can clearly see the mount point from the omv image above, if you have set this up correctly all your shares should be under that mount point.

  • Please show @curious1 what your MergerFS mount point looks like under Storage, Filesystems and an example path for looking at shared folders / files with WinSCP.

    Thanks guys. Here are the two items:


    The 2TBData3 is the 2TB drive I have attempted to replace the failed sdbdisk2, but I'm sure I screwed that up, even though I followed the guide instructions. How does this look to you?

    • Official Post

    The 2TBData3 is the 2TB drive I have attempted to replace the failed sdbdisk2, but I'm sure I screwed that up, even though I followed the guide instructions. How does this look to you?

    Like you've f* up, look at the filesystem image, you were attempting to replace sdbdisk2, what jumps out and slaps you in the face, compare that to 2TBData3

  • Like you've f* up, look at the filesystem image, you were attempting to replace sdbdisk2, what jumps out and slaps you in the face, compare that to 2TBData3

    Yeah, I know. I still have the bad drive. I thought I followed the guide properly, but obviously not. The main thing I see is that sdbdisk2 is btrfs and the 2TBData3 is ext4. I guess I followed the guide "too close to letter" because it instructed to use ext4, but gave no indication that you should match file systems.


    I still have the failed drive. And apparently haven't done anything (really) with the 2TBData3 drive. Am I completely hosed, or can I "rewind"?

    • Official Post

    Am I completely hosed, or can I "rewind"?

    Well at least you have spotted your own error and you should be able to rewind, but I would ask why btrfs, and if you read the guide keeping labels simple makes sense, mine are simply labelled disk1 through to disk4, then for snapraid parity and data1 through to data3.

    • Official Post

    This happens a lot. Physically identifying the right (dead or dying) disk is crucial, or one might replace a good disk and compound the problem, as they lost track of their disks. With traditional RAID, going down a path like that can result in killing the array.


    All of your drives are unmounted and missing. How did that happen? I'm hoping you didn't take them all out and shuffle them around. Those drives need to be installed and/or remounted or you're not going to see anything, in the way of files, under the mount point.
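
    To see from the console which data drives are actually mounted, something like the sketch below may help. It reads the "disk NAME PATH" lines from a SnapRAID config file and asks the standard mountpoint utility about each path. The function name check_snapraid_mounts is made up for illustration, and /etc/snapraid.conf as the default config location is an assumption; pass the real path if yours differs. It also assumes the paths contain no spaces.

```shell
#!/bin/sh
# check_snapraid_mounts is a hypothetical helper name.
# Usage: check_snapraid_mounts [CONF]
# CONF defaults to /etc/snapraid.conf (assumed location; adjust if needed).
check_snapraid_mounts() {
    conf="${1:-/etc/snapraid.conf}"
    # Each data disk is declared as: disk NAME PATH
    awk '$1 == "disk" { print $2, $3 }' "$conf" |
    while read -r name path; do
        # mountpoint -q is silent and exits 0 only if PATH is a mount point
        if mountpoint -q "$path"; then
            echo "$name $path mounted"
        else
            echo "$name $path NOT mounted"
        fi
    done
}
```

    Any line ending in NOT mounted points at a drive that still has to be mounted in File Systems before a restore can work.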


    The disk's attributes (and the serial number, which is usually printed on the label) can be found under Storage, Disks and in SMART attributes. Get your disks back in place, and in the proper order. Mount them, in File Systems.
    Find the bad drive. Replace that drive, in the same slot, with a new drive. Wipe the new drive (in Storage, Disks), format it with BTRFS (Storage, File systems). (In your case, you didn't do labels originally so I wouldn't apply a label to the new drive before it's restored.)


    Then go to the guide and do the SNAPRAID drive restore steps.
    __________________________________________________


    Your combined mergerfs mount point is:


    srv/300ee52a- etc., etc.


    This is where you'd look for your files and shares, using WinSCP, after all drives are mounted again.
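
    For the earlier question about copying files to an external USB drive, a minimal sketch once the drives are mounted and the files are visible again. The function name copy_share is made up for illustration, and both paths in the usage line are placeholders; the USB drive is assumed to be mounted already (for example through File Systems in the GUI).

```shell
#!/bin/sh
# copy_share is a hypothetical helper name.
# Usage: copy_share SOURCE_DIR DEST_DIR
# Example (placeholder paths): copy_share /srv/<pool-mount>/Documents /srv/<usb-mount>
copy_share() {
    src="$1"
    dest="$2"
    if [ ! -d "$src" ] || [ ! -d "$dest" ]; then
        echo "usage: copy_share SOURCE_DIR DEST_DIR (both must exist)" >&2
        return 1
    fi
    # -a = archive mode: recurse and keep ownership, permissions and timestamps
    cp -a "$src" "$dest"/
}
```

    The source folder is copied as a subfolder of the destination, so Documents ends up as DEST_DIR/Documents.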
    __________________________________________________


    The reason why the guide assumes that EXT4 is used, is because either EXT4 or XFS are the file systems most compatible (with the least issues) when using SNAPRAID. Obviously, EXT4 is the most common. In your case, since you have BTRFS on the original drives, that's what you'd use.

  • Thanks geaves and crashtest. You guys are really great.



    mine are simply labelled disk1 through to disk4, then for snapraid parity and data1 through to data3

    I will go back to the simple naming convention when I apply your collective guidance.



    All of your drives are unmounted and missing. How did that happen? I'm hoping you didn't take them all out and shuffle them around. Those drives need to be installed and/or remounted or you're not going to see anything, in the way of files, under the mount point.

    I do not have any idea why my disks are unmounted and missing. I didn't do anything to cause that (that I know of - I am very careful not to be implementing things that I don't understand, present situation excepted). I did not rearrange the drives. I ran a boot diagnostic on the server which showed me which drive had failed, and then attempted the replacement of just that drive. And I notice the drives showing unmounted and missing, but wasn't sure what to do with that.


    Find the bad drive. Replace that drive, in the same slot, with a new drive. Wipe the new drive (in Storage, Disks), format it with BTRFS (Storage, File systems). (In your case, you didn't do labels originally so I wouldn't apply a label to the new drive before it's restored.)

    The OMV Guide for Snapraid, in their example, talks about the bad drive "red" being replaced and the new drive being labeled "rednew", so that's why I chose the 2TBDisk2 name. So when I put the old/bad drive back in, remount everything, and then remove and replace the old/bad drive, do I simply name it the same? That would make sense to me on one level, but not sure that's what I'm supposed to do.


    I think you guys have given me some hope. I am out of town for a few days and away from my server, but I will be following your guidance when I return. I hope you'll keep an eye out for me in the middle of next week.


    Thanks so much!!! A thousand likes for both of you!!!!

  • I don't think you want to change any of the existing disk label names until after you are completely done with the recovery of the failed disk.


    • Official Post

    So when I put the old/bad drive back in, remount everything, and then remove and replace the old/bad drive, do I simply name it the same?

    From what I saw, you didn't name the original drive at all (no Label), so it picked up the device name for the SATA port. So, during or after formatting, I wouldn't give the new drive a label. But, to get it to work correctly (as in picking up the same device name), the new drive will need to be in the slot where the bad drive was.


    I would remove the bad drive, install the new one in the same slot and boot up. Format the new drive with the same format as the old one, and try to mount all of them in the File System window. If they all mount, hopefully, the SNAPRAID drive restore process will work from there.
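
    For reference, the console equivalent of the restore steps, as a sketch that only prints the commands and runs nothing. The sequence follows the recovery section of the SnapRAID manual as best I recall it, so check it against the manual for the installed version first. The function name print_restore_plan is made up for illustration, and the disk name is whatever the "disk" line in snapraid.conf calls the replaced drive (Data2 in the config posted earlier).

```shell
#!/bin/sh
# print_restore_plan is a hypothetical helper name.
# It only PRINTS the commands; nothing is executed.
# Usage: print_restore_plan DISK_NAME     (e.g. Data2)
print_restore_plan() {
    disk="$1"
    if [ -z "$disk" ]; then
        echo "usage: print_restore_plan DISK_NAME" >&2
        return 1
    fi
    # 1. Rebuild the files of this one disk from parity, logging to fix.log
    echo "snapraid -d $disk -l fix.log fix"
    # 2. Audit only: verify the hashes of the recovered files
    echo "snapraid -d $disk -a check"
    # 3. Update parity, only after the check looks good
    echo "snapraid sync"
}
```

    Running sync before the fix and check have finished is exactly the case to avoid, because it would update parity against the empty replacement drive.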
