Harddrive Failure and Data Recovery


      This forum seems to be much more useful than the Snapraid forum, where I can't seem to find what I need (the navigation there is horrible compared to this forum).

      I have Snapraid with 4 drives. One 1TB parity and three 1TB data disks as follows:

      ParityDrive1
      Data2
      Data3
      Data4

      Recently the drive containing Data2 failed, and while I can see the Shared Folders, I can't get to anything in them. I removed the bad drive (Data2), installed a new 2TB drive in its place (named 2TBData2sdb), and ran Fix in the GUI on that drive (as instructed). I have tried to follow the instructions in the "Snapraid in OMV User Guide" and the Snapraid Manual, but neither has produced results that allow me to see files. I have tried resetting permissions, with no effect. There are references to copying data to an external USB drive, but I haven't been able to find guidance on that, and I can't find how to even list files from the console. When I run Fix on the new drive, which the guide tells me to do in the GUI, it comes back with:

      Self test...
      Disks '/srv/dev-disk-by-label-sdcdisk3/' and '/srv/dev-disk-by-label-sdddisk4/' are on the same device.
      You can 'fix' anyway, using 'snapraid --force-device fix'.


      So I went to the console (I finally figured out that it couldn't be done from the GUI) and ran "snapraid -D fix", which returns the following:
      Self test...
      DANGER! Ignoring that disks 'Data3' and 'Data4' are on the same device
      Disk '/srv/dev-disk-by-label-sdcdisk3/' and Parity '/srv/dev-disk-by-label-sdadisk1/snapraid.parity' are on the same device


      I have no idea how THAT happened because I originally dedicated disk1 to parity.

      When I run "snapraid status" from the console, I get the following:
      Self test...
      WARNING! Content files on the same disk: '/srv/dev-disk-by-label-sdadisk1/snapraid.content' and '/srv/dev-disk-by-label-sdcdisk3/snapraid.content'
      WARNING! Content files on the same disk: '/srv/dev-disk-by-label-sdcdisk3/snapraid.content' and '/srv/dev-disk-by-label-sdddisk4/snapraid.content'
      Loading state from /srv/dev-disk-by-label-sdadisk1/snapraid.content...
      WARNING! Content file '/srv/dev-disk-by-label-sdadisk1/snapraid.content' not found, trying with another copy...

      Loading state from /srv/dev-disk-by-label-2TBData3/snapraid.content...

      WARNING! Content file '/srv/dev-disk-by-label-2TBData3/snapraid.content' not found, trying with another copy...

      Loading state from /srv/dev-disk-by-label-sdcdisk3/snapraid.content...
      WARNING! Content file '/srv/dev-disk-by-label-sdcdisk3/snapraid.content' not found, trying with another copy...

      Loading state from /srv/dev-disk-by-label-sdddisk4/snapraid.content...

      No content file found. Assuming empty.
      Using 0 MiB of memory for the FileSystem.
      SnapRAID status report:

      Files  Fragmented  Excess     Wasted  Used  Free  Use  Name
             Files       Fragments  GB      GB    GB
      0      0           0          0.0     0     0     -    2TBData2sdb
      0      0           0          0.0     0     0     -    Data3
      0      0           0          0.0     0     0     -    Data4

      WARNING! Free space info will be valid after the first sync.
      The array is empty.


      I have no idea how my 2TBData2sdb ended up with a label of 2TBData3. While I am hopeful that my files are still intact somewhere, and that someone might be able to help me find them, I am a bit dismayed at the thought that I may have lost everything. I don't know where to turn.

      Assuming my stuff is still in my OMV somewhere, can someone tell me how to:
      1. Find the Folders from a console command?
      2. Attempt to list the files within those folders?
      3. Copy files to an external USB drive?
      or
      4. How to fix this mess?

      I am lost ...
    • geaves wrote:

      There is a guide in the guides section which should be useful, I also found this on sourceforge which helped me understand it more.
      Thanks geaves. I have followed all the instructions in the guides section without success. I have been perusing sourceforge, but cannot find an answer. The OMV GUI apparently can't help me. I need to know how to phrase command line queries, but haven't yet developed Linux command line skills. From the command line as root, I have been able to list directories, whereby I found the sharedfolders directory, which I can then list to see all of my Shared Folders. But I don't know how to "see" the files within those shared folders. I also don't know how to look at the snapraid config file. The guides and manual seem to be for experienced Linux users, of which I am not one ... learning, but ...
    • curious1 wrote:

      I have followed all the instructions in the guides section without success.
      TBH the guide is very straightforward and self-explanatory, particularly for replacing a failed drive, and the sourceforge link covers the same ground. Provided a sync or scrub is not run during a drive change, recovery is straightforward. The guide PDF completes everything within OMV's GUI.
      I had an issue myself where a snapraid scrub returned 52 file errors; the sync prior to that did not run (these are run as scheduled jobs). So I ran a fix from the GUI. This didn't fix the errors, but I was able to locate them. They were related to UrBackup. I fixed the errors and, with the help of @crashtest, was able to run a repair on UrBackup's db and reconfigure UrBackup; now everything is working again.

      curious1 wrote:

      But I don't know how to "see" the files within those shared folders.
      You do this either from the command line by simply using cd: from root, cd /sharedfolders puts you in the sharedfolders directory, and ls -l will list the content. Let's say you have a Movies folder: cd /sharedfolders/Movies followed by ls -l will list its content. Or install Midnight Commander and run mc from the command line; this starts Midnight Commander, a text-based file manager/explorer. It's something I personally don't like, so I use Cloud Commander in docker. Alternatively, if you have a Windows machine, install WinSCP, or use an FTP client such as FileZilla if you're on Linux or Mac.
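A minimal console sketch of the cd / ls -l steps above, assuming the share is called Documents as elsewhere in this thread. The `count_files` helper is a hypothetical name introduced only for illustration; `ls -la` also shows hidden entries, and `find` looks into subfolders as well.

```shell
#!/bin/sh
# Hypothetical helper: count the regular files anywhere below a directory.
count_files() {
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Folder to inspect. /sharedfolders/Documents is the example from this
# thread; pass a different path as the first argument if needed.
target="${1:-/sharedfolders/Documents}"

if [ -d "$target" ]; then
    ls -la "$target"        # long listing, hidden entries included
    echo "regular files below $target: $(count_files "$target")"
else
    echo "$target does not exist (remember that names are case sensitive)"
fi
```

A count of 0 only says that nothing is visible at that path right now; it does not by itself show that the data is gone.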
      Raid is not a backup! Would you go skydiving without a parachute?
    • geaves wrote:

      You do this either from the command line by simply using cd so from root cd /sharedfolders this puts you in the sharedfolders directory, ls -l will list the content, let's say you have a Movies folder, so cd /sharedfolders/Movies ls -l will list the content.
      Thanks geaves. So I can cd sharedfolders and then list the folders. Of the several that I have, I'll refer to Documents or Photos. I see that changing the directory to Documents requires the name to be typed in exactly the right upper and lower case. But I have yet to find any files. The 'ls -l' command keeps returning 0 files.

      sigh ...


    • curious1 wrote:

      Of the several that I have I'll refer to Documents or Photos. I can't cd to a folder
      So you can do cd /sharedfolders then ls -l. OK, this is the output of mine:

      root@homenas:/sharedfolders# ls -l
      total 88
      drwxrwsrwx+ 8 root users 4096 Oct 18 21:13 AppData
      drwxrwsrwx 7 root users 4096 Sep 17 06:36 AppDataBack
      drwxr-xr-x 2 root root 4096 Mar 20 2019 Movies
      drwxrwsrwx 387 root users 20480 Sep 8 12:12 MoviesBack
      drwxr-xr-x 2 root root 4096 Mar 20 2019 Music
      drwxrwsrwx 24 root users 16384 Apr 30 2019 MusicBack
      drwxr-xr-x 2 root root 4096 Sep 1 15:42 Photos
      drwxrwsrwx 5 99 users 4096 Sep 3 13:06 PhotosBack
      drwxr-xr-x 2 root root 4096 Mar 20 2019 PossibleMovies
      drwxr-xr-x 2 root root 4096 Mar 20 2019 Software
      drwxrwsrwx 10 root users 4096 Sep 18 08:07 SoftwareBack
      drwxr-xr-x 2 root root 4096 Sep 1 15:41 SuziFiles
      drwxrwsrwx 4 root users 4096 Sep 11 20:58 SuziFilesBack
      drwxr-xr-x 2 root root 4096 Mar 20 2019 TvShows
      drwxrwsrwx 8 root users 4096 Mar 21 2019 TvShowsBack

      so now I cd into one to view;

      root@homenas:/sharedfolders# cd /sharedfolders/AppData
      root@homenas:/sharedfolders/AppData# ls -l
      total 160
      drwxrwsrwx+ 11 Emby users 4096 Apr 16 2019 Emby
      drwxrwsr-x+ 2 xxxxx users 4096 Oct 18 21:13 Glances
      drwxrwsr-x+ 7 admin users 4096 Nov 28 2018 Heimdall
      drwxrwsr-x+ 2 nobody users 139264 Nov 2 2018 Music-1
      drwxrwsr-x+ 5 xxxxx users 4096 Aug 23 14:12 Portainer
      drwxrwsr-x+ 2 xxxxx users 4096 May 2 12:22 WifiView
      Any command is case sensitive. I couldn't get the second command to work, then discovered I was typing Appdata, not AppData.

      BTW are you using MergerFS + Snapraid? If you are, and your shares are on the fuse mergerfs mount point, then /sharedfolders will be empty. AppData in my listing is for docker configs and is on a separate drive.
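One way to check from the console whether a mergerfs pool is mounted at all is to look for the fuse.mergerfs filesystem type in the mounts table. This is only a sketch: `mergerfs_mounts` is a hypothetical helper name, and it assumes a Linux system where /proc/mounts is readable.

```shell
#!/bin/sh
# Print the mount point of every entry whose filesystem type contains
# "mergerfs". The input is a table in /proc/mounts format:
#   device  mount-point  type  options  dump  pass
mergerfs_mounts() {
    awk '$3 ~ /mergerfs/ { print $2 }' "${1:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
    pools=$(mergerfs_mounts)
    if [ -n "$pools" ]; then
        echo "mergerfs pool(s) mounted at:"
        echo "$pools"
    else
        echo "no mergerfs pool is currently mounted"
    fi
fi
```

On systems with util-linux, `findmnt -t fuse.mergerfs` should give the same information in one line.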
    • geaves wrote:

      BTW are you using MergerFS + Snapraid, because if you are and your shares are on the fuse mergerfs mount point then the /sharedfolders will be empty, AppData is for docker configs and is on a separate drive.
      I did discover that I can cd into /sharedfolders/Documents as long as I pay attention to the upper/lower case. But it still gives me 0 files. I am using a union under snapraid. Is that using MergerFS? If sharedfolders would then be empty, where would the files be?

      When I cd into sharedfolders, I can see the folders under ls -l (here are some but not all):
      drwxr-xr-x 2 root root 4096 Jul 5 2018 Documents
      drwxr-xr-x 2 root root 4096 Jul 5 2018 Movies
      drwxr-xr-x 2 root root 4096 Jul 5 2018 Music
      drwxr-xr-x 2 root root 4096 Jul 5 2018 Photos

      But as I think you are saying, these folders turn up zero files.

      What to do, what to do?


    • @curious1
      WinSCP, what it is, a link to install it, etc., is in this guide. It has a Windows like file explorer interface that will let you poke around under the hood. Note the warnings in the guide. If files are deleted under root "/" there may be serious consequences. (And there's good info and notes on backup, both for the OS and data.)

      Here's what I'm hoping didn't happen in your case:
      A problem developed and, thereafter, a SYNC operation occurred. While SNAPRAID is a pseudo form of backup and will let you restore a full disk, it's only as good as the content of all drives at the time of the last SYNC. If drive issues were already there and files were lost, then a SYNC occurred and the drive failed afterwards, the files lost before that last SYNC can't be recovered. That's a job for some form of internal or external backup, and just one reason why backup (as in a 2nd full copy of data) is a very good idea.


    • Thanks geaves and crashtest.

      I have installed WinSCP on my PC laptop and run it, which allows me to look at my OMV. I have perused everything under 'root' and found no content files for any of my sharedfolders. When I look at the snapraid.content log, this is what I get:

      # openmediavault Arrakis 4.1.8.2-1
      # and 'openmediavault-snapraid' 3.7.3

      block_size 256
      autosave 0
      #####################################################################
      # OMV-Name: ParityDrive1 Drive Label: sdadisk1
      content /srv/dev-disk-by-label-sdadisk1/snapraid.content
      parity /srv/dev-disk-by-label-sdadisk1/snapraid.parity

      #####################################################################
      # OMV-Name: Data2 Drive Label: 2TBData3
      content /srv/dev-disk-by-label-2TBData3/snapraid.content
      disk Data2 /srv/dev-disk-by-label-2TBData3

      #####################################################################
      # OMV-Name: Data3 Drive Label: sdcdisk3
      content /srv/dev-disk-by-label-sdcdisk3/snapraid.content
      disk Data3 /srv/dev-disk-by-label-sdcdisk3

      #####################################################################
      # OMV-Name: Data4 Drive Label: sdddisk4
      content /srv/dev-disk-by-label-sdddisk4/snapraid.content
      disk Data4 /srv/dev-disk-by-label-sdddisk4

      exclude *.bak
      exclude *.unrecoverable
      exclude /tmp/
      exclude lost+found/
      exclude .content
      exclude aquota.group
      exclude aquota.user
      exclude snapraid.conf*

      I don't know how to scope out a mount point. I cannot find a fuse mount point under /srv. I must be doing something wrong.
      Is '/srv/dev-disk-by-label-sdddisk4/snapraid.content' a mount point? Is WinSCP letting me look into the disks?


      I can't believe that I have lost ALL my files.

      Any other ideas or guidance? (thanks for your help so far)
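On the mount point question: a path such as /srv/dev-disk-by-label-sdddisk4/snapraid.content is a file path, and the directory in front of it is where the disk is expected to be mounted. The sketch below reads the `disk` lines from the config and asks `mountpoint` (util-linux) whether each directory currently has a filesystem mounted on it. It assumes the config lives at /etc/snapraid.conf; `conf_disks` and the `SNAPRAID_CONF` variable are hypothetical names used only here. An unmounted path is just an empty directory on the system disk, which would also be consistent with the "on the same device" messages earlier in the thread.

```shell
#!/bin/sh
# Print "name path" for every data disk declared in a snapraid config file.
conf_disks() {
    awk '$1 == "disk" { print $2, $3 }' "$1"
}

# Assumed config location; override with SNAPRAID_CONF (hypothetical variable).
conf="${SNAPRAID_CONF:-/etc/snapraid.conf}"

if [ -r "$conf" ] && command -v mountpoint >/dev/null 2>&1; then
    conf_disks "$conf" | while read -r name path; do
        if mountpoint -q "$path"; then
            echo "$name: $path is mounted"
        else
            echo "$name: $path is NOT mounted"
        fi
    done
fi
```

To simply read the config on the console, `cat /etc/snapraid.conf` is enough under the same assumption about its location.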
    • The following image is from some time ago, before doing some upgrades:

      [Screenshot not preserved: OMV Storage, File Systems list]

      Note the Filesystem Type -> fuse.mergerfs and the Mount Point. The following is from WinSCP:

      [Screenshot not preserved: WinSCP view of the mount point]

      You can clearly see the mount point in the OMV image above. If you have set this up correctly, all your shares should be under that mount point.
    • crashtest wrote:

      Please show @curious1 what your MergerFS mount point looks like under Storage, Filesystems and an example path for looking at shared folders / files with WinSCP.
      Thanks guys. Here are the two items:



      The 2TBData3 is the 2TB drive I have attempted to replace the failed sdbdisk2, but I'm sure I screwed that up, even though I followed the guide instructions. How does this look to you?
    • curious1 wrote:

      The 2TBData3 is the 2TB drive I have attempted to replace the failed sdbdisk2, but I'm sure I screwed that up, even though I followed the guide instructions. How does this look to you?
      Like you've f* up. Look at the filesystem image: you were attempting to replace sdbdisk2. What jumps out and slaps you in the face? Compare that to 2TBData3.
    • geaves wrote:

      Like you've f* up, look at the filesystem image, you were attempting to replace sdbdisk2, what jumps out and slaps you in the face, compare that to 2TBData3
      Yeah, I know. I still have the bad drive. I thought I followed the guide properly, but obviously not. The main thing I see is that sdbdisk2 is btrfs and the 2TBData3 is ext4. I guess I followed the guide "too close to the letter", because it instructed me to use ext4 but gave no indication that the file systems should match.

      I still have the failed drive. And apparently haven't done anything (really) with the 2TBData3 drive. Am I completely hosed, or can I "rewind"?


    • curious1 wrote:

      Am I completely hosed, or can I "rewind"?
      Well, at least you have spotted your own error, and you should be able to rewind. But I would ask: why btrfs? And if you read the guide, keeping labels simple makes sense. Mine are simply labelled disk1 through to disk4, then for snapraid, parity and data1 through to data3.
    • This happens a lot. Physically identifying the right (dead or dying) disk is crucial, or one might replace a good disk and compound the problem, having lost track of which disk is which. With traditional RAID, going down a path like that can result in killing the array.

      All of your drives are unmounted and missing. How did that happen? I'm hoping you didn't take them all out and shuffle them around. Those drives need to be installed and/or remounted or you're not going to see anything, in the way of files, under the mount point.

      A disk's attributes (and the serial number, which is usually printed on the label) can be found under Storage, Disks and in the SMART attributes. Get your disks back in place, and in the proper order. Mount them, in File Systems.
      Find the bad drive. Replace that drive, in the same slot, with a new drive. Wipe the new drive (in Storage, Disks), format it with BTRFS (Storage, File systems). (In your case, you didn't do labels originally so I wouldn't apply a label to the new drive before it's restored.)
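For matching physical drives to device names from the console, the udev symlink directories are a handy cross-check: the names under /dev/disk/by-id normally include the model and serial number, and /dev/disk/by-label shows which device carries which label. A minimal sketch, assuming a Linux system with GNU `readlink`; `list_links` is a hypothetical helper name.

```shell
#!/bin/sh
# For every symlink in a directory, print "link-name -> resolved target".
list_links() {
    for link in "$1"/*; do
        [ -L "$link" ] || continue
        echo "$(basename "$link") -> $(readlink -f "$link")"
    done
}

for dir in /dev/disk/by-id /dev/disk/by-label; do
    if [ -d "$dir" ]; then
        echo "== $dir"
        list_links "$dir"
    fi
done

# For a single disk, smartctl -i /dev/sdX (from smartmontools) also prints
# the model and serial number; sdX is a placeholder for the real device.
```

Compare the serial numbers printed here with the stickers on the drives before pulling anything.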

      Then go to the guide and do the SNAPRAID drive restore steps.
      __________________________________________________

      Your combined mergerfs mount point is:

      srv/300ee52a- etc., etc.

      This is where you'd look for your files and shares, using WinSCP, after all drives are mounted again.
      __________________________________________________
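Once the drives are mounted again and files are visible under the mergerfs mount point, copying a share to an external USB drive can be done with `cp -a`. This is a sketch under stated assumptions: the USB drive is already mounted somewhere (the /mnt/usb path in the message is only an example), and `copy_folder`, `SRC` and `DEST` are hypothetical names. The source path is left for the reader to fill in because the real mount point is not shown in full in this thread.

```shell
#!/bin/sh
# Copy everything in the source folder (hidden files included) into the
# destination folder, keeping timestamps and permissions where the target
# filesystem supports them.
copy_folder() {
    mkdir -p "$2" && cp -a "$1"/. "$2"/
}

# Placeholders: SRC is a share under the mergerfs mount point, DEST is a
# folder on the mounted USB drive. Nothing is copied until both are set.
SRC="${SRC:-}"
DEST="${DEST:-}"

if [ -n "$SRC" ] && [ -n "$DEST" ]; then
    copy_folder "$SRC" "$DEST" && echo "copied $SRC to $DEST"
else
    echo "set SRC and DEST first, e.g. DEST=/mnt/usb/Documents"
fi
```

On a FAT or exFAT formatted USB drive, `cp -a` may print warnings about ownership it cannot preserve; the file contents should still be copied.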

      The guide assumes EXT4 because EXT4 and XFS are the file systems most compatible (with the fewest issues) when using SNAPRAID. Obviously, EXT4 is the most common. In your case, since you have BTRFS on the original drives, that's what you'd use.
    • Thanks geaves and crashtest. You guys are really great.


      geaves wrote:

      mine are simply labelled disk1 through to disk4, then for snapraid parity and data1 through to data3
      I will go back to the simple naming convention when I apply your collective guidance.


      crashtest wrote:

      All of your drives are unmounted and missing. How did that happen? I'm hoping you didn't take them all out and shuffle them around. Those drives need to be installed and/or remounted or you're not going to see anything, in the way of files, under the mount point.
      I do not have any idea why my disks are unmounted and missing. I didn't do anything to cause that (that I know of - I am very careful not to implement things that I don't understand, present situation excepted). I did not rearrange the drives. I ran a boot diagnostic on the server, which showed me which drive had failed, and then attempted the replacement of just that drive. I did notice the drives showing as unmounted and missing, but wasn't sure what to do about that.

      crashtest wrote:

      Find the bad drive. Replace that drive, in the same slot, with a new drive. Wipe the new drive (in Storage, Disks), format it with BTRFS (Storage, File systems). (In your case, you didn't do labels originally so I wouldn't apply a label to the new drive before it's restored.)
      The OMV Guide for Snapraid, in their example, talks about the bad drive "red" being replaced and the new drive being labeled "rednew", so that's why I chose the 2TBDisk2 name. So when I put the old/bad drive back in, remount everything, and then remove and replace the old/bad drive, do I simply name it the same? That would make sense to me on one level, but not sure that's what I'm supposed to do.

      I think you guys have given me some hope. I am out of town for a few days and away from my server, but I will be following your guidance when I return. I hope you'll keep an eye out for me in the middle of next week.

      Thanks so much!!! A thousand likes for both of you!!!!
    • I don't think you want to change any of the existing disk label names until after you are completely done with the recovery of the failed disk.
      --
      Google is your friend and Bob's your uncle!

      RAID - Its ability to disappoint is inversely proportional to the user's understanding of it.

      ASRock Rack C2550D4I - 16GB CC - Silverstone DS380
    • curious1 wrote:

      So when I put the old/bad drive back in, remount everything, and then remove and replace the old/bad drive, do I simply name it the same?
      From what I saw, you didn't name the original drive at all (no Label), so it picked up the device name for the SATA port. So, during or after formatting, I wouldn't give the new drive a label. But, to get it to work correctly (as in picking up the same device name), the new drive will need to be in the slot where the bad drive was.

      I would remove the bad drive, install the new one in the same slot, and boot up. Format the new drive with the same file system as the old one, and try to mount all of them in the File System window. If they all mount, hopefully, the SNAPRAID drive restore process will work from there.
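For reference, the console equivalent of the restore process follows the recovery sequence in the SnapRAID manual: fix the replaced disk, check it, then sync. The sketch below only prints the commands so they can be reviewed before anything is run. Data2 is the disk name from the config posted earlier in this thread, and `recovery_plan` is a hypothetical helper name.

```shell
#!/bin/sh
# Print (do not execute) the SnapRAID recovery commands for one data disk.
#   -d NAME     limit the operation to the named disk
#   -l fix.log  write a log of the fix to fix.log in the current directory
#   -a          audit only: verify file hashes without using parity
recovery_plan() {
    disk=$1
    echo "snapraid -d $disk -l fix.log fix"
    echo "snapraid -d $disk -a check"
    echo "snapraid sync"
}

recovery_plan Data2
```

The sync comes last on purpose: as noted earlier in the thread, a sync run before the fix has completed would record the broken state.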