Accidentally corrupted ext4 Raid 6 partition tables

    • Accidentally corrupted ext4 Raid 6 partition tables

      Alright... I already really, really hate myself for this, and it obviously ruined my holidays. So before everyone flames me for installing a new version of OMV without first unplugging all the hard drives, let me just get this out of the way: yes, I realize that's what I should have done... omg, yes... I regret it immensely. Still, please take pity on me and help someone who barely knows what he's doing get his life back.

      10x WD Red 4TB drives in a RAID 6 configuration, running OMV version 0.5.48.

      I was told that the reason I couldn't use the last 7TB of space in the configuration was that this older version of OMV didn't support arrays bigger than 32TB. Whether or not that's the case, the point is that I went ahead and tried to install the newest version of OMV on my USB stick. Instead of getting the screen where you select which drive to install OMV onto, I guess my wireless keyboard stayed stuck on the Enter key, and the installer proceeded to format /dev/md127 (my actual raid) rather than the USB stick. I killed the system as soon as I realized what was happening, but I know that a lot of damage can be done in 8 seconds...

      The system still recognizes the 10 drives as an array, superblock is persistent... but the filesystem is missing (I'm guessing the partition tables are messed up?)

      Where should I start? How do I get my life back? Please help. </3
      Images: raid is messed up.jpg, raid missing.jpg

      The post was edited 1 time, last by trillance.

    • I don't know what I'm doing... but I hope this is what you wanted me to do? Looks really scary to me... no mention of my raid array in there.

      "sdi" is the USB stick I'm running everything off of. All hard drives in the system (10x4TB) are supposed to make up one array holding a single xfs filesystem.

      (The second screenshot is the result of running # blkid)
      Images: df-h.jpg, blkid.jpg


    • Ahhh!!!! This is so scary!!!! At this rate, it would write 4TB before it is done? (My raid is only 40TB... 32TB without the redundancy... this seems like far more data than just the partition tables... ahhhh!!)

      How do I know it is resyncing towards the state we want, and not towards the formatted state that the installation initiated?

      Also, I forgot to mention... when the system was first booted after the accident, my friend said, "Ah! Just recover it from the web GUI," and went ahead and clicked md127 as the recovery drive (when I think you're only supposed to add new drives that way, to replace a broken one when a drive fails)... The process was killed right away, but I wonder if that changes how the system perceives the raid array.

      I'm barely exaggerating when I say my life is on that server... isn't there a non-destructive way to simply extract my data, without risking making an error that would wipe all my work?

      I'm so sorry for panicking... You are clearly one of the most knowledgeable people here, and I should trust you... but since I really don't understand what's going on... would you mind explaining it to me a bit, so my nerves don't kill me over the next 440 minutes? ;(
      Images: IMG_1206-reduced.jpg
    • trillance wrote:

      How do I know it is resyncing towards the state we want, and not towards the formatted state that the installation initiated?
      You should be able to access the data while it is resyncing. It will be slow, though, and it will slow the sync down. You might have to mount the drive (mount -a).

      trillance wrote:

      The process was killed right away, but I wonder if that changes how the system perceives the raid array.
      Hard to say what that did.

      trillance wrote:

      I'm barely exaggerating when I say my life is on that server... isn't there a non-destructive way to simply extract my data, without risking making an error that would wipe all my work?
      I feel bad for people who don't back up their files. RAID is not a backup. It is meant to let you keep working when up to two drives fail (in your case). Like I said before, you should be able to access everything while it is resyncing. What I told you to try is much less dangerous than what you did in the web interface.
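For reference, the "40TB raw, 32TB without the redundancies" figures quoted earlier in the thread follow from the two-drive tolerance mentioned above: RAID 6 spends two drives' worth of space on parity. A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def raid6_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a RAID 6 array.

    Two drives' worth of space holds parity, which is what lets the
    array survive up to two simultaneous drive failures.
    """
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least four drives")
    return (drive_count - 2) * drive_size_tb


# The array from this thread: 10 drives of 4 TB each.
print(raid6_usable_tb(10, 4))  # 32
```

This only describes capacity. It says nothing about protection against an accidental format, which is why a separate backup is still needed.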

      trillance wrote:

      You are clearly one of the most knowledgeable people here, and I should trust you... but since I really don't understand what's going on... would you mind explaining to me a bit what's going on,,, so I my nerves don't kill me over the next 440 minutes?
      See above. No need to wait for 440 minutes. Once it is done resyncing, make a backup please.
      omv 4.0.6 arrakis | 64 bit | 4.12 backports kernel | omvextrasorg 4.1.0
      omv-extras.org plugins source code and issue tracker - github.com/OpenMediaVault-Plugin-Developers

      Please don't PM for support... Too many PMs!
    • At least through FTP or Windows Explorer, I couldn't access the drive at all... also, the web GUI still says the filesystem is missing...

      From the server, I couldn't change folder to /dev/md127 (<- I think that's how I would see the content straight from the server?)

      My biggest concern is that the array seems to suggest only 4TB of it was used... when I have some 22TB of data on there... I really don't understand what's going on. :-S Shouldn't I have to rebuild the partition tables, so that the filesystem becomes recognized again?
      Images: server status.jpg
    • trillance wrote:

      from the server, I couldn't change folder to /dev/md127
      Nope, you can't do that. /dev/md127 is a block device, not a directory. You need to access it via its mount point, which you can find in /etc/fstab.
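To illustrate how the mount point can be read out of /etc/fstab: each non-comment line starts with the device (or UUID) followed by the directory it is mounted on. The sketch below parses fstab-style text. The sample contents and UUIDs are hypothetical and are not the poster's real file.

```python
def mount_points(fstab_text: str) -> dict:
    """Map each device spec in fstab-style text to its mount point."""
    result = {}
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 2:
            # Field 1 is the device or UUID, field 2 is the mount point.
            result[fields[0]] = fields[1]
    return result


# Hypothetical sample, NOT taken from the server in this thread.
SAMPLE_FSTAB = """\
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=1111-aaaa  /                 ext4  errors=remount-ro  0 1
UUID=2222-bbbb  /media/2222-bbbb  xfs   defaults,nofail    0 2
"""

for device, directory in mount_points(SAMPLE_FSTAB).items():
    print(device, "->", directory)
```

On the real machine the same function could be fed the text of /etc/fstab, and the directory listed for the array is the path to cd into once the filesystem is mounted.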

      trillance wrote:

      Shouldn't I have to rebuild the partition tables, so that the filesystem becomes recognized again?
      You have to fix the array first. Did it finish re-syncing? Check with cat /proc/mdstat.
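As a rough illustration of what to look for in /proc/mdstat, the sketch below pulls the progress line out of mdstat-style text. The sample text is made up to resemble a 10-drive RAID 6 resync and is not the poster's actual output. If no progress line is present, no sync is running.

```python
import re

# Matches a progress line such as:
#   [>....]  resync =  1.2% (46882633/3906886144) finish=440.0min speed=150000K/sec
PROGRESS = re.compile(
    r"(?P<action>resync|recovery|reshape|check)\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s*speed=(?P<speed>\d+)K/sec"
)


def sync_progress(mdstat_text: str):
    """Return the first sync progress entry as a dict, or None if idle."""
    m = PROGRESS.search(mdstat_text)
    if m is None:
        return None
    return {
        "action": m["action"],
        "percent": float(m["pct"]),
        "done_blocks": int(m["done"]),
        "total_blocks": int(m["total"]),
        "finish_min": float(m["finish"]),
        "speed_k_per_sec": int(m["speed"]),
    }


# Hypothetical sample, NOT the output from the server in this thread.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sda[0] sdb[1] sdc[2] sdd[3] sde[4] sdf[5] sdg[6] sdh[7] sdj[8] sdk[9]
      31255089152 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
      [>....................]  resync =  1.2% (46882633/3906886144) finish=440.0min speed=150000K/sec

unused devices: <none>
"""

print(sync_progress(SAMPLE_MDSTAT))
```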
    • trillance wrote:

      I should read the /etc/fstab config file to figure out where the mount point would be?
      Post the output of blkid and cat /etc/fstab
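For context on what blkid is looking for: filesystems are identified by well-known signature bytes near the start of the device. The sketch below checks only two of them, the XFS magic "XFSB" at offset 0 and the ext2/3/4 magic 0xEF53 stored little-endian at offset 1080. It is a simplified illustration, not how blkid itself is implemented, and it runs on synthetic buffers rather than a real disk.

```python
def detect_filesystem(first_bytes: bytes) -> str:
    """Guess the filesystem from the first 2 KiB of a device.

    Only two signatures are checked; anything else is "unknown".
    """
    # XFS superblock starts with the ASCII magic "XFSB".
    if first_bytes[:4] == b"XFSB":
        return "xfs"
    # ext2/3/4 superblock starts at offset 1024; its magic 0xEF53 sits
    # 56 bytes in, stored little-endian.
    if first_bytes[1080:1082] == b"\x53\xef":
        return "ext2/3/4"
    return "unknown"


# Synthetic buffers standing in for the start of a device.
xfs_like = b"XFSB" + bytes(2044)
ext_like = bytearray(2048)
ext_like[1080:1082] = b"\x53\xef"
blank = bytes(2048)

print(detect_filesystem(xfs_like))         # xfs
print(detect_filesystem(bytes(ext_like)))  # ext2/3/4
print(detect_filesystem(blank))            # unknown

# On a real system the buffer could come from a read-only open, e.g.
#   with open("/dev/md127", "rb") as dev:
#       print(detect_filesystem(dev.read(2048)))
# That needs root, and the device path is the one named in this thread.
```

Opening the device in "rb" mode only reads, so a check like this does not modify the array.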

      trillance wrote:

      And I'm still super nervous about the 4TB sync size... any idea why it's sync-ing only 4TB worth of data?
      Where are you seeing that it is only syncing 4TB of data?
    • trillance wrote:

      Well, 440 min x 60 sec x 150 MB/sec is just short of 4TB? I'm basing that on the first image showing it syncing... also, I'm reading the progress (#/total) indicator as 4TB?
      It may only have to sync one drive. mdadm knows nothing about the data on the drives, so this output shouldn't worry you as long as the sync is in progress or complete.
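The estimate in the quote can be checked directly. The sketch below redoes the multiplication using decimal units (1 TB = 1,000,000 MB). As far as I know, md also counts RAID 5/6 resync progress against the size of a single member device, since all members are processed together stripe by stripe, which would be consistent with a total of about 4TB for 4TB drives. Treat that last point as an assumption rather than something confirmed in this thread.

```python
def synced_terabytes(minutes: float, mb_per_sec: float) -> float:
    """Data covered by a sync running at a steady rate, in decimal TB."""
    seconds = minutes * 60
    megabytes = seconds * mb_per_sec
    return megabytes / 1_000_000


# Figures from the post: 440 minutes at roughly 150 MB/sec.
print(synced_terabytes(440, 150))  # 3.96
```

So the roughly 4TB figure matches the size of one member drive rather than the amount of user data stored on the array.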