MergerFS + SnapRAID - Am I missing something?

  • Hi,


I'm doing some experiments with an OMV VM running mergerfs + SnapRAID. I created 4 x 60GiB virtual disks. Three of them were merged into one mergerfs pool with the policy "Most free space". I configured SnapRAID with the same 3 data disks and 1 parity disk. Then I copied about 30 GiB of files into the shared merged disk and made the first sync. And here is where I think I'm missing something about using mergerfs + SnapRAID together: I started deleting one file at a time, and with the command snapraid fix -m I successfully recovered each specific file. But once the number of deleted files grows, there are many unrecoverable errors and not all deleted files can be recovered.
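For reference, a SnapRAID configuration matching the layout described (three data disks, one parity disk) might look roughly like the following sketch; all paths here are invented:

```
# snapraid.conf sketch: 3 data disks + 1 parity disk (paths are hypothetical)
parity /srv/disk4/snapraid.parity

# Keep copies of the content (index) file on more than one disk
content /var/snapraid.content
content /srv/disk1/snapraid.content
content /srv/disk2/snapraid.content

data d1 /srv/disk1
data d2 /srv/disk2
data d3 /srv/disk3
```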

  • But are the merged 3x60 GB virtual disks the OP is using not considered one filesystem?

    Inwin MS04 case with 315 W PSU

    ASUS Prime H310i-Plus R2.0 board

    Two port PCI-E SATA card

    16GB Kingston DDR4

    Intel Pentium Coffee Lake G5400 CPU

    Samsung Evo M.2 256GB OS drive

    4x4TB WD Red NAS drives + 1x4TB + 1x5TB Seagate drives - MergerFS pool

    Seagate 5TB USB drives - SnapRAID parity x 2

Sure, but one parity disk would cover a three-disk pool. So I'm not sure what trapexit meant, as the OP said he copied his files "inside the shared merged disk."


Sure, but one parity disk would cover a three-disk pool. So I'm not sure what trapexit meant, as the OP said he copied his files "inside the shared merged disk."

    SnapRAID doesn't look at mergerfs pools. It only looks at individual disks.

    --
    Google is your friend and Bob's your uncle!


    OMV AMD64 7.x on headless Chenbro NR12000 1U 1x 8m Quad Core E3-1220 3.1GHz 32GB ECC RAM.

    • Official Post

Sure, but one parity disk would cover a three-disk pool. So I'm not sure what trapexit meant, as the OP said he copied his files "inside the shared merged disk."


    What I think trapexit was getting at is:


The key to using SnapRAID is understanding that it was designed for data that is "relatively" static; that point is made in SnapRAID's own documentation. SnapRAID was NOT designed for high-I/O environments or relational databases.


Static data, which is distributed across more than one disk, is the basis for "on-demand" parity calculations. As an example: if multiple files are deleted on a single disk, that is the same as a partial disk failure. With a single parity drive, the fix command can restore files on one drive, based on parity AND the data on the remaining "unaffected" disks.


If lots of deletes happen on more than one disk, this has the same effect as simulating multiple partial disk failures. A single parity disk, alone, cannot restore missing data across multiple disks. In the rebuilding process, if data from one disk is needed (along with parity) to rebuild a file on another disk and that data is not present (it was deleted), the result is an unrecoverable error.

Note: Using tiny virtual drives AND large media files on those tiny drives will greatly increase the chances of an unrecoverable error in a fix operation after deleting multiple files under Most Free Space. The chances of such an event happening with actual drives in the terabyte-plus range are far lower.


    The problem with the MergerFS Most Free Space directive, when used in conjunction with SnapRAID, is that Most Free Space scatters files across pooled disks. Several deleted files (that appear to be in the same folder) may actually be deleted from multiple physical disks. In a fix situation, this can appear to be a partial disk failure on more than one drive that may result in an unrecoverable SnapRAID error.


    Again, if data is mostly static, there's no problem. If normal sized drives are used, even with multiple deleted files across more than one disk in the pool, the chance of an unrecoverable error is very low.

    How to deal with this:
If you want to delete a lot of files and all else appears to be normal (no disk issues, no bad SMART stats, etc.), run a sync command after a very large delete. Otherwise, you might think about using more than one parity drive.
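The partial-failure reasoning above can be sketched with a toy XOR-parity model. This is only an illustration (disk contents are invented, and SnapRAID's real parity format is more involved), but single-parity recovery reduces to the same one-unknown-per-stripe rule:

```python
# Toy model of single parity: one parity block per "stripe" (the block at
# the same offset on each data disk). Disk contents are invented.

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# Three data disks, one block each in this stripe, plus computed parity.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripe)

# Case 1: one block deleted -> recoverable from parity + remaining blocks.
recovered = xor_blocks([parity, stripe[1], stripe[2]])
assert recovered == b"AAAA"

# Case 2: two blocks deleted in the SAME stripe -> one XOR equation with
# two unknowns. XORing parity with the lone survivor yields only the XOR
# of the two missing blocks, not either block individually.
combined = xor_blocks([parity, stripe[2]])
assert combined == xor_blocks([stripe[0], stripe[1]])
assert combined != stripe[0] and combined != stripe[1]
```

This is why deletes scattered across several disks behave like simultaneous partial disk failures: once two blocks of the same stripe are gone, a single parity block cannot separate them.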

OK thanks, a bit clearer. In my use case (config below), say I rip a CD and add it to the mergerfs pool; the tracks could possibly be spread over the six drives in the pool. I do a sync every night. Next day I accidentally delete the ripped CD. Would my two-parity setup be adequate to restore the files with snapraid fix?


    • Official Post

    For others who may read this thread:

I'm not sure whether SnapRAID calculates parity based on hard drive sectors, filesystem inodes, or some other indexing method, but since the minimum parity drive size is based on the largest protected data drive, I believe parity could line up with file sectors in a manner resembling the following graphic.

The following is obviously an exaggeration, because files are never of identical size, but the point made is still valid. The MergerFS Most Free Space directive will scatter files that appear to be in the same shared folder over all MergerFS pooled drives, based on percentage of fill. Since static data on all drives is part of the parity calculation, doing large deletes (several files) and then attempting to recover them could result in the following:
    _________________________________________________________________________

This depiction assumes several files have been deleted, AND that the file sectors of the deleted files (scattered over pooled drives) line up in the manner shown, AND that a recovery (snapraid fix) operation is immediately done to recover the deleted files. In the noted instances, there may not be enough static data, plus parity, for recovery. In that case, there would be an unrecovered file. This would also apply to a full drive recovery but, again, the only files that would be unrecoverable would be those that line up in a manner similar to what's shown.

In the graphic immediately below, file F1 is recoverable. Files F3 and FB are unrecoverable.



In the 2-parity-disk scenario (immediately above):
Files F1, F3, and FB are recoverable. Files F4, F8, and FC would be lost.
In both instances, if there were an actual hard drive failure, the remainder of the files on the failed disk would be fully recovered.

The reason why pa67 experienced the unrecoverable-files issue is that they were using tiny virtual drives (60GB) AND doing large deletes of files which, I'm going to speculate, were probably large music files or enormous video files. This is not a realistic test scenario.

With realistically sized hard drives (500GB+) and realistic deletes (usually a few files here and there), the chances of the above occurring become vanishingly small.
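The 1-parity vs. 2-parity outcomes described above follow from a simple counting rule, sketched here as a toy model (the stripe layouts are invented for illustration): a stripe can be rebuilt only if the number of missing blocks in it does not exceed the number of parity disks.

```python
# Toy model of the rule the graphics illustrate: a stripe is rebuildable
# only when its missing-block count <= number of parity disks.
# Stripe layouts below are invented.

def recoverable_stripes(stripes, parity_disks):
    """Each stripe is a list of booleans: True = block present,
    False = block deleted/missing. Returns which stripes can be rebuilt."""
    return [s.count(False) <= parity_disks for s in stripes]

# Each row is one stripe across three data disks.
stripes = [
    [False, True,  True],   # one block missing
    [False, False, True],   # two blocks missing
    [False, False, False],  # three blocks missing
]

# One parity disk rescues only the first stripe; two rescue the first two.
assert recoverable_stripes(stripes, parity_disks=1) == [True, False, False]
assert recoverable_stripes(stripes, parity_disks=2) == [True, True, False]
```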

    _____________________________________________________________________________________________


In my use case (config below), say I rip a CD and add it to the mergerfs pool; the tracks could possibly be spread over the six drives in the pool.

In your case, based on your signature, you're using 4TB drives. A set of compressed CD sound files (maybe 3MB total?), scattered over 6 x 4TB drives, is extremely small. Added to that, you have two (2) parity drives. This is what I meant by deleted file sectors aligning, in the manner depicted above, being a vanishingly small probability in a real-world scenario. (Similar to a meteor taking out your house.)

    With that said, (with the MergerFS Most Free Space directive in place) if you deleted 4TB of scattered video files AND immediately tried to do a snapraid fix to recover them, you might not be able to recover some of those files. The output of the log file would tell you which files failed and, in that scenario, that's what backup is for.

With the above noted, you're syncing frequently (every night). I hope you're using a good diff script. If not, the script that comes with the OMV6 SnapRAID plugin is pretty good and it's easy to configure.
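The scattering effect of the Most Free Space create policy can be sketched as a toy model. This is only an approximation of the policy's behaviour (disk sizes and file names are invented), but it shows how one album's tracks end up on several disks:

```python
# Toy model of the mergerfs "most free space" (mfs) create policy:
# each newly created file goes to whichever branch currently has the
# most free space. Disk sizes and track names are invented.

def place_mfs(free_space, files):
    """Assign each (name, size) file to the disk with the most free space."""
    placement = {}
    for name, size in files:
        disk = max(free_space, key=free_space.get)
        placement[name] = disk
        free_space[disk] -= size
    return placement

free = {"disk1": 100, "disk2": 90, "disk3": 80}
tracks = [(f"track{i:02}.flac", 30) for i in range(1, 5)]

placement = place_mfs(free, tracks)
# The four tracks of one "album" end up spread over all three disks.
assert len(set(placement.values())) == 3
```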

Thanks, very useful. I sync nightly and every fifth night do a 10% scrub using a helper script. Naturally everything is backed up to another device.


If lots of deletes happen on more than one disk, this has the same effect as simulating multiple partial disk failures. A single parity disk, alone, cannot restore missing data across multiple disks. In the rebuilding process, if data from one disk is needed (along with parity) to rebuild a file on another disk and that data is not present (it was deleted), the result is an unrecoverable error.

Thank you for pointing this out. This is absolutely essential to understand when using mergerfs and SnapRAID together. I had exactly that problem and only realised when it was too late. I had 5 x 500GB data disks and 2 x 1TB parity disks. When I deleted c. 1TB of media files (all jpgs and mp3/4s) which were scattered over 3 HDs, I was only able to recover around 60% of the data. All syncs had run perfectly well beforehand, so I did not understand why my data was not recoverable, even with 2 parity disks. It seems to be related to the point you are describing. What this means is that the two packages should never be used together, and especially not in mergerfs MostFreeSpace mode.

    OMV6 i5-based PC

    OMV6 on Raspberry Pi4

    OMV5 on ProLiant N54L (AMD CPU)

    • Official Post

I had exactly that problem and only realised when it was too late. I had 5 x 500GB data disks and 2 x 1TB parity disks. When I deleted c. 1TB of media files (all jpgs and mp3/4s) which were scattered over 3 HDs,

Well, you'd have to admit that deleting 1TB of data from an available pool of 1.5TB of disk space, scattered across all disks in the pool, is not "normal". Multiple simultaneous disk failures, even partial failures, are an extremely unlikely event. What most people don't realize is that RAID5 couldn't deal with that either. Both forms of protection are geared toward whole-disk failures, where the remaining healthy disks can be used to rebuild data.


    It seems to be related to the point you are describing. What this means is that the two packages should never be used together and especially not in mergerfs MostFreeSpace mode.

You have a legitimate point but, as it seems, most users don't take the time to educate themselves on MergerFS storage policies. Existing path, most free space, is a good policy for "data type" consolidation, but it creates storage imbalances for users with huge media stores (particularly where numerous video files are concerned).


In your case, at least SNAPRAID allowed for some recovery (60%) of your data store, and it gave you an indication that error conditions existed. With RAID5, that would not be the case. Also, where error conditions are concerned, RAID5 is absolutely silent. Further, RAID5 can silently corrupt data with no indication whatsoever and, in some cases, without the top-level filesystem detecting it. Where SNAPRAID shines is its ability to maintain data integrity, and it makes it possible to selectively recover files. RAID5 can do neither.


Bottom line: there is no substitute for 100% backup of your data store. While events like this are rare, as you know, they can happen. With full backup, recovery options are far less painful.

Thank you for pointing this out. This is absolutely essential to understand when using mergerfs and SnapRAID together. I had exactly that problem and only realised when it was too late. I had 5 x 500GB data disks and 2 x 1TB parity disks. When I deleted c. 1TB of media files (all jpgs and mp3/4s) which were scattered over 3 HDs, I was only able to recover around 60% of the data. All syncs had run perfectly well beforehand, so I did not understand why my data was not recoverable, even with 2 parity disks. It seems to be related to the point you are describing. What this means is that the two packages should never be used together, and especially not in mergerfs MostFreeSpace mode.

    They can be used together, you just have to understand what you're doing.

    I use mergerfs in read only mode. The benefit is that a media player needs only mount one share to see all files and the user doesn't have to browse through multiple shares to find what they're looking for.

    When writing files to the NAS, I copy directly to the individual disk via the discrete samba shares. This ensures that I know which disk the files are on, and it is up to me to manage free space. After ensuring a successful write, I do a snapraid sync.
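The read-only pooling described above can be set up by marking each branch read-only. A hypothetical /etc/fstab line might look like the following sketch (paths and extra options are assumptions; mergerfs uses the =RO suffix to mark a branch read-only):

```
# Pool three disks read-only for media players; writes go to the
# individual disks via their own discrete shares.
/srv/disk1=RO:/srv/disk2=RO:/srv/disk3=RO  /srv/pool  fuse.mergerfs  defaults,allow_other  0  0
```

With all branches read-only, a create policy like Most Free Space never comes into play, so SnapRAID only ever sees files written deliberately to known disks.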

  • Well, you'd have to admit that deleting 1TB of data from an available pool of 1.5TB of disk space, that is scattered across all disks in the pool, is not "normal".

Well, I agree, but it can easily happen to Nextcloud users. The MEDIA folder was mounted as persistent storage for a Docker container, and the developers forgot to mention that any upgrade of the NC container will erase that mounted folder (including all data). So it's not so unusual that a large amount of data can get lost. And I assumed that SnapRAID was protecting me, which was wrong.


  • RAID5 couldn't deal with that either.

I was actually in a RAID6-like setup, using 2 parity disks, and I still lost 40% of the data. I would say that is an unexpected result. But I am not sure exactly how SnapRAID uses the 2 parity disks. The log files during recovery only show read attempts on 1 parity disk.


    • Official Post

The MEDIA folder was mounted as persistent storage for a Docker container, and the developers forgot to mention that any upgrade of the NC container will erase that mounted folder (including all data)

    What!!?? 8| That doesn't seem like sound design to me. Docker containers can be torched and recreated in a matter of seconds. Data that containers collect or generate shouldn't be part of the create / destroy process. That would be a lot like OMV destroying all existing data, on data drives, as part of an OS rebuild. The OS and data drives are separated for a reason, to prevent data loss. For the exact same reasons, Dockers should be designed to do the same thing.

So it's not so unusual that a large amount of data can get lost.

It should be unusual. There's no way to protect against this type of data disaster on the local machine. Keep this in mind: without SNAPRAID, you would have recovered nothing. If you find a method that can 100% "regenerate" 1TB of lost data, from a 1.5TB pool of 3 drives, without using backup of some kind, I'd be really interested in what it is and how it's done.

    I was actually in RAID6 using 2 parity disks and I still lost 40% of the data. I would say that is an unexpected result.

    Mentioning RAID5 was for simplicity and feature comparison purposes. In your case, if you had only 1 parity drive (versus 2), the amount of recovered data would have been even less, with maybe around 30% recovered.
    ______________________________________________________________

    I don't use a Docker for media purposes, but:
    There must be some workaround method where the "Media folder" can be used as a transition location. As files are deposited in Media, maybe they could be rsync'ed out to a permanent location, then deleted from the Media folder. Otherwise, with the risk of something like this happening again, I'd be looking for another Docker.
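That staging idea could be scripted roughly as follows. This is a sketch in Python (rsync would normally do the copy step), and all paths and names here are invented:

```python
# Sketch of the staging workaround: files dropped into the container's
# "media" folder are copied out to a permanent location, then removed
# from staging so a container upgrade can't take them along.
# Paths are invented; rsync would typically replace the copy step.
import shutil
import tempfile
from pathlib import Path

def drain_staging(staging: Path, permanent: Path):
    """Copy every file out of the staging folder, then delete the original."""
    permanent.mkdir(parents=True, exist_ok=True)
    for item in staging.iterdir():
        if item.is_file():
            shutil.copy2(item, permanent / item.name)  # copy out first
            item.unlink()                              # then delete from staging

# Example with temporary directories standing in for the real folders:
tmp = Path(tempfile.mkdtemp())
stage, perm = tmp / "media", tmp / "library"
stage.mkdir()
(stage / "photo.jpg").write_bytes(b"data")
drain_staging(stage, perm)
assert (perm / "photo.jpg").exists() and not (stage / "photo.jpg").exists()
```

Run on a schedule (cron or a systemd timer), this keeps the Docker-mounted folder as a pass-through location rather than the permanent home of the data.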

  • What!!?? 8| That doesn't seem like sound design to me. Docker containers can be torched and recreated in a matter of seconds. Data that containers collect or generate shouldn't be part of the create / destroy process.

The container can be torched without causing problems to the mounted volumes, but the Nextcloud upgrade procedure is an automated process that deletes all folders inside the container and then downloads new files into the container. I also think it's not a very sound way to structure the upgrades, but maybe there are constraints on the NC side.

