MergerFS pool becomes corrupted for an unknown reason

  • I've made a MergerFS pool that spans the entire size of 8 drives. In this pool is a single parent folder called "Share," under which all my storage and media files reside in sub-folders. This "Share" folder is mapped as an SMB share, and it is also mapped to a Docker storage container to which my other Docker containers have access.


    I did a fresh install of OMV 4.x about a month ago and recreated this setup. After about a week, I noticed that my NZBGet container was giving Unrar errors indicating that there was not enough free space to unpack downloads (Unrar error 5). Then ruTorrent started giving hash errors: when a new file finished downloading, the subsequent hash check reported missing pieces (Hash check on download completion found bad chunks, consider using "safe_sync"). Eventually, I found that deleting the MergerFS pool and creating a new one resolved both problems. So I did that, updated the mappings for Docker, and all was right in the world again. This lasted about two weeks, and then once again, overnight, my NZBGet and ruTorrent downloads started failing with the same issues.


    I don't want to continue this cycle of having to delete and re-create a new pool every few weeks. I don't know if there's something that triggers this corruption. Any ideas on how to identify the problem? I have run fsck on all drives with no filesystem errors to be found thus far.

  • The recreation "fixing" the issue doesn't make sense. mergerfs doesn't interact with your data. It's a proxy / overlay.


    Could you provide the settings you're using? It's really not possible to comment further without such details.


    Are you using the drives out of band of mergerfs? Are drives filling?

  • Thanks for chiming in. It doesn't make sense to me, either, but I'm not sure what else could be the cause.


    The drives are used ONLY in the context of the pool; nothing writes data to any of the individual drives. At this point, I am unable to download any new data to confirm that the pool is still filling per policy, but up to this point, yes, all data was written to the drives as I would expect based on the policy I had selected:


  • Unfortunately, there is nothing I can do without additional information. If it's saying you're out of space then something is returning that. The only time mergerfs explicitly returns ENOSPC is when all drives become filtered and at least one reason was minfreespace.


    Next time an error occurs please gather the information as mentioned in the docs or at least `df -h`.
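To make the ENOSPC condition described above concrete, here is a minimal Python sketch of an mfs-style ("most free space") create decision with a minfreespace filter. It is a simplified model for illustration only, not mergerfs's actual code: the function name `pick_branch_mfs` and the sample free-space figures are hypothetical.

```python
import errno

GIB = 1024 ** 3


def pick_branch_mfs(free_bytes, minfreespace):
    """Pick the branch with the most free space (simplified mfs model).

    free_bytes maps a branch path to its free space in bytes.
    Branches with less than minfreespace free are filtered out first.
    Raises OSError(ENOSPC) when every branch has been filtered out.
    """
    eligible = {b: f for b, f in free_bytes.items() if f >= minfreespace}
    if not eligible:
        raise OSError(errno.ENOSPC, "all branches are below minfreespace")
    return max(eligible, key=eligible.get)


# Hypothetical numbers: only a4 is above the 20G threshold.
free = {
    "/srv/dev-disk-by-label-a1": 5 * GIB,
    "/srv/dev-disk-by-label-a4": 300 * GIB,
}
chosen = pick_branch_mfs(free, 20 * GIB)
```

To feed in real figures, `shutil.disk_usage(path).free` could be called for each branch path. In this model, an out-of-space error from the pool itself only appears once every branch is under the threshold, which is why the per-drive `df -h` output matters.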

  • I must be overlooking something because I can't seem to find instructions for capturing info in the mergerfs docs.


    Edit: For some reason, it appears as though data has begun accumulating on disk a4 exclusively instead of spreading across the disks as intended.


    Here is the output from df -h



  • I just deleted and re-created a new mergerfs pool again, and immediately after mapping everything to the new pool, data is able to be downloaded/unpacked via nzbget and hash checks no longer fail in rtorrent/rutorrent. The strange part is that it seems like the data was able to be downloaded, it was just the unpacking/verification stages that were failing.


    I have no doubt that there is some issue external to mergerfs that is causing this behavior; I just don't know where to begin.

  • Thank you @trapexit. I didn't want to waste your time poring through my data if I found another culprit. However, so far, I've been unable to make any progress, so here we are:


    I am not sure how to run strace on a Docker container that's running and resulting in these errors. Any simple command I could run in terminal that could be traced instead?


    Output of `uname -a`:

    Linux ratsvault 4.19.0-0.bpo.1-amd64 #1 SMP Debian 4.19.12-1~bpo9+1 (2018-12-30) x86_64 GNU/Linux

    And the entry in fstab:

    /srv/dev-disk-by-label-b1:/srv/dev-disk-by-label-b2:/srv/dev-disk-by-label-b3:/srv/dev-disk-by-label-b4:/srv/dev-disk-by-label-a1:/srv/dev-disk-by-label-a2:/srv/dev-disk-by-label-a3:/srv/dev-disk-by-label-a4 /srv/ceb94f6f-2407-4c37-9eb3-a737c3af08cf fuse.mergerfs defaults,allow_other,use_ino,dropcacheonclose=true,category.create=mfs,minfreespace=20G 0 0
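Since the fstab entry is one long line, it can help to split it into its parts when checking which branches and options are actually in effect. The following is a rough Python sketch: the helper name `parse_mergerfs_fstab` is hypothetical, and the entry is copied from the post above.

```python
ENTRY = (
    "/srv/dev-disk-by-label-b1:/srv/dev-disk-by-label-b2:"
    "/srv/dev-disk-by-label-b3:/srv/dev-disk-by-label-b4:"
    "/srv/dev-disk-by-label-a1:/srv/dev-disk-by-label-a2:"
    "/srv/dev-disk-by-label-a3:/srv/dev-disk-by-label-a4 "
    "/srv/ceb94f6f-2407-4c37-9eb3-a737c3af08cf fuse.mergerfs "
    "defaults,allow_other,use_ino,dropcacheonclose=true,"
    "category.create=mfs,minfreespace=20G 0 0"
)


def parse_mergerfs_fstab(line):
    """Split one fuse.mergerfs fstab line into branches, mountpoint and options."""
    fields = line.split()
    if len(fields) < 4 or fields[2] != "fuse.mergerfs":
        raise ValueError("not a fuse.mergerfs fstab entry")
    options = {}
    for opt in fields[3].split(","):
        key, _, value = opt.partition("=")
        # Flags without a value (e.g. allow_other) are stored as True.
        options[key] = value or True
    return {
        "branches": fields[0].split(":"),
        "mountpoint": fields[1],
        "options": options,
    }


parsed = parse_mergerfs_fstab(ENTRY)
for branch in parsed["branches"]:
    print(branch)
print(parsed["options"])
```

For this entry the parsed result shows eight branches, the `mfs` create policy, and a `minfreespace` of `20G`.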
  • Your post of the fstab entry is truncated. Don't use nano for this, try cat instead

    --
    Google is your friend and Bob's your uncle!


    OMV AMD64 7.x on headless Chenbro NR12000 1U 1x 8m Quad Core E3-1220 3.1GHz 32GB ECC RAM.

  • Whoops, edited. And this wouldn't fit in my previous post.



  • This doesn't show your fstab mergerfs entry if that's what you were trying to show.


  • @trapexit I played around in the terminal trying to figure out a command that would cause an error and had no luck. Moves, copies, unpacking, etc. all worked fine, and the issues continued to be experienced only within Docker containers.


    For the containers that were affected, I removed the mapping of a storage container and mapped the path instead, and it seems the problem may be resolved (at least for now, I have been downloading new media for a couple of days without issue). So either the Docker plugin isn't playing nice with MergerFS, or vice versa.


    This works:



    This doesn't:


  • Well, I spoke too soon, because NZBGet has begun reporting the "out of space" errors again. Will have to do some more digging.


    Edit: After more research, disk quotas seem to be the issue. I have never set up any quotas since installing OMV, but I noticed NZBGet throwing an error about exceeding disk quotas. So I disabled them using `sudo quotaoff -a` and re-ran a few downloads that had failed to unpack immediately before. This time, all three downloads unpacked successfully. Is there a more permanent way to ensure that quotas do not get enabled automatically on a reboot or other system event? Can I simply remove the quota arguments in the mntent sections of the config.xml file?
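One way to see which underlying drives are mounted with quota options is to scan `/proc/mounts`. Below is a small Python sketch along those lines. The function name `mounts_with_quota` is hypothetical, and the sample text is an assumed example of what an ext4 data drive with journaled quotas might look like, not output taken from this system.

```python
QUOTA_KEYS = ("usrquota", "grpquota", "usrjquota", "grpjquota", "jqfmt", "quota")


def mounts_with_quota(mounts_text):
    """Return {mountpoint: [quota options]} for /proc/mounts-style text."""
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        opts = [o for o in fields[3].split(",") if o.split("=", 1)[0] in QUOTA_KEYS]
        if opts:
            found[fields[1]] = opts
    return found


# Assumed sample; device names and options are illustrative only.
SAMPLE = (
    "/dev/sda1 /srv/dev-disk-by-label-a4 ext4 "
    "rw,relatime,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group 0 0\n"
    "/dev/sdb1 /srv/dev-disk-by-label-a3 ext4 rw,relatime 0 0\n"
)
report = mounts_with_quota(SAMPLE)
```

On a live system the same function could be given `open("/proc/mounts").read()` instead of the sample text.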

    • Official post

    disable quotas that will persist through a reboot?

    The quotas are applied to each drive, though, not to the mountpoint. The only way to remove them is to edit the mntent entries in `/etc/openmediavault/config.xml` and then run `omv-mkconf fstab`.
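As a rough illustration of the edit described above, this Python sketch removes the quota-related options from a single mount option string, which is the kind of change one would make by hand in each mntent entry. The helper name `strip_quota_opts` is hypothetical, and the sample option string is an assumption about what OMV writes, so compare it against the actual file before changing anything.

```python
QUOTA_OPTS = {"usrquota", "grpquota", "quota", "usrjquota", "grpjquota", "jqfmt"}


def strip_quota_opts(opts):
    """Return the comma-separated option string without quota-related options."""
    kept = [o for o in opts.split(",") if o.split("=", 1)[0] not in QUOTA_OPTS]
    return ",".join(kept)


# Assumed example of an mntent option string; not taken from this system.
before = (
    "defaults,nofail,user_xattr,noexec,"
    "usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0,acl"
)
after = strip_quota_opts(before)
print(after)
```

The sketch only transforms a string; it does not touch `config.xml` itself, and a backup of that file before any manual edit is a sensible precaution.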

    Raid is not a backup! Would you go skydiving without a parachute?


    OMV 6x amd64 running on an HP N54L Microserver
