Config
Hardware
- Machine: RockPro64 4GB
- Drives: I have two WD Red 4T (CMR) and a 6T (SMR), ext4 formatted.
Software
- SnapRAID: The 6T is a SnapRAID parity drive to the 4T data/content drives.
- UnionFS: The 4Ts are combined into UnionFS using Most Free Space and these options: defaults,allow_other,direct_io,noforget,use_ino
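To double-check which options are actually in effect on the union (as opposed to what the plugin page shows), something like the sketch below reads them back from the kernel's mount table. The helper name fuse_mounts is made up for illustration, and it assumes the union shows up with a filesystem type of the form fuse.<something>, such as fuse.mergerfs.

```shell
# Hypothetical helper: print mount point, type and options for every mount
# whose filesystem type starts with "fuse." (e.g. fuse.mergerfs).
# Reads a table in /proc/mounts format; defaults to /proc/mounts itself.
fuse_mounts() {
    awk '$3 ~ /^fuse\./ { print $2, $3, $4 }' "${1:-/proc/mounts}"
}

# Usage: fuse_mounts
```

If the union is mounted, the options column should include the ones listed above.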
The problem
Whenever I try to read from my unionfs filesystem, it does what I am calling a dirty unmount. Whenever I try to do an ls on an NFS mount (if I even get it to NFS mount, that is), or an rsync listing, or even just ls on the /srv entry, I get this:
# ls -lh /srv
ls: cannot access '/srv/a0ea2e22-75eb-48b4-91de-581e48f9185b': Transport endpoint is not connected
total 292K
drwx------ 2 root root 4.0K Apr 19 21:57 0704abaf-3022-42f5-8463-164439c29626
drwx------ 2 root root 4.0K Apr 19 22:32 3123f3df-6a19-4016-85b1-be04b8f707d9
drwx------ 2 root root 4.0K Apr 18 15:14 606f6613-657f-438c-8814-057bf2656bc3
drwx------ 2 root root 4.0K Apr 18 17:06 896654c1-0327-4a25-9640-19c45acc8399
d????????? ? ? ? ? ? a0ea2e22-75eb-48b4-91de-581e48f9185b
drwxrwxrwx 1 root root 256K Dec 31 1969 dev-disk-by-label-BACKUPS
drwxr-xr-x 5 root root 4.0K Apr 22 00:47 dev-disk-by-label-fedaykin1
drwxr-xr-x 5 root root 4.0K Apr 22 00:47 dev-disk-by-label-fedaykin2
drwxr-xr-x 3 root root 4.0K Apr 21 15:14 dev-disk-by-label-naib1
drwx------ 2 root root 4.0K Apr 19 08:30 df0502bc-286b-4c2f-9888-4caae260f710
drwxr-xr-x 2 ftp nogroup 4.0K Aug 28 2019 ftp
rsync listing error
# rsync rsync://localhost/share
rsync: readdir("." (in share)): Software caused connection abort (103)
drwxrwsrwx 4,096 2020/04/21 22:10:41 .
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1668) [generator=3.1.2]
an ls
# ls -lh /srv/a0ea2e22-75eb-48b4-91de-581e48f9185b/
ls: reading directory '/srv/a0ea2e22-75eb-48b4-91de-581e48f9185b/': Software caused connection abort
total 0
Rebooting is the only way I have found to reliably get out of this state. Doing a fusermount -u followed by a mount does not seem to work well.
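To be concrete about the unmount/remount attempt, here is a minimal sketch of the sequence. It assumes the union has an /etc/fstab entry, so that a bare mount <dir> can find it. The function names is_stale and remount_union are made up for illustration; fusermount -u -z is the lazy variant of the unmount.

```shell
# Hypothetical helper: succeeds (exit 0) when the directory cannot be listed,
# which is what happens after "Transport endpoint is not connected".
# Note: it also succeeds for a path that does not exist at all.
is_stale() {
    ! ls "$1" >/dev/null 2>&1
}

# Hypothetical helper: lazily unmount a dead FUSE mount, then remount it.
# Assumes the mount point is listed in /etc/fstab.
remount_union() {
    fusermount -u -z "$1" && mount "$1"
}

# Usage (run as root), with the mount point from the listing above:
#   UNION=/srv/a0ea2e22-75eb-48b4-91de-581e48f9185b
#   is_stale "$UNION" && remount_union "$UNION"
```

The check and the remount are kept separate so the check can be run on its own, for example right after an ls or rsync listing, to see exactly which operation knocks the mount over.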
Possibly related symptom
Whenever I try to list the contents of any shared folder on the unionfs via the CLI, it comes back with nothing. No errors, just a blank listing.
Things I have tried to fix it
- Rebuild the whole thing several times
- Only bring in small amounts of data
- Stop trying to do an NFS mount
- Searched /r/openmediavault, this forum, YouTube videos, the mergerfs docs and FAQ
- Prayed to an eldritch god
Most recently, I tried a simple cp on the command line, copying 70 MB of files from the /srv/DIR straight to a shared folder on the unionfs (so cp /srv/[uuid]/* /sharedfolder/[unionfs_folder]). I can do an ls on the shared folder with no problem and everything seems to work. But when I then do an rsync listing (so rsync rsync://localhost/[unionfs_folder]), it comes back empty.
What does work
Thing is, I can do a non-listing rsync to and from a unionfs folder. I can do an rsync job from OMV between the ufs and a shared directory on a USB drive.
I have been able to do remote and local rsync to put files on the drives via the union. The filesystem listing from OMV shows the correct amount of space being used both by the individual drives and the unionfs. Plus, doing an ls /srv/dev-disk-by-label-DRIVE shows all of the files I expect, and they are accessible. My data is still there! I just can't access it via the union.
The SnapRAID content files seem good, too, but I haven't paid a ton of attention to that. A snapraid sync completes without error.
Also, importantly, everything works fine if I set up shared folders directly on the drives instead of on the union: listing, NFS, rsync, it all works. The issue only surfaces when I bring mergerfs into the equation.
My question
HAAALLLP!!!!1
OK, but seriously: it seems like my only solution is to stop using unionfs, but that feels like quitting and I can't accept that just yet.
- Has anybody experienced this behavior?
- Am I doing something dumb and apparently can't read documentation properly?
- What else can I do to further diagnose this issue?