Shared Folder permission bug - maybe ZFS related?

peterkeyzers · 16. Juni 2021

Hi,

Having had my RAID setup fall over recently, I've been in the process of setting up a ZFS filesystem instead (thanks geaves!). It's going OK so far, but have noticed a bug when trying to set the ACL on a shared folder for the ZFS pool to allow users to write to a shared folder:

Code

Error #0:
OMV\ExecException: Failed to execute command 'export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin; export LANG=C.UTF-8; setfacl --remove-all --recursive -M '/tmp/setfaclpq5Tzs' -- '/srv/test/music/' 2>&1' with exit code '1': setfacl: /srv/test/music/: Operation not supported in /usr/share/openmediavault/engined/rpc/sharemgmt.inc:1061
Stack trace:
#0 /usr/share/php/openmediavault/rpc/serviceabstract.inc(588): Engined\Rpc\ShareMgmt->Engined\Rpc\{closure}('/tmp/bgstatuss1...', '/tmp/bgoutputqH...')
#1 /usr/share/openmediavault/engined/rpc/sharemgmt.inc(1068): OMV\Rpc\ServiceAbstract->execBgProc(Object(Closure), NULL, Object(Closure))
#2 [internal function]: Engined\Rpc\ShareMgmt->setFileACL(Array, Array)
#3 /usr/share/php/openmediavault/rpc/serviceabstract.inc(123): call_user_func_array(Array, Array)
#4 /usr/share/php/openmediavault/rpc/rpc.inc(86): OMV\Rpc\ServiceAbstract->callMethod('setFileACL', Array, Array)
#5 /usr/sbin/omv-engined(537): OMV\Rpc\Rpc::call('ShareMgmt', 'setFileACL', Array, Array, 1)
#6 {main}

The operation itself seems to succeed, in that I can now write via the share, but the error is a bit odd. Am I doing something wrong? Happy to provide more info if need be.

Thanks,

Peter.

crashtest · 20. Juni 2021

I'm not sure if the following commands are now part of a ZOL (ZFS On Linux) installation, but they were needed not long ago.

Run the following on the command line:
(Substitute the name of your pool in for ZFS1)

zfs set aclinherit=passthrough ZFS1

zfs set acltype=posixacl ZFS1

zfs set xattr=sa ZFS1

This is best done before copying data into the pool.

peterkeyzers · 21. Juni 2021

Zitat von crashtest

I'm not sure if the following commands are now part of a ZOL (ZFS On Linux) installation, but they were needed not long ago.

Run the following on the command line:
(Substitute the name of your pool in for ZFS1)

zfs set aclinherit=passthrough ZFS1
zfs set acltype=posixacl ZFS1
zfs set xattr=sa ZFS1

This is best done before copying data into the pool.

Alles anzeigen

Ah, didn't realise this. Thanks for letting me know. I had already copied 2TB of data to the pool, but in a separate problem, it seems to have become degraded/corrupted after the server's weekly reboot, which I'm somewhat puzzled by. So I might need to recreate the pool anyway, in which case I can apply this

crashtest · 21. Juni 2021

Zitat von peterkeyzers

it seems to have become degraded/corrupted after the server's weekly reboot, which I'm somewhat puzzled by. So I might need to recreate the pool anyway, in which case I can apply this

The only think that I can think of, that might result in data corruption, is a hardware fault or a misconfiguration. Also, you might look at your drives' SMART stats. ZFS can correct corruption with a scrub "but" data must be written to disk, without corruption, the first time.

peterkeyzers · 22. Juni 2021

Zitat von crashtest

The only think that I can think of, that might result in data corruption, is a hardware fault or a misconfiguration. Also, you might look at your drives' SMART stats. ZFS can correct corruption with a scrub "but" data must be written to disk, without corruption, the first time.

It's all been very odd. SMART is all fine, the data was all written smoothly (just over 2TB), zpool status returned no errors. (I've spent a week testing different configurations which have all survived reboots). Then, after this Sunday's scheduled reboot, bam.

I'm using an Icybox 8-bay JBOD enclosure with 4 drives in it, connected via eSata, not USB. Using drive Id, not names, so trying to follow all the good advice. 2 2-disk mirror vdevs, and about 20 different datasets with varying record sizes. So nothing too outlandish.

I did a mixture of creating the pool in OMV, but the datasets on the CLI. Would hope that wouldn't affect anything though.

One potentially interesting point. My layout was (roughly) as follows:

trove (pool)

-> development (dataset)

-> music (dataset)

-> docker (dataset)

-> config (dataset)

Because I wanted these shares to be visible at the levels in my diagram, I had to create the share as:

device: trove

mountpoint: music

as there wasn't an option to use trove/music as the device and no mountpoint. Would have hoped this wouldn't cause issues, but am open to trying something else!

Do you think the failed command above could have messed things up?

crashtest · 22. Juni 2021

Zitat von peterkeyzers

Because I wanted these shares to be visible at the levels in my diagram, I had to create the share as:

device: trove
mountpoint: music

I'm not exactly sure what you mean by the above. However:

With ZFS, the pool AND each child filesystem are separate block devices. That is to say they are all treated, by Linux, as is they are all individual hard drive partitions.

You can create a pool (in your case trove) then create a Linux sub-directory underneath of it. That will work and you can mount the sub-directory as you mentioned. However, you don't get the benefits of ZFS child "filesystems" that should be created within the parent pool.

_____________________________________________________

To create filesystems, in the GUI:

First create a pool. Immediately run the command lines noted above to create posix compliant ACL's / permissions :
Then click on / highlight the pool. Follow that by clicking on the button ADD OBJECT. The default is a Filesystem. Provide a name for the new filesystem and leave the mount point blank.

Once done, the filesystem will appear in the ZFS plugin, under the pool name. The filesystem will have individual, editable, properties and (more importantly) it can have it's own snapshots. It will also appear in Storage, Filesystems as an individual drive device.

Using a ZFS filesystem (not a Linux folder), a shared folder should be created as follows:
**Note the / in the path entry. This is necessary but not the default.**

peterkeyzers · 22. Juni 2021

Zitat von crashtest

I'm not exactly sure what you mean by the above. However:

With ZFS, the pool AND each child filesystem are separate block devices. That is to say they are all treated, by Linux, as is they are all individual hard drive partitions.

You can create a pool (in your case trove) then create a Linux sub-directory underneath of it. That will work and you can mount the sub-directory as you mentioned. However, you don't get the benefits of ZFS child "filesystems" that should be created within the parent pool.

_____________________________________________________

To create filesystems, in the GUI:

First create a pool. Immediately run the command lines noted above to create posix compliant ACL's / permissions :
Then click on / highlight the pool. Follow that by clicking on the button ADD OBJECT. The default is a Filesystem. Provide a name for the new filesystem and leave the mount point blank.

Once done, the filesystem will appear in the ZFS plugin, under the pool name. The filesystem will have individual, editable, properties and (more importantly) it can have it's own snapshots. It will also appear in Storage, Filesystems as an individual drive device.

Using a ZFS filesystem (not a Linux folder), a shared folder should be created as follows:
**Note the / in the path entry. This is necessary but not the default.**

Alles anzeigen

I tried that / in the path entry and that was what seemed to blow up, in that the error seemed to be trying to change permissions at the root of my os drive. So I therefore tried to use the trove device and typed a path of music, even though trove/music is a formal dataset. I don't think it should have messed anything up though?

Probably though, the error was due to what you originally corrected me on, so I think if I apply the acl settings, it should be ok. Will nuke the pool and start again from scratch, as it's been resilvering for over a day and hasn't made any progress at all, so is stuck somewhere...

crashtest · 22. Juni 2021

- If you have a pool /trove , then created a share with music/ in the shared folder dialog, you would have a ZFS pool with a Linux directory "music" at it's root.
- If you followed that up with a ZFS file system created in the pool /trove named music, an impossible situation has been created for the Linux OS.

There would one path /trove/music that leads to two places:

- /trove/music as an individual block device, (AND)
- /trove/music a block device with a Linux sub folder.

If that exists in your install, that would cause all kinds of weird effects.

peterkeyzers · 22. Juni 2021

Zitat von crashtest

- If you have a pool /trove , then created a share with music/ in the shared folder dialog, you would have a ZFS pool with a Linux directory "music" at it's root.
- If you followed that up with a ZFS file system created in the pool /trove named music, an impossible situation has been created for the Linux OS.

There would one path /trove/music that leads to two places:

- /trove/music as an individual block device, (AND)
- /trove/music a block device with a Linux sub folder.

If that exists in your install, that would cause all kinds of weird effects.

Alles anzeigen

Yeah, when you put it that way.... Thanks for your help, I'll do it the proper way and try again!

crashtest · 22. Juni 2021

(For users who may read this thread:)

Let us know what happens with the new build.

If you're interested in automated snapshots, give this -> doc a look.

peterkeyzers · 28. Juni 2021

OK, so reporting back after a week. Re-setup the pool, followed advice in the thread here, copied 2TB of data to it, ran a scrub, and 0 errors reported.

Ran my scheduled weekly reboot, and I'm immediately seeing array errors.

Code

➜  ~ zpool status
  pool: trove
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0B in 10:26:55 with 0 errors on Mon Jun 28 01:12:40 2021
config:

        NAME                                          STATE     READ WRITE CKSUM
        trove                                         DEGRADED     0     0     0
          mirror-0                                    DEGRADED     0     0     0
            ata-ST3000VN007-2AH16M_ZGY86G7F           FAULTED      0    53     0  too many errors
            ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N0FUSKEA  DEGRADED     0    26     0  too many errors
          mirror-1                                    DEGRADED     0     1     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N3ADU7R8  ONLINE       0     6     0
            ata-ST3000VN007-2AH16M_ZGY87H31           FAULTED      0    30     0  too many errors

Alles anzeigen

I'm completely stumped as to why a reboot could cause this. The scheduled job is nothing more than running reboot with no additional params as root. Any thoughts/suggestions appreciated at this point, because it's incredibly frustrating!

Update: Investigating this and this as they seem promising.

Peter.

crashtest · 28. Juni 2021

Because a few boxes must be checked:

- SMART stat's on your boot drive are clean.
- Did check the ISO download for integrity?
- I'm assuming you're no using ZOL encryption.
- Check smart stat's on data drives and insure all are clean (or at least within allowed spec's.)
- I'm assuming that you've ran or will run Extended SMART tests on your drives. Results?

Curiously, these are "write" errors. These are not "read" errors which are typical with failing hard drives.

- Are you using -> WD Red's (Shingled) Drives?

_____________________________________________________

Other considerations:

I would look at syslog and others in the GUI, and at dmesg on the CLI . For what condition, I can't tell you because I haven't seen this before.

Still one of the greater probabilities is with hardware. You said at the beginning that you had a RAID issue. Is this the same hardware and drives?

General things to look at:
What are you using for a drive host adapter? The MOBO's SATA ports? (Re-seat the data cables. It only takes one issue, with one drive, to hose things up.)

Power Supply 12V?
Bad contacts on PS plugs? (It only takes one bad or marginal power contact to create problems if drives are daisy chained.)

Then I would break it down and try again.

1. clean build (verifying the ISO with the MD hash)

2. format one pair of drives for a mirror. You might even consider using 1 drive, as a "basic" ZFS volume, with all other drives are unplugged, for a test. (One drive having issues can cause problems with all of them. Unfortunately, this requires the process of elimination.)
3 Add some data (I don't think it needs to be a lot), scrub, test with a reboot.

peterkeyzers · 28. Juni 2021

Hi,

Thanks for the detailed response!

Zitat von crashtest

Because a few boxes must be checked:
- SMART stat's on your boot drive are clean.
- Did check the ISO download for integrity?
- I'm assuming you're no using ZOL encryption.
- Check smart stat's on data drives and insure all are clean (or at least within allowed spec's.)
- I'm assuming that you've ran or will run Extended SMART tests on your drives. Results?

Curiously, these are "write" errors. These are not "read" errors which are typical with failing hard drives.

- Are you using -> WD Red's (Shingled) Drives?

Alles anzeigen

SMART stats are all good, did a long test for each drive with no problems found. This is an installation of OMV that's around 2 years old, and no problems apart from this one, so if the ISO was bad, would have seen it before now. No encryption is used.

Funnily enough, the first problem with the RAID array, I checked the disks, and one of them was a SMR drive. I replaced this with a Seagate Ironwolf NMR, which reported no SMART errors. So think I'm good on the types of disks in use.

I'm running the 4 drives from an ICYBOX 8-bay JBOD box. Connected via eSATA, not USB. Bought a new connector cable for it last Friday (shielded, just in case), but no change there. The server (a Dell T40) has one of these for the eSATA connection. I checked the seating of it today at lunchtime, all seems to be fine.

So after upping the sleeps as follows in /etc/defaults/zfs:

Code

# Wait for this many seconds in the initrd pre_mountroot?
# This delays startup and should be '0' on most systems.
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_INITRD_PRE_MOUNTROOT_SLEEP='30'

# Wait for this many seconds in the initrd mountroot?
# This delays startup and should be '0' on most systems. This might help on
# systems which have their ZFS root on a USB disk that takes just a little
# longer to be available
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_INITRD_POST_MODPROBE_SLEEP='30'

Alles anzeigen

I rebooted, and recreated an empty pool (using an ashift of 12). Without even creating a dataset or writing anything into it, I got a selection of read and write errors, and the pool quickly made itself unavailable.

This is in stark contrast to last week, where for 4 or 5 days, I had perfect behaviour (I nervously checked zpool status many times a day). The scrub ran for some time (12h) and reported no errors.

My suspicion is something hardware related, but just in case, I'm testing out a mergerfs+snapraid setup, just to rule out ZFS as the cause, in case it's a bit sensitive. I'm really starting to run out of ideas though!

I captured the zdb ouptut if that's of any use?

Thanks,

Peter.

crashtest · 29. Juni 2021

Zitat von peterkeyzers

My suspicion is something hardware related, but just in case, I'm testing out a mergerfs+snapraid setup, j

That's a good idea and mergerfs can detect bit-rot and other errors in scrubs. But note that mergerfs isn't as sensitive to read / write latency as ZFS or other forms of RAID might be. Mergerfs doesn't need equal bandwidth.

Zitat von peterkeyzers

I'm running the 4 drives from an ICYBOX 8-bay JBOD box. Connected via eSATA, not USB.

Note that the ICYBOX is a potential source of the issue and *ALL* drives are in it. Note to mention, when you're rebooting the server, you're not rebooting the ICYBOX. The reconnection might be hosing something up. Try shuting down the server, then shutting down the ICYBOX. Boot the ICYBOX, then boot the server. See what happens.

You could try connecting with USB3 interface as a test. It's a different interface / bridge. (Make sure you connect to a USB3 port on the Dell.)

**Edit** The E-Sata card is a generic Chinese card, probably with no support. I can't help but wonder what the latency and throughput might be. Definitely try the USB3 interface for a test.

peterkeyzers · 30. Juni 2021

Hi,

So, more testing! Setup a 3 disk merger FS system, with a single parity disk. All 3TB disks. Copied around 1TB of data yesterday, using epmfs as the mergerfs policy, ended up rebalancing using the mergerfs tools. Ran sync and 100% scrub, no problems reported.

Tried again this morning, wiping files (not formatting the disks though), and remounting the mergerfs filesystem with mfs instead of epmfs. Then tried copying over another 1TB of data. This time, I see the following part-way through:

Code

Jun 30 14:15:01 core kernel: [ 9087.053899] EXT4-fs error (device sdc1): ext4_find_extent:920: inode #114559341: comm mergerfs: pblk 458260480 bad header/extent: invalid magic - magic d8ff, entries 57599, max 4096(0), depth 17994(0)

Guessing this is an HDD problem of some sort, or should I be nuking the disks altogether with a secure format and ext4 filesystem creation?

Didn't think to mention earlier, but I'm using syncthing (running under Docker) to "copy" the files across. Is it possible this is causing an issue?

Haven't tried the USB3 connection yet... Also, if there's better recommendations for an interface card (ideally not costing hundreds of pounds :)), would love to hear!

Thanks,

Peter.

crashtest · 30. Juni 2021

All of your HDD's have the same problem? Statically, that's a very long shot. However, if syncthing under Docker was used for the copies, that's worth following up. Rsycn, which is clean and reliable, can be used to copy host to host.

I still like the interface possibility. Going USB3 eliminates the eSATA expansion card (which, notably, has some negative reviews for longevity) and there's a change of interface in the ICYBOX. A few possibilities are eliminated.

peterkeyzers · 3. Juli 2021

Zitat von crashtest

All of your HDD's have the same problem? Statically, that's a very long shot. However, if syncthing under Docker was used for the copies, that's worth following up. Rsycn, which is clean and reliable, can be used to copy host to host.

I still like the interface possibility. Going USB3 eliminates the eSATA expansion card (which, notably, has some negative reviews for longevity) and there's a change of interface in the ICYBOX. A few possibilities are eliminated.

Time for an update!

Seems a bit random which ones were failing, which is definitely a red flag. Syncthing is used to sync between different machines, some of which aren't rsync-friendly (e.g. overseas with non-tech-literate folks), so is a bit of a requirement. Worth mentioning that both the Icybox and card are new (e.g. a couple of months old) so longevity would hopefully not be an issue here - although that doesn't rule out faulty equipment in the first place!

Current experiment is with 3 drives mounted directly in the server, syncthing syncing between a few hosts. So far, no issues at all, will leave running for a day or so. Have rebooted a couple of times so far, no obvious explosions. If this looks good, will try USB3 from the Icybox to isolate whether it's the card or the box causing the issue...

peterkeyzers · 8. Juli 2021

OK, so another update after nearly a week!

I've managed to run several drives in the main server box itself, running mergerfs + snapraid quite happily, no issues at all. Then switched to a USB3-connected box with ZFS, 2 2-disk mirrored vdevs, so a total of just over 5TB of usable space. Uploaded 2TB of data, have done several reboots, and two full scrubs, and no problems at all.

So looks like the next thing to try is swapping out the eSATA card. If you have any solid recommendations, would be glad to hear them! But will do some reading online to try and find some good ones to pick from...

crashtest · 9. Juli 2021

I don't use consumer SATA cards. My preference if for used server hardware. Good, 8 port HBA's, can be had on the auction sites for less than $40. Flash it to JBOD and you'll have all the SATA/SAS ports you might need, if your case can handle more drives. Here's a thread if you're interested in the route. -> Cheap RAID cards.
_______________________________

In your case, the problem would seem to be narrowed to two possibilities:

- The ICY Box's eSATA interface (or)
- The eSATA adaptor (This one gets my vote. It's a generic card and the reviews had it as failing prematurely.)

Take your pick of the two above but it would make sense to go with the lowest cost option first. That appears to be the eSATA card.

Here's an article for -> picking a replacement card.
(Again, I've never used one of these cards, so I can't recommend anything.)

geaves · 9. Juli 2021

Zitat von crashtest

Again, I've never used one of these cards, so I can't recommend anything

Neither could I, reminds me of the long thread where the OP had used a cheap Chinese generic sata card, in cases like this the best option is not the cheapest and a 'branded' card such as StarTech that has their own site and support is the better option.

But if this was me I'd stick with USB3 and I found this from back in 2013

Shared Folder permission bug - maybe ZFS related?

Jetzt mitmachen!

Tags