ZFS Mirror Pool creation fails on LUKS-encrypted devices

    • OMV 4.x
    • Resolved
    • ZFS Mirror Pool creation fails on LUKS-encrypted devices

      I have a really weird problem with setting up a zpool consisting of a mirror of two HDDs.

      I'm working with:
      * openmediavault v4.1.12
      * linux kernel 4.18.6-1~bpo9+1
      * openmediavault-zfs v4.0.4 custom build (built using current master, with my "by-id" device discovery fix)
      * openmediavault-luks v4.0.3 custom build (using subzero79's upgrade from this thread)
      * two 10TB WD Red Pro drives

      So, my steps to create this pool were as follows:
      1. Two new LUKS devices have been created
      2. Both devices have been decrypted (and put into crypttab, but that's irrelevant) - see the command-line sketch after this list
      3. Both devices have been selected in a mirror configuration for the new zpool, using "by-id" device mapping (in the list they appear as "dm-1" and "dm-2"; "dm-0" is another drive I'm using in a "basic" zpool)
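
      For reference, steps 1 and 2 boil down to roughly the following commands (the device and mapping names are just from my setup, adjust them to yours):

      Shell-Script

      # create the LUKS containers (this wipes the drives!)
      cryptsetup luksFormat /dev/sdb
      cryptsetup luksFormat /dev/sdc
      # unlock them; the mappings show up as /dev/mapper/sdb-crypt and /dev/mapper/sdc-crypt
      cryptsetup open /dev/sdb sdb-crypt
      cryptsetup open /dev/sdc sdc-crypt
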
      The pool creation fails at the "zpool create" command:

      Shell-Script

      $ zpool create -m "/zfs-pools/vault-main-hdd-mirrorpool" "vault-main-hdd-mirrorpool" mirror dm-name-sdb-crypt dm-name-sdc-crypt
      cannot create 'vault-main-hdd-mirrorpool': one or more devices is currently unavailable
      What's weird about this problem is that right after that... the LUKS devices appear to be locked (encrypted) again.
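
      (For anyone who wants to verify that state: cryptsetup can report whether a mapping is still open; the mapping name below is from my setup.)

      Shell-Script

      # reports "is active" while unlocked, "is inactive" once the mapping is gone
      cryptsetup status sdb-crypt
      ls /dev/mapper/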

      I've tried using both of these drives in "basic" pools; for both of them I was able to successfully create new pools, so the problem seems to be strictly related to mirrored vdevs.

      I've also tried to manually create the mirrored pool, and what I found out is that the creation fails only if I create the GPT label for these drives first (which is what the OMV-ZFS plugin does automatically before invoking the "zpool create" command). If the GPT labels are not created, the pool creation command works just fine... I was able to use that pool to create a new volume, place files on that volume using shares, etc.
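
      For completeness, the manual workaround is essentially just the same create command pointed at the whole, unpartitioned DM devices. If a GPT label already landed on the drives from a failed attempt, something like sgdisk --zap-all can wipe it first (destructive, and the exact paths obviously depend on your setup):

      Shell-Script

      # optional: wipe a leftover GPT label from a previous failed attempt (destroys the label!)
      sgdisk --zap-all /dev/mapper/sdb-crypt
      sgdisk --zap-all /dev/mapper/sdc-crypt
      # create the mirror against the unpartitioned DM devices
      zpool create -m "/zfs-pools/vault-main-hdd-mirrorpool" "vault-main-hdd-mirrorpool" mirror dm-name-sdb-crypt dm-name-sdc-crypt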

      And now, the fun part - this problem is NOT reproducible on a virtual machine (or at least I was not able to reproduce it with my virtual setup). I've created a VM with a simple OMV setup, upgraded it to the latest version, installed the same two plugins as on my bare-metal machine, and used the same drive layout as my machine (one SATA drive for the OS, one NVMe drive for the first, basic, encrypted pool, and two SATA drives that are supposed to emulate my WD Reds), however I've set them all to only 32GB of disk space (not really sure if that matters). Then I've created the same LUKS devices, decrypted them, and... I was not able to reproduce the problem - the mirrored zpool was created without errors, both with and without GPT labels.

      As you can see, technically speaking, I already have a workaround for the problem (manually creating the pool without partitioning the devices first), but I'm trying to wrap my head around this problem, understand it, and maybe push a patch for the ZFS plugin. As far as I understand, it's not actually required to create a GPT label before creating a pool, so maybe that's the way to go (however, I won't deny that I'm not a ZFS expert, and I couldn't find a definitive answer to that question in ZoL's documentation).

      So, does anyone have any idea what the hell is going on here, and what might be the cause of this problem? Honestly, I'm out of ideas about what to debug next, so any hint might be helpful.
    • I'll try to reproduce this. To do it on a VM you need to overprovision the disks. It's safe, just don't fill the disks with data.
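
      (If it helps: with qemu/KVM a thin-provisioned 10TB disk is a one-liner, since qcow2 only allocates space as data is written; VirtualBox/VMware have equivalent dynamically-allocated disk options. The image path below is just an example.)

      Shell-Script

      # creates a sparse 10TB image that only grows as the guest writes data
      qemu-img create -f qcow2 /var/lib/libvirt/images/wd-red-test1.qcow2 10T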

      I think the issue is the GPT label creation; after I commented it out, it started to work.

      The strange thing is that I then reverted the changes and went back to the default master branch, destroyed the pool and labels, and creation worked this time.

      I found another bug: when deleting a pool, it seems like an extra character is being appended to the device to clear. Instead of running

      zpool labelclear -f /dev/dm-1

      this gets executed:

      zpool labelclear -f /dev/dm-11
    • subzero79 wrote:

      ...
      zpool labelclear -f /dev/dm-1

      this gets executed:

      zpool labelclear -f /dev/dm-11
      I've already reported it here: github.com/OpenMediaVault-Plug…nmediavault-zfs/issues/50
      The problem is not limited to DM devices; the same thing happens when using e.g. NVMe drives.

      The biggest problem is that getDevDisks assumes the ZFS label has to be removed from the first partition (which is what e.g. "sda1" is).
      First, that's not always the case (with LUKS DM devices we were using the whole drive);
      second, the same naming scheme is not valid for other devices (e.g. NVMe, and apparently CCISS, devices have a letter "p" before the partition number).
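
      Just to illustrate the naming differences (this is only a shell sketch of the rules, not the plugin's actual PHP code):

      Shell-Script

      # hypothetical helper showing the per-device-type partition suffix rules
      part_of() {
          local disk="$1" num="$2"
          case "$disk" in
              /dev/dm-*|/dev/mapper/*) echo "$disk" ;;                        # whole DM device, no partition suffix
              /dev/nvme*|/dev/mmcblk*|/dev/cciss/*) echo "${disk}p${num}" ;;  # "p" separator before the number
              *) echo "${disk}${num}" ;;                                      # classic sdX1-style naming
          esac
      }
      part_of /dev/sda 1      # -> /dev/sda1
      part_of /dev/nvme0n1 1  # -> /dev/nvme0n1p1
      part_of /dev/dm-1 1     # -> /dev/dm-1 (instead of the bogus /dev/dm-11)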

      But I don't think this is related to the main problem - I think I've already re-encrypted my HDD drives (as in, created new LUKS devices, not just locked & unlocked them) and the creation problem still occurred.
      As we've both already said, the problem has to do with the GPT partition scheme creation - somehow it collides with "zpool create" (but it's still a mystery to me why zpool creation locked my devices).

      I'll try to create a new VM with overprovisioned drives and see if that makes any difference.

    • subzero79 wrote:

      Seems like the creation of GPT is not necessary, whether it's a straight block device or a device mapper.
      This plugin needs to be completely redone; it has been patched over the years to keep working.
      Removing the GPT creation from the RPC is pretty straightforward, so even though I agree that the current plugin's state is a mess, I could still quickly create a patch for that.

      However, are we 100% sure that this is the way to go? I really don't want to waste my time on a patch that won't be accepted because we didn't think of some edge case, or because it's not actually guaranteed to work that way.
    • dziekon wrote:

      However, are we 100% sure that this is the way to go?
      Not really. I would do a little bit more research. Maybe someone here who works as a sysadmin with ZFS in production on Solaris, BSD or Linux can provide more insight. The original person who created the plugin backend (years ago) came from a Solaris environment; I haven't seen him in a while.

      cc @miras
    • dziekon wrote:

      However, are we 100% sure that this is the way to go?
      ZFS on Linux will create the GPT if it isn't there. We aren't changing older versions of the plugin where it might act differently. So, I would say it can be removed.
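
      (For the record, this is easy to see on a scratch disk: hand zpool a whole disk and it does the partitioning itself. The disk name below is only an example.)

      Shell-Script

      # give ZFS a whole disk and let it do the labeling on its own
      zpool create testpool /dev/sdx
      # lsblk then shows the GPT partitions ZFS created (sdx1 for data, sdx9 reserved)
      lsblk /dev/sdx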
    • Ok, it took me some time to prepare a fix for the GPT labeling problem. I took my time to prepare a more "future-proof" solution that also solves the device referencing problems and should be a solid foundation for a future rework of the plugin. Feel free to review my code here: github.com/OpenMediaVault-Plug…penmediavault-zfs/pull/54

      So far, I've been able to confirm that the problem from the original post is now gone on both my VM and my real test machine. I was able to use SATA HDDs, an NVMe drive, and LUKS-encrypted devices without any problems in "basic" and "mirrored" configurations. I've also added a bunch of tests to the plugin's code itself, with more complex zpool setups.
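
      (A quick zpool status after each test run is enough to confirm the resulting topology; the pool name is the one from my original example.)

      Shell-Script

      # should list a mirror-0 vdev with both dm-name-* devices ONLINE
      zpool status vault-main-hdd-mirrorpool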
    • dziekon wrote:

      Feel free to review my code here
      That is an impressive amount of work. We appreciate all the help we can get :)
    • @dziekon

      Great, an additional developer for the zfs plugin is more than welcome. Thanks for the work you have done, even though I don’t have any problem at the moment. ;)

      Regards Hoppel
    • Ok, so now, with my PR merged into master, I have a question for you @ryecoaaron (I actually tried to ask it on Discord, but it seems you are not present there, and it looks like @subzero79 does not show up there very often) - when is this going to be released as a plugin update?
      Or, to phrase my question better, what is the general plugin deployment cycle for everything released as part of the omv-extras repo?

      It looks like my previous PR (#53 on GitHub) has not been deployed either, so I'm just curious what the usual ETA for plugin updates is. Obviously, I can build it myself (and already did), but it still feels a bit wrong to use unofficial versions. BTW, what's up with the "Releases" section of the ZFS plugin? The last version there is marked as 3.X.Y, which is not in sync with the real state of the omv-extras repo, and it doesn't look very promising to new developers willing to work on this plugin.

    • dziekon wrote:

      I actually tried to ask that question on Discord, but it seems you are not present there
      Discord? Never used it.

      dziekon wrote:

      when is this going to be released as a plugin update?
      Or, to phrase my question better, what is the general plugin deployment cycle for everything released as part of the omv-extras repo?
      I probably ran out of time and forgot. There isn't a cycle really. Plugins are released when the maintainer says they are ready.

      dziekon wrote:

      what's up with the "Releases" section of the ZFS plugin?
      That was something one developer was doing. Most do not do this. I don't see much benefit in adding releases for things that are in the package repo.

      dziekon wrote:

      it doesn't look very promising to new developers willing to work on this plugin.
      Because there is no up-to-date release? There are commits and an updated changelog, which I consider the source of truth over the releases.
    • Maintaining "releases" section (or tags directly in the repo) has a benefit of easier "release point" discovery for developers. Adding a tag adds an easily discoverable point in repo's history which determines the boundary of said version, which for example can be used to produce code diffs or commit logs between versions. Really useful when someone is trying to understand why, when and how certain things have been implemented or changed (which cannot be comprehensively expressed through debian's package changelog; but it serves a different purpose anyway), especially in open source projects, where literally anyone can decide that they want to help move things forward (but to do this, they need to understand how people before them "did things"). Plus, it's like two or three clicks to do that on Github, so it's really not that time consuming.
    • dziekon wrote:

      Maintaining "releases" section (or tags directly in the repo) has a benefit of easier "release point" discovery for developers. Adding a tag adds an easily discoverable point in repo's history which determines the boundary of said version, which for example can be used to produce code diffs or commit logs between versions. Really useful when someone is trying to understand why, when and how certain things have been implemented or changed (which cannot be comprehensively expressed through debian's package changelog; but it serves a different purpose anyway), especially in open source projects, where literally anyone can decide that they want to help move things forward (but to do this, they need to understand how people before them "did things"). Plus, it's like two or three clicks to do that on Github, so it's really not that time consuming.
      I understand how it works and agree with all of that. I would love to have everything official and done the right way (more automation, documentation, etc.), but some things are just low on my priority list. Sadly, most things OMV are low on my priority list these days. And because I wrote a developer plugin to build packages and commit to GitHub, I rarely go to the GitHub page, let alone go there to create a release. If you want to add this feature to the developer plugin, I am all for it.