OMV - ZFS - Web GUI

    • Resolved
    • OMV 4.x
    • OMV - ZFS - Web GUI

      I made the switch from FreeNAS to OMV this past weekend. I ran into some trouble with the installer on my hardware, but I was able to install OMV4 over Debian 9 and successfully import my pool from FreeNAS. The pool originally had no issues and never had any under FreeNAS.

      Shortly after getting things up and running, while I was working on plugins and adding functionality, I noticed that my RAID-Z2 pool was degraded: 2 of my 4 drives were corrupt/offline and my spare drive was corrupt. I have forgotten the exact error on the two pooled drives. I rebooted, and the drives returned to Online without issues. I continued on, figuring one of the plugins I installed had messed with the mounts.

      The following day, I received an email alert that 2 drives had encountered SMART errors. I logged in that evening to find the same three drives in error states. At that point, to restore the drives, I tried a few things, including scrubbing the disks and running zpool clear to see if I could rid myself of the error messages. Finally, I offlined the drives, removed the hot spare, wiped it, and shuffled the drives around until both pool drives were resilvering and my hot spare showed as available. There isn't much time left on the resilvering process (started last night), and my intention is to reboot and give the drives a few days to see if any issues arise.

      The SMART errors were two ATA errors, SMART Read Log and SMART Write Log, which I think I've researched enough to say isn't a huge issue. The errors occurred on the two still-functional drives in the degraded pool.
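      For anyone retracing these steps, the commands involved look roughly like this (a sketch only; "tank" and /dev/sdb are placeholder names, not the actual pool or devices from this post):

      ```shell
      # Show overall pool health and which vdevs are faulted/degraded
      zpool status -v tank

      # Clear transient error counters on the pool (what was tried here)
      zpool clear tank

      # Start a scrub to verify all data checksums
      zpool scrub tank

      # Check a drive's SMART health and its logged ATA errors
      smartctl -H /dev/sdb
      smartctl -l error /dev/sdb
      ```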

      Any insight into what happened above would be appreciated, but this brings me to the only issue I haven't managed to solve yet. When I navigate to the ZFS page inside the web GUI, I get a general "Error" message with the description "communication failure". This occurred prior to resilvering, so a reboot of the machine has already been tried. I strongly suspect this is related to the zpool clear command, but I'm unsure. My first thought is to get the drives up and working again and let things simmer to watch for any additional errors. At that point I would remove the pool, uninstall the ZFS plugin, reinstall it, and import the pool again. I wanted to reach out to the forum first to see if anyone can shoot holes in this idea. My concern is the LXD containers I'm running, file shares, and other things, but I'm reasonably confident I'll be OK as long as I stop everything before proceeding.
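      If it does come to removing and re-importing the pool, a conservative sequence might look like the following (a sketch only; "tank" is a placeholder pool name and the service names are examples, not taken from this system):

      ```shell
      # Stop anything using the pool first (containers, shares);
      # adjust to whatever services are actually running
      systemctl stop smbd nmbd

      # Cleanly export the pool so nothing holds it open
      zpool export tank

      # ...reinstall the ZFS plugin here if desired...

      # Re-import by scanning attached devices; add -f only if the
      # pool was not exported cleanly
      zpool import tank
      ```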

      My hardware is -

      Intel Xeon E3-1231 v3 processor
      Supermicro MBD-X10SL7-F-O motherboard (RAID controller disabled/JBOD)
      2x 4-bay 5.25" iStar drive cages with hot-swap backplanes
      5x 3 TB WD Red drives in RAID-Z2 with hot spare
      32 GB Crucial ECC RAM
      Corsair RM Series 450W 80+ Gold PSU

      In case anyone finds that any of that matters....

      Also, since this is my first post, I just wanted to say a huge thank you to the dev(s?) behind OMV. This is one of the few, if not the only, packages of its kind, and the functionality is amazing. Keep up the great work; it's impressive! If things continue to run smoothly, and there's the ability to do so, I'll put my money where my mouth is and donate.

      Thanks again!
    • RFrost619 wrote:

      At this point I would then remove the pool, uninstall the ZFS plugin, reinstall ZFS, and import the pool again.
      While I don't use ZFS for anything important, I don't think uninstalling the plugin will help here. Does the ZFS plugin ever show you information? The communication error message is usually caused by high load or something timing out.
      omv 4.1.14 arrakis | 64 bit | 4.15 proxmox kernel | omvextrasorg 4.1.13
      omv-extras.org plugins source code and issue tracker - github

      Please read this before posting a question and this and this for docker questions.
      Please don't PM for support... Too many PMs!
    • Well, this is embarrassing... Menu now loads just fine... It consistently errored out until this point.

      It seems that the pools have finished resilvering as well. I could have sworn this was an issue prior to my switcheroo. I was also up with this until 2 this morning, maybe I’m speaking from my rear end.

      Just for clarification:

      When we talk about high usage, which I read about and dismissed, are we talking system utilization? My RAM/CPU usage was light: less than 5% CPU and less than 30% RAM. Or, in this case, pool IO utilization? I could see where resilvering 50% of my pool would cause a few reads/writes....

      Thanks for the quick response!
    • RFrost619 wrote:

      When we talk about high usage, which I read about and dismissed, are we talking system utilization? My RAM/CPU usage was light: less than 5% CPU and less than 30% RAM. Or, in this case, pool IO utilization? I could see where resilvering 50% of my pool would cause a few reads/writes....
      Both could cause the problem, but in this case, high disk IO while the plugin is trying to gather a lot of info about the pool could cause it too.
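      One way to tell whether disk IO (rather than CPU/RAM) is the bottleneck, e.g. during a resilver, is to watch pool- and device-level statistics (a sketch; "tank" is a placeholder pool name, and iostat comes from the sysstat package):

      ```shell
      # Per-vdev read/write ops and bandwidth, refreshed every 5 seconds
      zpool iostat -v tank 5

      # Per-device extended stats; a %util column near 100% means that
      # disk is saturated even if CPU and RAM look idle
      iostat -dx 5
      ```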