Posts by ZeroGravitas23

    Alright, I wanted to give an update on this as I have had some success in troubleshooting and the pools seem to be holding steady right now.

    These are the troubleshooting steps I have taken so far, in case anyone needs to follow the same path:

    • Switched location of drives - this caused more drives in random pools to see checksum errors, so no consistency or answers here
    • Scrubbed and cleared the pools in question - afterwards more checksum errors showed up randomly but never an I/O error, which points to something in the data or power path
    • Checked HBAs - two Dell PERC cards and an internal motherboard HBA. All worked fine with the correct firmware
    • Switched mini SAS 8087 cables to backplanes - this was to see if certain cables were causing the problem. Nothing found as the checksum errors afterwards were random across cables
    • Mapped out drives in a grid according to the slot location on the Norco 4224 case to see if errors were occurring on certain bays. Nothing initially conclusive on this but more info further down.
    • Replaced PSU - this was to see if power was the issue. A PSU tester did show the old PSU was not delivering enough power, so this may have been part of the cause
    • Finally, looked at the grid of drives (17 total) in relation to the backplanes - this, I believe, was the issue. All drives in question were on the bottom 3 backplanes of the case.

      • The top 3 backplanes in the case go directly to the internal HBA on the motherboard, while the bottom 3 go to the two HBA cards
      • Given the notoriety of Norco and their backplanes, I had some extras on hand and swapped them in
      • So far, after two days (which includes scrubbing and clearing the ZFS errors), there have been no more checksum errors
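    For anyone who wants to do the same bay-to-backplane tally without drawing the grid by hand, here is a rough Python sketch. It assumes the 24 bays are numbered 1 to 24 row by row from the top and that each row of 4 bays is one backplane; that numbering is my assumption, so adjust it to however your case is labelled. The error counts in the usage note are made-up values, not my real numbers.

```python
from collections import Counter
from typing import Dict

# Assumption: 24 bays, 6 backplanes, 4 bays per backplane,
# bay 1 is the first bay of the top row.
BAYS_PER_BACKPLANE = 4
TOTAL_BAYS = 24


def backplane_of(bay: int) -> int:
    """Return the 1-based backplane number (1 = top) for a 1-based bay number."""
    if not 1 <= bay <= TOTAL_BAYS:
        raise ValueError(f"bay must be between 1 and {TOTAL_BAYS}, got {bay}")
    return (bay - 1) // BAYS_PER_BACKPLANE + 1


def errors_by_backplane(checksum_errors: Dict[int, int]) -> Dict[int, int]:
    """Sum per-bay checksum error counts into per-backplane totals."""
    totals: Counter = Counter()
    for bay, count in checksum_errors.items():
        totals[backplane_of(bay)] += count
    return dict(sorted(totals.items()))
```

    Feeding it something like `errors_by_backplane({3: 0, 14: 5, 17: 2, 22: 9})` groups the counts by backplane, so a cluster on backplanes 4 to 6 (the bottom three under this numbering) stands out right away.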

    I am going to keep checking on the drives, running a couple of SMART tests and transferring some data to them to see if anything else pops up, but given the testing it sure seems to be pointing to the backplanes. It is really frustrating to try and solve these issues as there are so many moving parts in the setup of these server cases, and this along with many other threads makes me want to switch to a Supermicro server. I can't do it right now but it is definitely on the upgrade path for the future.

    Thank you for the help on this, and if anything changes I will post updates to this.

    Thanks for being interested cabrio_leo!

    In fact it is not fixed yet but I am narrowing it down. I swapped the location of drives in that zpool with two other drives in a different zpool to figure out if it was something to do with the hard drives or something to do with the other hardware in the computer.

    In doing so, after rebooting and scrubbing the pools to fix the checksum issues, more checksum errors showed up, but this time on the pool that the second pair of drives belongs to.
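    The reasoning behind the swap test can be written down as a small Python sketch. It is only an illustration of the logic, with hypothetical drive names and slot numbers: if the same slots keep showing errors after different drives are moved into them, suspect the slot side (cable, backplane, HBA, power); if the same drives keep showing errors in new slots, suspect the drives.

```python
from typing import Dict, Tuple

# Each mapping is: drive name -> (slot number, has_checksum_errors)
Layout = Dict[str, Tuple[int, bool]]


def classify_swap(before: Layout, after: Layout) -> str:
    """Return 'slot', 'drive', or 'inconclusive' for a drive-swap test."""
    bad_drives_before = {d for d, (_, bad) in before.items() if bad}
    bad_drives_after = {d for d, (_, bad) in after.items() if bad}
    bad_slots_before = {s for _, (s, bad) in before.items() if bad}
    bad_slots_after = {s for _, (s, bad) in after.items() if bad}

    same_drives = bad_drives_before == bad_drives_after
    same_slots = bad_slots_before == bad_slots_after

    if same_slots and not same_drives:
        return "slot"   # errors stayed with the bays, not the disks
    if same_drives and not same_slots:
        return "drive"  # errors moved along with the disks
    return "inconclusive"


# Hypothetical example: drives A1/A2 had errors in slots 20/21, then were
# swapped with B1/B2 from slots 5/6, and the errors appeared on B1/B2.
before = {"A1": (20, True), "A2": (21, True), "B1": (5, False), "B2": (6, False)}
after = {"A1": (5, False), "A2": (6, False), "B1": (20, True), "B2": (21, True)}
result = classify_swap(before, after)
```

    With the example layout, `result` comes out as `"slot"`. Random errors across many bays will land in the "inconclusive" branch, which is a hint to look at something shared such as power.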

    Next, I ran memtest86 on the RAM because something seems to be going on with the hardware, and no files are ever really affected or listed as corrupted in the zpools when the pool gets "degraded". No errors on the ECC RAM showed up from the memory testing, so I am now onto the next step.

    The HBAs look good but I might try to reseat them to ensure they are OK. Next on the list are the backplane and the SAS connectors between the HBAs and the backplane. At the same time I want to take a look at the PSU, as I am getting some really weird errors of the form print_req_error: I/O error, dev "hard drive ID", sector #######.

    I am also unable to complete a SMART long test, as it keeps saying: Interrupted (host reset).

    I believe most of these oddities are occurring due to the same thing, some sort of inconsistent power or data to the drives but I really need to troubleshoot more as the original drives that had a checksum problem in the zpool are brand new with no recorded SMART issues. Once I get further I will update this post so there is some sort of resolution.

    Hey Everyone,

    Over the last couple of weeks I refreshed my OMV box with a fresh OS install and also transitioned from mergerfs + snapraid to ZFS using the ZFS plugin. Everything went very smoothly and it was working fine until a day or two ago, when I noticed one of the zpools had a degraded disk. All of the pools are set up with mirrored vdevs, two vdevs to each pool.

    A single vdev started showing this degraded drive yet there were no reported errors on the writes, reads, or checksums. I am a little stumped as they are new drives (the two in the degraded vdev) and there are no SMART errors on either drive.

    I have attached pictures of the pool (wingclipper in the pictures), the pool status, as well as the SMART info regarding the two drives.

    Also, I am currently trying to run a long SMART test on the first degraded drive to see if any further errors actually show up. Any ideas or help on what might be causing this would be wonderful. Could it be a cable, a backplane (as this is a Norco 4224 server case), or something else? The drives are connected from the backplane to an LSI HBA. Once the SMART test finishes I might turn off the box and reseat the drives to see if that does it, but I am unsure if it will help any.

    Just to clarify, the whole time I was trying to move from OMV 2 or OMV 3, I did not have any drives connected during or after each installation, except of course the SSD I was trying to install OMV 4 onto.

    I was attempting to install OMV 4 from a USB image onto an SSD with all of the extra data drives removed from the system. After installation, on the first reboot into the OS, it would drop into GRUB and not find any OS drive to boot from, even though the only drive available was the SSD that OMV 4 had just been installed onto.

    And Domi, you are 100% correct that the first time it tries to go into the OS it does not use the UUID for the boot. It tried to use something along the lines of /dev/sda when really the correct partition was /dev/sdg1 (this is also a little odd, since the OMV installation seems to have made several partitions on /dev/sdg and the boot partition was on sdg1 specifically).

    So something during installation must be reverting the GRUB setup to the /dev/*** nomenclature instead of the UUID. I am not versed enough in GRUB or Linux to understand why Debian is doing this, but it is quite frustrating if you don't know how to navigate GRUB well. Thank you again for all of the help so far. It has solved the problem for me, I just wish we knew the exact cause in case other users see the same thing.

    You are 100% correct. After I corrected the "linux" line in the boot menu, booted up, and ran all of the available updates for the system, it is now able to boot into OMV 4 successfully after rebooting.

    Thank you for all of the help, and I learned something from this which is the best part.

    So for anyone that seems to be having the same issue:

    In the GRUB boot menu, edit the entry and find the line that looks like this: linux /boot/vmlinuz-4.14 bla bla root=/dev/***
    Replace root=/dev/*** with the below (no spaces around the equals signs):
    root=UUID=<your UUID for the main OMV partition>
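    If it helps to see the substitution spelled out, here is a small Python sketch of the same edit done on a string. It is only an illustration of what changes on the line, not something the installer or GRUB runs. The sample line, the /dev/sda1 device and the UUID in the usage note are hypothetical placeholders.

```python
import re


def use_uuid_root(linux_line: str, uuid: str) -> str:
    """Replace the first root=... parameter on a GRUB 'linux' line with root=UUID=<uuid>."""
    # (?<!\S) makes sure we match a standalone "root=" parameter and not,
    # for example, the tail end of some other option name.
    new_line, count = re.subn(
        r"(?<!\S)root=\S+",
        lambda _match: f"root=UUID={uuid}",
        linux_line,
        count=1,
    )
    if count == 0:
        raise ValueError("no root= parameter found on this line")
    return new_line


# Hypothetical example values, not taken from a real system.
sample = "linux /boot/vmlinuz-4.14 root=/dev/sda1 ro quiet"
fixed = use_uuid_root(sample, "1234abcd-0000-0000-0000-000000000000")
```

    Here `fixed` keeps everything else on the line untouched and only swaps the device path for the UUID form, which is exactly the manual edit described above.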

    @ryecoaaron Any idea why this might be happening to some of us? The install seems to be choosing the incorrect partition when creating the initial boot sequence. Correcting it to the precise UUID in GRUB first, then booting into OMV and running any necessary system updates, seems to make the UUID stick.

    Just want to make sure other users are aware of how to fix this or why it might be happening.

    Thank you for the quick reply Bohatyr.

    Your steps worked like a charm (once I got the UUID correct haha).

    I did notice that the change isn't persistent, so how do we permanently change the boot so the UUID sticks? Do we just need to edit fstab?

    This is what is shown on the fstab after successfully changing GRUB and booting into OMV:

    quid7 and Bohatyr are correct from what I can see yet I am still having issues. When installing from a USB drive it boots to GRUB after the install. Going to edit the boot menu brings up what is below:

    From this I can see that the UUID is not right at all. Before trying this install I grabbed the UUID of the drive the OS was to be installed on, which is e984d5db-5116-4c31-bfe7-c2a39775f9eb. The UUIDs don't match at all, and the entire "search" section of the code here is something I don't think should be there. So I tried removing that entire if/else statement and edited the "linux" line so the UUID is present instead of the /dev/***. That can be seen below:

    Yet when trying to boot from this code another issue pops up. It cannot find the Linux setup:

    I don't know why, but from what I can see the install is definitely not setting up the boot sequence correctly. Any ideas on what can be tried to fix the boot?

    ryecoaaron, thanks for jumping in and giving us your thoughts on this. Any help is appreciated.

    I know that the installation guide suggests removing all data drives when installing OMV 3 or 4 and it seems most of us tried to follow that.

    Would you recommend trying to install OMV 4 with the entire array of drives (including the future OS drive, of course) present during the installation? Is there any reason the guide recommends removing the drives other than accidentally choosing a data drive as the installation location?

    We could obviously try that within a VM pretty quickly if we wanted to as well.

    Thank you Markess and vialcollet for the responses.

    That very well could be the root of the problem. I have a feeling that when installing off a USB drive it sometimes assigns the onboard drives to something other than SDA, which seems to mess the installation up once the USB drive is removed.

    Does anyone know how to reassign the disk we want as the OS drive to SDA during the installation of OMV 4 off of a USB drive? I did not see an option come up to change that.

    I just wanted to comment on this as I also recently tried to move from OMV 2 to OMV 4. I tried 3 or more times to install the base OMV 4 image onto an SSD from a USB image, and I saw the exact same error as above. I had all of the data drives disconnected while installing the OS except for the 1 SSD that was going to be used as the OS drive. Also, that SSD was connected through SATA and not through any RAID controllers.

    One thing I think may be the problem, which maybe someone above can help replicate, is that OMV 4 won't boot correctly its first time if a RAID controller is built into the motherboard. I believe this may be part of the problem because the motherboard I am using has a built-in LSI RAID controller which you can't stop from initializing on boot. Even if you disable it from initializing or loading an OS in the BIOS, it will still try to initialize during boot.

    After trying the installation several times I tried the OMV 3 image using a USB image and it booted up the first time with no issues. I am not sure why OMV 3 doesn't have the same issue as OMV 4 since it sounds like all of our installations were done using the same hardware as our previous versions of OMV.

    Any help on why this may be an issue and how to solve it moving forward would be wonderful in case we want to upgrade to OMV 4.

    I've had OMV2 running on Debian for about 6 months now with very few issues. A week or two ago I tried to access the WebGUI from another computer on the same LAN and wasn't able to find it. Since then I have never been able to connect to the WebGUI. Entering the IP that is assigned to it just comes up with a "This page does not exist" message.

    I can tell OMV is still running though as SMB still works, and within the machine itself Emby and Snapraid still function perfectly fine. All of those were also set up within OMV using the plugins.

    Since this is a Supermicro build there is also a second NIC which I can use to remote in using IPMI. That still functions properly as well.

    Does anyone have any ideas as to what might be going on? I changed nothing within the software or hardware and it just became inaccessible out of nowhere.

    Alright, I have tried that and it still gives me the same exact error in OMV under Plugins and Update.

    Now when I do that in the terminal and the system tries to update all of its sources, there are some errors, and I was wondering whether they are the underlying cause of the issue.

    So it looks like the Wheezy sources and the snapraid sources cannot be found. I am not sure if this is enough to cause the index issue with OMV.

    Any ideas?

    Hey Everyone, I'm new to OMV as well as Linux, although I have used Unix for quite some time.

    I have a media server with Debian (Wheezy) installed as well as mergerfs, snapraid, and OMV. It seems to be having issues updating the plugins list, and also when I go to the Update section in the GUI.

    I'm using Stoneburner 2.2.6 and the latest release of Debian, 7.11 (as listed in the OMV GUI).

    Below is the output of the error when I try to access the Plugin section or the Update section in the web GUI. Any help with this would be greatly appreciated!

    "The index of available plugins does not exist. Please re-synchronize the package index files from their sources.

    Error #6000:exception 'OMVException' with message 'The index of available plugins does not exist. Please re-synchronize the package index files from their sources.' in /usr/share/openmediavault/engined/rpc/ trace:
    #0 [internal function]: OMVRpcServicePluginMgmt->enumeratePlugins(Array, Array)
    #1 /usr/share/php/openmediavault/ call_user_func_array(Array, Array)
    #2 /usr/share/php/openmediavault/ OMVRpcServiceAbstract->callMethod('enumeratePlugin...', Array, Array)
    #3 /usr/sbin/omv-engined(500): OMVRpc::exec('Plugin', 'enumeratePlugin...', Array, Array, 1)
    #4 {main}"