Posts by ZeroGravitas23

ZeroGravitas23 · 29 January 2020

Alright, I wanted to give an update on this as I have had some success in troubleshooting and the pools seem to be holding steady right now.

These are the steps I have worked through so far, in case anyone needs to follow the same path:

Switched the location of drives - this caused more drives in random pools to show checksum errors, so no consistency or answers here
Scrubbed and cleared the pools in question - afterwards more checksum errors showed up randomly, but never an I/O error, which suggests a problem with data or power transfer
Checked the HBAs - two Dell PERC cards and an internal motherboard HBA. All worked well with the correct firmware
Switched the mini SAS 8087 cables to the backplanes - this was to see if certain cables were causing the problem. Nothing found, as the checksum errors afterwards were random across cables
Mapped out the drives in a grid according to their slot location in the Norco 4224 case to see if errors were occurring in certain bays. Nothing initially conclusive on this, but more info further down.
Replaced the PSU - this was to see if power was the issue. A PSU tester did show the old PSU not delivering enough power, so this may have been part of the cause
Finally, looked at the grid of drives (17 total) in relation to the backplanes - this, I believe, was the issue. All drives in question were on the bottom 3 backplanes of the case.
- The top 3 backplanes in the case go directly to the motherboard's internal HBA, while the bottom 3 go to the two HBA cards
- Given the notoriety of Norco and their backplanes, I had some extras on hand and swapped them in
- So far, after two days (including scrubbing and clearing the ZFS errors), there have been no more checksum errors
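
For anyone repeating the grid step, here is a rough sketch of how per-slot checksum counts could be rolled up per backplane. It assumes the 24 bays are numbered 0-23 from the top and that each of the 6 backplanes carries 4 bays. The function names and the sample counts are made up for illustration and are not the real numbers from this server.

```python
from collections import Counter
from typing import Dict

BAYS_PER_BACKPLANE = 4  # assumption: 24 bays split evenly across 6 backplanes
TOTAL_BAYS = 24


def backplane_for_slot(slot: int) -> int:
    """Return the backplane index (0 = top) for a bay slot numbered 0-23."""
    if not 0 <= slot < TOTAL_BAYS:
        raise ValueError(f"slot {slot} is outside the {TOTAL_BAYS}-bay chassis")
    return slot // BAYS_PER_BACKPLANE


def errors_by_backplane(checksum_errors: Dict[int, int]) -> Counter:
    """Sum per-slot checksum error counts into per-backplane totals."""
    totals: Counter = Counter()
    for slot, count in checksum_errors.items():
        totals[backplane_for_slot(slot)] += count
    return totals


# Hypothetical slot -> checksum error counts, for illustration only.
example = {2: 0, 5: 0, 13: 4, 14: 1, 18: 7, 22: 3}
totals = errors_by_backplane(example)
for backplane in sorted(totals):
    print(f"backplane {backplane}: {totals[backplane]} checksum errors")
```

If the errors pile up on the backplanes that share an HBA or a set of cables, that narrows things down much faster than looking at pools or individual drives.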

I am going to keep checking on the drives, running a couple of SMART tests and transferring some data onto them to see if anything else pops up, but given the testing it sure seems to point to the backplanes. It is really frustrating to try to solve these issues, as there are so many moving parts in the setup of these server cases. This, along with many other threads, makes me want to switch to a Supermicro server. I can't do it right now, but it is definitely on the upgrade path for the future.

Thank you for the help on this, and if anything changes I will post updates to this.

ZeroGravitas23 · 10 January 2020

Thanks for being interested cabrio_leo!

In fact it is not fixed yet but I am narrowing it down. I swapped the location of drives in that zpool with two other drives in a different zpool to figure out if it was something to do with the hard drives or something to do with the other hardware in the computer.

In doing so, after rebooting and scrubbing the pools to fix the checksum issues, more checksum errors showed up, but this time on the pool that the second pair of drives belonged to.

Next, I ran memtest86 on the RAM because something seems to be going on with the hardware, and no files are ever really affected or listed as corrupted in the zpools when a pool gets "degraded". The memory testing showed no errors on the ECC RAM, so I am now onto the next step.

The HBAs look good but I might try to reseat them to ensure they are OK. Next on the list are the backplane and the SAS connectors between the HBAs and the backplane. At the same time I want to look at the PSU, as I am getting some really weird print_req_error: I/O error, dev "hard drive ID", sector #######.

I am also unable to complete a SMART long test, as it keeps saying: Interrupted (host reset).

I believe most of these oddities are occurring due to the same thing, some sort of inconsistent power or data to the drives but I really need to troubleshoot more as the original drives that had a checksum problem in the zpool are brand new with no recorded SMART issues. Once I get further I will update this post so there is some sort of resolution.

ZeroGravitas23 · 7 January 2020

Hey Everyone,

Over the last couple of weeks I refreshed my OMV box with a fresh OS install and also transitioned from mergerfs + snapraid to ZFS using the ZFS plugin. Everything went very smoothly and it was working fine until a day or two ago, when I noticed one of the zpools had a degraded disk. All of the pools are set up with mirrored vdevs, two vdevs to each pool.

A single vdev started showing this degraded drive yet there were no reported errors on the writes, reads, or checksums. I am a little stumped as they are new drives (the two in the degraded vdev) and there are no SMART errors on either drive.

I have attached pictures of the pool (wingclipper in the pictures), the pool status, as well as the SMART info regarding the two drives.

Also, I am currently trying to run a long SMART test on the first degraded drive to see if any further errors actually show up. Any ideas or help on what might be causing this would be wonderful. Could it be a cable, a backplane (as this is a Norco 4224 server case), or something else? The drives are connected from the backplane to an LSI HBA. Once the SMART test finishes I might turn off the box and reseat the drives to see if that does it, but I am unsure whether it will help.

ZeroGravitas23 · 5 June 2018

Just to clarify, during the entire time of trying to move from either OMV 2 or OMV 3, I personally did not have any drives connected during or after each installation minus of course the SSD I was trying to install OMV 4 onto.

I was attempting to install OMV 4 from a USB image onto an SSD with all of the extra data drives removed from the system. Upon installation and rebooting for the first time into the OS it would venture into GRUB and not find any OS drive to boot from even though the only drive available was the SSD that was just used to install OMV 4 onto.

And Domi, you are 100% correct that the first time it tries to boot into the OS it does not use the UUID. It tried to use something along the lines of /dev/sda when really the correct drive was /dev/sdg1 (this is also a little odd, since the OMV installation seems to have made several partitions on /dev/sdg and the boot partition was precisely sdg1).

So something during installation must be reverting the GRUB setup to the /dev/*** nomenclature instead of using the UUID. I am not versed enough in GRUB or Linux to understand why Debian is doing this, but it is quite frustrating if you don't know how to navigate GRUB well. Thank you again for all of the help so far. It has solved the problem for me; I just wish we knew the exact cause in case other users see the same thing.

ZeroGravitas23 · 2 June 2018

You are 100% correct. After I corrected the "linux" line in the boot menu, booted up, and ran all of the available updates, the system is now able to boot into OMV 4 successfully after rebooting.

Thank you for all of the help, and I learned something from this which is the best part.

So for anyone that seems to be having the same issue:

Edit the GRUB boot entry and find the line that looks like: linux /boot/vmlinuz-4.14 bla bla root=/dev/***
Replace root=/dev/*** with the following (no spaces around the = signs and no quotes):
root=UUID=<your UUID for the main OMV partition>
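
The substitution itself can be sketched as below. This is only an illustration of the text change, not something GRUB runs. The function name, the sample kernel line and the all-zero UUID are placeholders I made up. As far as I know the kernel argument has to be written as root=UUID=<uuid> with no spaces around the = signs.

```python
import re


def use_uuid_root(linux_line: str, uuid: str) -> str:
    """Replace a root=/dev/... argument on a GRUB 'linux' line with root=UUID=<uuid>."""
    new_line, count = re.subn(
        r"\broot=/dev/\S+",
        lambda match: f"root=UUID={uuid}",  # lambda avoids escape handling in the UUID
        linux_line,
    )
    if count == 0:
        raise ValueError("no root=/dev/... argument found on this line")
    return new_line


# Hypothetical kernel line and placeholder UUID, for illustration only.
line = "linux /boot/vmlinuz-4.14 root=/dev/sda1 ro quiet"
print(use_uuid_root(line, "00000000-0000-0000-0000-000000000000"))
```

Everything else on the line stays untouched; only the root= argument changes.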

@ryecoaaron Any idea why this might be happening to some of us? The install seems to be choosing the incorrect partition when creating the initial boot sequence. Changing it to the precise UUID and then booting into OMV and running any system updates necessary seems to cement the UUID when corrected in GRUB first.

Just want to make sure other users are aware of how to fix this or why it might be happening.

ZeroGravitas23 · 1 June 2018

Thank you for the quick reply Bohatyr.

Your steps worked like a charm (once I got the UUID correct haha).

I did notice that the change isn't persistent so how do we permanently change the boot so the UUID sticks? Do we just need to edit fstab?

This is what is shown on the fstab after successfully changing GRUB and booting into OMV:

Code

# /etc/fstab: static file system information.
#                                                                                                      
# Use 'blkid' to print the universally unique identifier for a                                                                                                      
# device; this may be used with UUID= as a more robust way to name devices                                                                                  
# that works even if disks are added and removed. See fstab(5).                                                                                                      
# <file system> <mount point>   <type>  <options>       <dump>  <pass>                                                                                    
# / was on /dev/sdb1 during installation                                                                   
UUID=9ea1746d-8a08-49f3-8146-212c3b6264a4 /               ext4    errors=remount-ro 0       1 

# swap was on /dev/sdb5 during installation                                                                                                 
UUID=facef956-bc3f-406c-90f7-b76ed25c4e37 none            swap    sw              0      0                                                           


tmpfs           /tmp            tmpfs   defaults        0       0
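
On the fstab question, a quick way to check how the root filesystem is identified is to read the entry mounted at /. The sketch below does that for the entries in the listing above; the function name is made up for illustration. It reports that / is already mounted by UUID. Since fstab only controls how filesystems are mounted once the kernel is running, while the root= argument the kernel boots with comes from the GRUB configuration, the persistent fix is more likely on the GRUB side than in fstab.

```python
from typing import Optional


def root_source(fstab_text: str) -> Optional[str]:
    """Return the first field of the fstab entry mounted at '/', or None."""
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "/":
            return fields[0]
    return None


# Entries copied from the fstab listing in this post.
fstab = """
# / was on /dev/sdb1 during installation
UUID=9ea1746d-8a08-49f3-8146-212c3b6264a4 /               ext4    errors=remount-ro 0       1
# swap was on /dev/sdb5 during installation
UUID=facef956-bc3f-406c-90f7-b76ed25c4e37 none            swap    sw              0      0
tmpfs           /tmp            tmpfs   defaults        0       0
"""

source = root_source(fstab)
print(source)
if source is not None and source.startswith("UUID="):
    print("root is mounted by UUID")
else:
    print("root is mounted by device name")
```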

ZeroGravitas23 · 31 May 2018

quid7 and Bohatyr are correct from what I can see yet I am still having issues. When installing from a USB drive it boots to GRUB after the install. Going to edit the boot menu brings up what is below:

From this I can see that the UUID is not right at all. Before trying this install I grabbed the UUID of the drive the OS was to be installed on, which is e984d5db-5116-4c31-bfe7-c2a39775f9eb. The UUIDs don't match at all, and the entire "search" section of the code here is something I don't think should be there. So I tried removing that entire if/else statement and edited the "linux" line so the UUID is present instead of the /dev*. That can be seen below:

Yet, when trying to run the boot from this code another issue pops up. It cannot find the Linux set up:

I don't know why, but from what I can see the install is definitely not setting up the boot sequence correctly. Any ideas on what can be tried to fix the boot?

ZeroGravitas23 · 17 May 2018

That's good to know. I am going to try to install a VM on my OMV3 box to see what happens if OMV4 is installed with the data array present. Hopefully I will get time to test this in the next week or two, and it could give us some insight into what is going on.

ZeroGravitas23 · 16 May 2018

ryecoaaron, thanks for jumping in and giving us your thoughts on this. Any help is appreciated.

I know that the installation guide suggests removing all data drives when installing OMV 3 or 4 and it seems most of us tried to follow that.

Would you recommend trying to install OMV 4 with the entire array of drives (including the future OS drive, of course) present during the installation? Is there any reason the guide recommends removing the drives, other than accidentally choosing a data drive as the installation location?

We could obviously try that within a VM pretty quickly if we wanted to as well.

ZeroGravitas23 · 11 May 2018

Thank you Markess and vialcollet for the responses.

That very well could be the root of the problem. I have a feeling that when installing off a USB drive it sometimes assigns the onboard drives to something other than sda, which seems to mess the installation up once the USB drive is removed.

Does anyone know how to reassign the disk we want as the OS drive to SDA during the installation of OMV 4 off of a USB drive? I did not see an option come up to change that.

ZeroGravitas23 · 29 April 2018

I just wanted to comment on this as I also recently tried to move from OMV 2 to OMV 4. I tried three or more times to install the OMV 4 base image onto an SSD from a USB image, and each time I saw the exact same error as above. I had all of the data drives disconnected while installing the OS, except for the one SSD that was going to be used as the OS drive. Also, that SSD was connected through SATA and not through any RAID controllers.

One thing I think may be the problem, which maybe someone above can help replicate, is that OMV 4 won't boot correctly its first time if a RAID controller is built into the motherboard. I believe this may be part of the problem because the motherboard I am using has a built-in LSI RAID controller which you can't stop from initializing on boot. Even if you disable it in the BIOS from initializing or loading an OS, it will still try to initialize during boot.

After trying the installation several times I tried the OMV 3 image using a USB image and it booted up the first time with no issues. I am not sure why OMV 3 doesn't have the same issue as OMV 4 since it sounds like all of our installations were done using the same hardware as our previous versions of OMV.

Any help on why this may be an issue and how to solve it moving forward would be wonderful in case we want to upgrade to OMV 4.

ZeroGravitas23 · 8 February 2017

Thank you for your suggestion! It seems the IP I assigned the port changed by one for some reason. But entering that IP brings me right back to the web interface. Thank you for your help!

ZeroGravitas23 · 6 February 2017

I'm not sure what you mean unfortunately.

ZeroGravitas23 · 5 February 2017

I've had OMV2 running on Debian for about 6 months now with very few issues. A week or two ago I tried to access the WebGUI from another computer on the same LAN and wasn't able to find it. Since then I have never been able to connect to the WebGUI. Entering the IP that is assigned to it just comes up with a "This page does not exist" message.

I can tell OMV is still running though as SMB still works, and within the machine itself Emby and Snapraid still function perfectly fine. All of those were also set up within OMV using the plugins.

Since this is a Supermicro build there is also a second NIC which I can use to remote in using IPMI. That still functions properly as well.

Does anyone have any ideas as to what might be going on? I changed nothing within the software or hardware and it just became inaccessible out of nowhere.

ZeroGravitas23 · 26 August 2016

I finally figured out the issue and it was definitely due to the Snapraid repository.

It ended up being a PPA source, which took a long time to find. I removed it, and that fixed the plugin and update sections.

Thank you for the help.

ZeroGravitas23 · 25 August 2016

Ah, well that makes sense then. I installed it due to not knowing that plugins could be installed on OMV, again the first time I've installed it.

I'll remove snapraid from the system and try to re-update the source listing with your first comment.

Once I remove it I'll let you know how it looks. Thanks for all of the help!

ZeroGravitas23 · 25 August 2016

Alright, I have tried that and it still gives me the exact same error in OMV under Plugins and Update.

Now, when I do that in the terminal and the system tries to update all of its sources, there are some errors, and I wonder whether they are the underlying cause of the issue.

Code

Get:1 http://ftp.us.debian.org wheezy Release.gpg [2,373 B]                    
Get:2 http://ftp.us.debian.org wheezy-updates Release.gpg [1,554 B]            
Get:3 http://security.debian.org wheezy/updates Release.gpg [1,554 B]          
Get:4 http://ftp.us.debian.org wheezy Release [191 kB]                         
Get:5 http://security.debian.org wheezy/updates Release [39.0 kB]              
Get:6 http://packages.openmediavault.org stoneburner Release.gpg [181 B]       
Ign http://ppa.launchpad.net wheezy Release.gpg                                
Get:7 http://ftp.us.debian.org wheezy-updates Release [151 kB]                 
Get:8 http://security.debian.org wheezy/updates/main amd64 Packages [428 kB]   
Get:9 http://ftp.us.debian.org wheezy/main amd64 Packages [5,839 kB]           
Ign http://ppa.launchpad.net wheezy Release                                    
Get:10 http://packages.openmediavault.org stoneburner Release [11.8 kB]        
Get:11 http://security.debian.org wheezy/updates/contrib amd64 Packages [14 B] 
Get:12 http://packages.openmediavault.org stoneburner/main amd64 Packages [7,100 B]
Get:13 http://security.debian.org wheezy/updates/non-free amd64 Packages [14 B]
Get:14 http://security.debian.org wheezy/updates/contrib Translation-en [14 B] 
Get:15 http://security.debian.org wheezy/updates/main Translation-en [236 kB]  
Get:16 http://security.debian.org wheezy/updates/non-free Translation-en [14 B]
Get:17 http://ftp.us.debian.org wheezy/contrib amd64 Packages [42.0 kB]        
Get:18 http://ftp.us.debian.org wheezy/non-free amd64 Packages [80.8 kB]       
Get:19 http://ftp.us.debian.org wheezy/contrib Translation-en [34.8 kB]        
Get:20 http://ftp.us.debian.org wheezy/main Translation-en [3,846 kB]          
Get:21 http://ftp.us.debian.org wheezy/non-free Translation-en [66.1 kB]       
Get:22 http://ftp.us.debian.org wheezy-updates/main amd64 Packages [7,047 B]   
Get:23 http://ftp.us.debian.org wheezy-updates/contrib amd64 Packages [14 B]   
Get:24 http://ftp.us.debian.org wheezy-updates/non-free amd64 Packages [488 B] 
Get:25 http://ftp.us.debian.org wheezy-updates/contrib Translation-en [14 B]   
Get:26 http://ftp.us.debian.org wheezy-updates/main Translation-en [4,879 B]   
Get:27 http://ftp.us.debian.org wheezy-updates/non-free Translation-en [496 B] 
Err http://ppa.launchpad.net wheezy/main Sources                               
  404  Not Found
Err http://ppa.launchpad.net wheezy/main amd64 Packages                        
  404  Not Found
Ign http://ppa.launchpad.net wheezy/main Translation-en_US                     
Ign http://ppa.launchpad.net wheezy/main Translation-en                        
Ign http://packages.openmediavault.org stoneburner/main Translation-en_US      
Ign http://packages.openmediavault.org stoneburner/main Translation-en         
Fetched 11.0 MB in 6s (1,677 kB/s)                                             
W: Failed to fetch http://ppa.launchpad.net/tikhonov/snapraid/ubuntu/dists/wheezy/main/source/Sources  404  Not Found
W: Failed to fetch http://ppa.launchpad.net/tikhonov/snapraid/ubuntu/dists/wheezy/main/binary-amd64/Packages  404  Not Found
E: Some index files failed to download. They have been ignored, or old ones used instead.

So it looks like the wheezy index files for the snapraid PPA cannot be found. I am not sure if this is enough to cause the index issue with OMV.
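
To make the failing source easier to spot in a long update log, the warnings can be grouped by host. The sketch below is just one way to do that; the function name is made up. Fed the two warning lines from the output above, it shows that only ppa.launchpad.net fails, while the Debian and OMV mirrors fetched fine.

```python
import re
from typing import Dict, List
from urllib.parse import urlparse

FAILED = re.compile(r"^W: Failed to fetch (\S+)")


def failed_sources(apt_output: str) -> Dict[str, List[str]]:
    """Group the URLs from 'W: Failed to fetch' lines by host name."""
    failures: Dict[str, List[str]] = {}
    for line in apt_output.splitlines():
        match = FAILED.match(line.strip())
        if match:
            url = match.group(1)
            failures.setdefault(urlparse(url).netloc, []).append(url)
    return failures


# The warning and error lines from the update output in this post.
log = """\
W: Failed to fetch http://ppa.launchpad.net/tikhonov/snapraid/ubuntu/dists/wheezy/main/source/Sources  404  Not Found
W: Failed to fetch http://ppa.launchpad.net/tikhonov/snapraid/ubuntu/dists/wheezy/main/binary-amd64/Packages  404  Not Found
E: Some index files failed to download. They have been ignored, or old ones used instead.
"""

for host, urls in failed_sources(log).items():
    print(host, len(urls))
```

The entry for that host would then be the one to look for in /etc/apt/sources.list or in a file under /etc/apt/sources.list.d/.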

Any ideas?

ZeroGravitas23 · 24 August 2016

Thanks for the clarification. Wasn't sure what the 3.2.0.4 meant.

I do not have omv-extras installed. That's what I was going to install when I noticed the plugins and update sections weren't working.

I'll try your other suggestion when I get home in a few hours and let you know what the outcome is.

ZeroGravitas23 · 24 August 2016

Hey Everyone, I'm new to OMV as well as Linux, although I have used Unix for quite some time.

I have a media server with Debian (Wheezy) installed as well as mergerfs, snapraid, and OMV. It seems to be having issues updating the plugins list, and also when I go to the Update section in the GUI.

I'm using Stoneburner 2.2.6 and the latest release of Debian 7.11 (listed as 3.2.0.4 in the OMV GUI).

Below is the output of the error when I try to access the Plugin section or the Update section in the web GUI. Any help with this would be greatly appreciated!

"The index of available plugins does not exist. Please re-synchronize the package index files from their sources.

Error #6000:
exception 'OMVException' with message 'The index of available plugins does not exist. Please re-synchronize the package index files from their sources.' in /usr/share/openmediavault/engined/rpc/pluginmgmt.inc:73
Stack trace:
#0 [internal function]: OMVRpcServicePluginMgmt->enumeratePlugins(Array, Array)
#1 /usr/share/php/openmediavault/rpcservice.inc(125): call_user_func_array(Array, Array)
#2 /usr/share/php/openmediavault/rpc.inc(79): OMVRpcServiceAbstract->callMethod('enumeratePlugin...', Array, Array)
#3 /usr/sbin/omv-engined(500): OMVRpc::exec('Plugin', 'enumeratePlugin...', Array, Array, 1)
#4 {main}"

ZFS Zpool Status Degraded with no ZFS errors or SMART Failures

mdadm: no arrays found in config file or automatically

No Access of OMV2 WebGUI

Problems with Plugin list and Update Section on new OMV install
