I purposely did the apt dist-upgrade and followed the steps in the link by candymirror above. That also fixed my issues; everything seems to be running smoothly now.
Posts by Flaschie
-
-
Similar issue here; I've been getting cron-apt mail like the following for days now (it's the same every day, so I guess zfs-dkms is not being upgraded).
Code
Calculating upgrade...
The following packages were automatically installed and are no longer required:
  dkms libelf-dev libnvpair1linux libuutil1linux libzfs2linux libzpool2linux
  linux-headers-amd64 spl-dkms zfs-dkms
Use 'apt autoremove' to remove them.
The following packages will be REMOVED:
  openmediavault-zfs sysv-rc zfs-zed zfsutils-linux
The following NEW packages will be installed:
  libeinfo1 librc1 openrc
The following packages will be upgraded:
  zfs-dkms
1 upgraded, 3 newly installed, 4 to remove and 0 not upgraded.
I just tried omv-aptclean; the web GUI seems to be out of sync with regard to updates (e.g. it lists a lot of updates, but I cannot install them due to dependency issues):
PS: Let me know if I should translate some of the below; it's basically complaining that openrc needs insserv and conflicts with sysv-rc, and that sysv-rc needs insserv. I have insserv 1.14, so I'm not sure what it is complaining about....
Code
>>> *************** Error ***************
Failed to execute command 'export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin; export LANG=C.UTF-8; export DEBIAN_FRONTEND=noninteractive; apt-get --yes --allow-downgrades --allow-change-held-packages --fix-broken --fix-missing --auto-remove --allow-unauthenticated --show-upgraded --option DPkg::Options::="--force-confold" install libeinfo1 librc1 openmediavault-zfs openrc sysv-rc zfs-dkms zfs-zed zfsutils-linux 2>&1' with exit code '100':
Reading package lists...
Building dependency tree...
Reading state information...
sysv-rc is already the newest version (2.88dsf-59.9).
openmediavault-zfs is already the newest version (4.0.4).
Some packages could not be installed. This may mean that you have requested an impossible situation or, if you are using the unstable distribution of Debian, that some required packages have not yet been created or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
 openrc : Depends: insserv but it is not going to be installed
          Conflicts: sysv-rc but 2.88dsf-59.9 is to be installed
 sysv-rc : Depends: insserv (> 1.12.0-10) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
<<< *************************************
If I try an apt upgrade over SSH, it does not install the zfs packages (they are held back; sorry for the lingo again, but I presume the syntax is understood):
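For anyone else digging into this, the stock apt tooling can usually show why the packages end up held back; nothing OMV-specific, just the commands I would try:
Code
# Simulate the full upgrade to see apt's resolver output without changing anything
apt-get -s dist-upgrade

# Show candidate versions and which repositories the conflicting packages come from
apt-cache policy zfs-dkms openrc insserv sysv-rc

# List any packages explicitly put on hold
apt-mark showhold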
-
I can also confirm that I've been having this "error" for years. It comes when I look at the SMART data in the GUI (I haven't tested the CLI), and the error is ABRT with the SMART read/write as you've shown. It happens on my WD Reds, my WD Re and my WD AV-GP. And my 8 disks are still alive, with the exception of those that died
(although there's no reason to correlate that to the error in question ;))
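If you want to check the same thing from the CLI, smartctl can dump the error log directly (the device name is only an example):
Code
# SMART error log (the ABRT entries show up here)
smartctl -l error /dev/sda

# Full report including attributes and self-test history
smartctl -a /dev/sda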
-
I just fixed a similar error for a client with Win10 Pro; check this thread, maybe it's the same issue? The answer should lie in the second-to-last post (that's what I did to fix it as well)
-
This is probably not directly linked to OMV, but as there are many wise people here, maybe someone can help or has experienced the same.
My system/power LED indicator used to blink in OMV3 when the NAS entered suspend mode (Autoshutdown plugin: pm-suspend). After upgrading to OMV4, this no longer happens (I only recently upgraded). The LED is constantly on, which is of course very annoying, as I'm having problems identifying whether the NAS is on or not. As far as my googling goes, this is not software related but something that is set in the BIOS. However, as stated, it did work in OMV3, so there must have been a change somewhere. I have looked through the BIOS and cannot find anything. I also updated to the latest BIOS without success. My motherboard is a Supermicro X9SCM-F.
Does anyone have any tips on how to get the blinking back? I'm very close to keeping my NAS on 24/7 just to avoid this........
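One thing that could be worth checking (just a guess on my part) is which suspend state the system actually reaches now, since on many boards the power LED only blinks in S3. A quick look, assuming a reasonably recent kernel (/sys/power/mem_sleep only exists on newer ones):
Code
# Sleep states the kernel offers
cat /sys/power/state

# Whether "mem" maps to s2idle, shallow or deep (S3) - newer kernels only
cat /sys/power/mem_sleep

# What the last suspend attempt actually did
dmesg | grep -i suspend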
-
I guess not. From my understanding, there is no real raid 10 in zfs. You can create mirrors and then stripe them but I guess we don't allow you to use a drive in use anymore (I think I actually changed that). You could create the mirrors in the web interface and then create the stripe from the command line and import it in the plugin. Having to do advanced things from the command line is not unreasonable. Not *every* thing can be available from the web interface. It makes it too complicated.
Wouldn't creating a mirror pool and then using "Expand" in the web interface to add another mirror create a RAID10-equivalent ZFS pool, i.e. striped mirror vdevs? I believe I did this when I created my pool some time ago (things have happened to the plug-in since then).
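For reference, the command-line equivalent of that layout is simply one pool with two mirror vdevs, which ZFS stripes across automatically; a minimal sketch, with the pool name and device paths as placeholders:
Code
# Create the pool from the first mirror (pool name and devices are just examples)
zpool create tank mirror /dev/sda /dev/sdb

# Add a second mirror vdev; writes are striped across both mirrors (RAID10-like)
zpool add tank mirror /dev/sdc /dev/sdd

# Check the resulting layout
zpool status tank
-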
Regarding #3: It's not really a fix to your problem, but I run this script regularly in addition to ZED (mostly because I found this script before I learned about ZED, but since ZFS is all about safekeeping, why not have redundancy in the error checkers?)
The script is from: https://calomel.org/zfs_health_check_script.html
I have modified it to suit my needs and also to make it work properly (email), so maybe you would like a different version. You should change the <OMV_NAME> in the emails to your server's name. (Note: I may have forgotten to mark all my changes with the double #.)
Bash
#! /bin/sh
#
# Calomel.org
#   https://calomel.org/zfs_health_check_script.html
#   FreeBSD ZFS Health Check script
#   zfs_health.sh @ Version 0.17

#Updates by Arve:
##
##Removed `hostname` from start of emailSubject as OMV does this automatically
##Changed email from <root@<OMV_NAME>.localdomain> to <root@<OMV_NAME>.LAN> to fit OMV3 (a bug maybe? missing environmental vars? also working: root, name.nameson@domain.end)

# Check health of ZFS volumes and drives. On any faults send email.

# 99 problems but ZFS aint one
problems=0

# Health - Check if all zfs volumes are in good condition. We are looking for
# any keyword signifying a degraded or broken array.

condition=$(/sbin/zpool status | egrep -i '(DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED|FAIL|DESTROYED|corrupt|cannot|unrecover)')

if [ "${condition}" ]; then
        emailSubject="ZFS pool - HEALTH fault"
        problems=1
fi

##Moved capacity check to bottom (don't need capacity warnings if pool errors exists...)

# Errors - Check the columns for READ, WRITE and CKSUM (checksum) drive errors
# on all volumes and all drives using "zpool status". If any non-zero errors
# are reported an email will be sent out. You should then look to replace the
# faulty drive and run "zpool scrub" on the affected volume after resilvering.

if [ ${problems} -eq 0 ]; then
        errors=$(/sbin/zpool status | grep ONLINE | grep -v state | awk '{print $3 $4 $5}' | grep -v 000)
        if [ "${errors}" ]; then
                emailSubject="ZFS pool - Drive Errors"
                problems=1
        fi
fi

# Scrub Expired - Check if all volumes have been scrubbed in at least the last
# 8 days. The general guide is to scrub volumes on desktop quality drives once
# a week and volumes on enterprise class drives once a month. You can always
# use cron to schedual "zpool scrub" in off hours. We scrub our volumes every
# Sunday morning for example.
#
# Scrubbing traverses all the data in the pool once and verifies all blocks can
# be read. Scrubbing proceeds as fast as the devices allows, though the
# priority of any I/O remains below that of normal calls. This operation might
# negatively impact performance, but the file system will remain usable and
# responsive while scrubbing occurs. To initiate an explicit scrub, use the
# "zpool scrub" command.
#
# The scrubExpire variable is in seconds. So for 8 days we calculate 8 days
# times 24 hours times 3600 seconds to equal 691200 seconds.

##scrubExpire=691200
##
##if [ ${problems} -eq 0 ]; then
##   currentDate=$(date +%s)
##   zfsVolumes=$(/sbin/zpool list -H -o name)
##
##   for volume in ${zfsVolumes}
##   do
##    if [ $(/sbin/zpool status $volume | egrep -c "none requested") -ge 1 ]; then
##        printf "ERROR: You need to run \"zpool scrub $volume\" before this script can monitor the scrub expiration time."
##        break
##    fi
##    if [ $(/sbin/zpool status $volume | egrep -c "scrub in progress|resilver") -ge 1 ]; then
##        break
##    fi
##
##    ### Ubuntu with GNU supported date format
##    #scrubRawDate=$(/sbin/zpool status $volume | grep scrub | awk '{print $11" "$12" " $13" " $14" "$15}')
##    #scrubDate=$(date -d "$scrubRawDate" +%s)
##
##    ### FreeBSD with *nix supported date format
##    scrubRawDate=$(/sbin/zpool status $volume | grep scrub | awk '{print $15 $12 $13}')
##    scrubDate=$(date -j -f '%Y%b%e-%H%M%S' $scrubRawDate'-000000' +%s)
##
##    if [ $(($currentDate - $scrubDate)) -ge $scrubExpire ]; then
##        emailSubject="ZFS pool - Scrub Time Expired. Scrub Needed on Volume(s)"
##        problems=1
##    fi
##   done
##fi

# Capacity - Make sure the pool capacity is below 80% for best performance. The
# percentage really depends on how large your volume is. If you have a 128GB
# SSD then 80% is reasonable. If you have a 60TB raid-z2 array then you can
# probably set the warning closer to 95%.
#
# ZFS uses a copy-on-write scheme. The file system writes new data to
# sequential free blocks first and when the uberblock has been updated the new
# inode pointers become valid. This method is true only when the pool has
# enough free sequential blocks. If the pool is at capacity and space limited,
# ZFS will be have to randomly write blocks. This means ZFS can not create an
# optimal set of sequential writes and write performance is severely impacted.

maxCapacity=85

if [ ${problems} -eq 0 ]; then
        capacity=$(/sbin/zpool list -H -o capacity | cut -d'%' -f1)
        for line in ${capacity}
        do
                if [ $line -ge $maxCapacity ]; then
                        emailSubject="ZFS pool - Capacity Exceeded"
                        zpool list | mail -s "$emailSubject" root@<OMV_NAME>.LAN  ##Added email here for capacity issues (use "zpool list" in this email)
                fi
        done
fi

# Email - On any problems send email with drive status information and
# capacities including a helpful subject line. Also use logger to write the
# email subject to the local logs. This is also the place you may want to put
# any other notifications like playing a sound file, beeping the internal
# speaker, paging someone or updating Nagios or even BigBrother.

if [ "$problems" -ne 0 ]; then
        ##OMV mail
        zpool status -v | mail -s "$emailSubject" root@<OMV_NAME>.LAN
        ##printf '%s\n' "$emailSubject" "" "`/sbin/zpool list`" "" "`/sbin/zpool status`" | /usr/bin/mail -s "$emailSubject" root@localhost
        ##logger $emailSubject
fi

##log
logger "ZFS Health check completed"

### EOF ###
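If you want to run it regularly, a cron entry along these lines works (the path and the hourly schedule are only examples; put the script wherever you like and point cron at it):
Code
# /etc/cron.d/zfs-health -- run the health check every hour (example schedule and path)
0 * * * * root /usr/local/bin/zfs_health.sh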
-
I'm not sure this is related to ZFS directly, but is it normal to see better sequential write speeds than read speeds on a mirrored vdev pool?
It seems my server sometimes struggles with reads, as the speed fluctuates a lot even when I'm just copying large files (to an SSD on the client side). My pool consists of 2 mirror vdevs, each made up of 2 WD Red 3TB drives. I have compression turned on and have adjusted the ARC to 12 GB max (and 4 GB min). In the attached figures, you can see the write speed when copying some VMs to the NAS: a more or less constant 445 MB/s. The data adds up to 40 GB (23 GB after compression, i.e. disk usage on the NAS is 23 GB). When reading the very same files back, I see a very fluctuating graph topping out at about 340 MB/s. Even though the data in this example is compressible, I see the same behavior for incompressible data. Looking at CPU usage, I tend to see a large portion waiting for IO, and my CPU load can be very high (especially when running several copy tasks at once; I have seen load numbers in the 20-30s).
I also noticed that disk usage in iostat (%util) is close to 100 % when writing, but only about 60 % when reading, so it seems the disks are not being pushed to their full potential.
My server has a Xeon E3-1240, 16 GB of RAM and a Mellanox 10 Gb network card (same on the client side), and runs Erasmus (OMV3). I use an LSI SAS 9210 for half the disks; the other half runs from the motherboard (to have controller redundancy as well; as a side note, can this cause the strange behavior?).
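To narrow it down, I would watch the per-vdev load during a copy and compare the network copy against a purely local read (the test file path is just a placeholder; use a file larger than the ARC so the read actually hits the disks):
Code
# Per-vdev throughput/utilisation, refreshed every 5 seconds, while a copy is running
zpool iostat -v 5

# Local sequential read, bypassing the network and Samba entirely
dd if=/tank/testfile of=/dev/null bs=1M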
-
The best strategy is up to you to decide
Maybe you can start by looking at the 3-2-1 way of having backups, which basically means having (at least) 3 copies of your data on 2 different devices, with 1 copy being offsite; see https://www.backblaze.com/blog/the-3-2-1-backup-strategy/
I'm using rsnapshot without any problems, but I am not copying to the same disk. The first time it runs, it will run for a long time, as it needs to create a copy of all your data. After that it will run much faster, since it only needs to copy files that have changed. Even assuming a speed of 100 MB/s, which is a bit high for an internal copy on a single hard drive (i.e. not an SSD), you would be copying for about 1.5 hours for 500 GB of data. So it is not strange that rsnapshot took hours to complete.
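For context, there is not much to my rsnapshot setup; a minimal excerpt of what the config can look like (the paths and retention counts here are just examples, not my actual values, and rsnapshot.conf fields must be tab-separated):
Code
# /etc/rsnapshot.conf (excerpt, TAB-separated fields)
snapshot_root	/srv/dev-disk-by-label-backup/rsnapshot/
retain	daily	7
retain	weekly	4
backup	/srv/dev-disk-by-label-data/	localhost/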
I don't share my snapshot folder in Samba; if I need any files, I log into my server using SSH and copy the files I want. I also have local and offsite backups to be able to restore my data in case of disk failure, fire, burglary or other inconveniences that may occur.
-
Same here, thanks a lot!
-
Windows 10.... I've not had any issues with either reboot or shutdown as far as I can remember (although that just means no problems since the day before yesterday...)
-
Thanks, no worries! Everything is working otherwise, so I'm not affected by the issue in a critical manner (except for my strange need to look at statistics)
-
Thanks for the update, ryecoaaron and volker!
But it seems the fix has made only one of my four ZFS filesystems visible in the Filesystems tab and the Performance Statistics tab. Any chance this is a bug in the fix, or have I done something stupid on my end? The disk-usage statistics did work before the update(s).
Some screenshots showing the issue:
-
Not to ruin your evening as well, but I just performed a successful reboot using Vivaldi
(version 1.9.818.44 (Stable channel) (64-bit), if you're curious)
-
"Self-test execution status: ( 241) Self-test routine in progress...
10% of test remaining."
Seems like you were a bit early with your smartctl -a
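Once the self-test has finished, you can pull just the self-test results instead of the full report (the device name is only an example):
Code
# Self-test log only
smartctl -l selftest /dev/sda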
-
I see, thanks!
-
I just made a clean OMV3 install and managed to get ZFS up and running again after installing the backports kernel in OMV-Extras. But the "Detail" window for my ZFS pools is light grey / low contrast (other windows in the web GUI show the correct contrast, and so did this window when I had OMV2 installed).
Couldn't find anyone else asking about this before; is this normal or is there anything I can do?
-
Stumbled across this thread looking for an answer to the same question: should I not receive an email when the disk(s) reach the informal level? I have similar settings, and my smartd looks similar (I have a difference at -s: mine is S/../../x/0y only, I have no L).
I get email notifications in general, and I have received other SMART messages (i.e. SMART error messages). I get the information about the temperature exceedance in the SMART log, I'm just not getting any mail... I just tried setting the critical level below my current drive temperature and instantly received emails about the critical limit, so it is only the informal part that is not working for me.
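For reference, the temperature thresholds end up in smartd's -W directive in the order difference,informal,critical; a line would look roughly like this (the values and the rest of the directive are only an example, not what OMV actually generates for me):
Code
# smartd.conf: -W DIFF,INFO,CRIT -> log a 4 C change, informal warning at 40 C,
# critical warning at 45 C
DEVICESCAN -a -W 4,40,45 -m root -s S/../.././02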
-
Hi,
It seems I managed to give the autoshutdown script some headache when I define an uptime range of 9..2 (i.e. 9 AM to 2 AM). I think this "bug" will be present for other combos as well, whenever the end-time number is lower than typical uptime hours. If I set it to 9-23, the sleep function works perfectly when it hits the uptime period, but with 9-2 I get the following repeated in my log so many times that it's difficult to find the interesting stuff (if any):
Code
autoshutdown[26927]: INFO: ' new supervision cycle started - check active hosts or processes'
autoshutdown[26927]: INFO: ' Checking the time: stay up or shutdown ...'
autoshutdown[26927]: INFO: ' System is in Stayup-Range. No need to do anything. Sleeping ...'
autoshutdown[26927]: INFO: ' Sleeping until 1:55 -> 0 seconds'
autoshutdown[26927]: INFO: ' sleep for 180s.'
autoshutdown[26927]: INFO: '------------------------------------------------------'
autoshutdown[26927]: INFO: ' new supervision cycle started - check active hosts or processes'
autoshutdown[26927]: INFO: ' Checking the time: stay up or shutdown ...'
autoshutdown[26927]: INFO: ' System is in Stayup-Range. No need to do anything. Sleeping ...'
autoshutdown[26927]: INFO: ' Sleeping until 1:55 -> 0 seconds'
autoshutdown[26927]: INFO: ' sleep for 180s.'
autoshutdown[26927]: INFO: '------------------------------------------------------'
autoshutdown[26927]: INFO: ' new supervision cycle started - check active hosts or processes'
autoshutdown[26927]: INFO: ' Checking the time: stay up or shutdown ...'
autoshutdown[26927]: INFO: ' System is in Stayup-Range. No need to do anything. Sleeping ...'
autoshutdown[26927]: INFO: ' Sleeping until 1:55 -> 0 seconds'
autoshutdown[26927]: INFO: ' sleep for 180s.'
autoshutdown[26927]: INFO: '------------------------------------------------------'
autoshutdown[26927]: INFO: ' new supervision cycle started - check active hosts or processes'
autoshutdown[26927]: INFO: ' Checking the time: stay up or shutdown ...'
autoshutdown[26927]: INFO: ' System is in Stayup-Range. No need to do anything. Sleeping ...'
autoshutdown[26927]: INFO: ' Sleeping until 1:55 -> 0 seconds'
autoshutdown[26927]: INFO: ' sleep for 180s.'
Basically, it spams my log with meaningless info.
However, it should still shut down my NAS when in the shutdown range, so it's really not that big of a problem; the script still works. But is it not possible to avoid this somehow? It seems the culprit is the negative TIMETOSLEEP?
Code
autoshutdown[32297]: DEBUG: 'FAKE-Mode: _check_clock(): CLOCKCHECK: 20; CLOCKSTART: 9 ; CLOCKEND: 2 -> forced to stay up'
autoshutdown[32297]: DEBUG: 'FAKE-Mode: _check_clock(): TIMETOSLEEP: -19'
autoshutdown[32297]: DEBUG: 'FAKE-Mode: _check_clock(): SECONDSTOSLEEP: 0'
autoshutdown[32297]: DEBUG: 'FAKE-Mode: _check_clock(): MINUTESTOSLEEP: '
autoshutdown[32297]: DEBUG: 'FAKE-Mode: _check_clock(): Final: SECONDSTOSLEEP: 0'
autoshutdown[32297]: DEBUG: 'FAKE-Mode: _check_clock(): TIMEHOUR: - TIMEMINUTES: '
autoshutdown[32297]: INFO: 'FAKE-Mode: System is in Stayup-Range. No need to do anything. Sleeping ...'
autoshutdown[32297]: INFO: 'FAKE-Mode: Sleeping until : -> 0 seconds'
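My guess at the arithmetic, sketched in shell: when the end hour is lower than the start hour the range wraps past midnight, so 24 has to be added before computing the sleep time. The variable names are taken from the debug output above; the rest is only my assumption of how the check could look, not the actual autoshutdown code:
Code
# Hypothetical wrap-around handling for an uptime range like 9..2
CLOCKCHECK=20   # current hour (from the debug output)
CLOCKSTART=9
CLOCKEND=2

TIMETOSLEEP=$((CLOCKEND - CLOCKCHECK))      # goes negative when the range wraps past midnight
if [ "$CLOCKEND" -le "$CLOCKSTART" ] && [ "$CLOCKCHECK" -ge "$CLOCKSTART" ]; then
    TIMETOSLEEP=$((TIMETOSLEEP + 24))       # 2 AM is tomorrow: 2 - 20 + 24 = 6 hours
fi
echo "Hours until end of stay-up range: $TIMETOSLEEP"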
-
Working perfectly, thanks man!