[HOWTO] Install ZFS-Plugin & use ZFS on OMV

    • OMV 1.0
    • ellnic wrote:


      The latest version of OMV isn't 1.9. It's 2.x. In fact, as of yesterday, 2.0.7. openmediavault.org/?p=1682

      I'd update and see if this fixes your issue.
      You should not upgrade to 2.x if you use the ZFS plugin. The GUI does not work with ExtJS 5, which is used in 2.x. When I click the ZFS tab in the left menu, nothing happens. To me it seems like the GUI hangs somewhere and just times out.
    • bobafetthotmail wrote:

      He is talking of hardware specs, not rumors.


      I wasn't saying that they were rumors.

      bobafetthotmail wrote:

      Every 12 TB read you get something that is corrupted.


      I have to apologize: I was thinking of gigabytes earlier, not terabytes. But don't forget that you read from multiple disks, not a single one.

      bobafetthotmail wrote:

      Good controllers stop and tell you of the issue so you can move the data off it, bad controllers ignore it and corrupt the whole array.


      The question is, does that happen on every 12TB? That would technically mean that on nearly every scrub I would destroy a sector on my array.

      Greetings
      David
      "Well... lately this forum has become support for everything except omv" [...] "And is like someone is banning Google from their browsers"

      Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.


      Upload Logfile via WebGUI/CLI
      #openmediavault on freenode IRC | German & English | GMT+1
      Absolutely no Support via PM!

      I host parts of the omv-extras.org Repository, the OpenMediaVault Live Demo and the pre-built PXE Images. If you want you can take part and help covering the costs by having a look at my profile page.
    • davidh2k wrote:

      The question is, does that happen on every 12TB? That would technically mean that on nearly every scrub I would destroy a sector on my array.
      That's usually just data loss, not hardware damage (although a failed sector is also data loss of course).
      Look at the SMART of the disks, they log it as Reported Uncorrectable Errors, ID 187.

      Like all probabilistic events it is highly erratic: you might get 10 in a row within a minute, or none whatsoever for years.
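
      For a rough feel of the numbers, here is a minimal sketch of the arithmetic. It assumes the commonly quoted consumer-drive rating of at most one unrecoverable read error per 10^14 bits read (about 12.5 TB). That figure and the function name ure_probability are assumptions, not something stated in this thread, so check the datasheet of your own drives. Errors are treated as independent events, which is a simplification:

      ```shell
      # Probability of at least one unrecoverable read error (URE) when reading
      # $1 terabytes, assuming one error per $2 bits on average (default: 1e14).
      # Uses the Poisson approximation P = 1 - exp(-bits_read / bits_per_error).
      ure_probability() {
          awk -v tb="$1" -v bpe="${2:-1e14}" 'BEGIN {
              bits = tb * 1e12 * 8            # decimal terabytes to bits
              printf "%.3f\n", 1 - exp(-bits / bpe)
          }'
      }
      ```

      Under these assumptions, ure_probability 12.5 prints 0.632: reading the full rated amount gives roughly a 63% chance of at least one error, not a certainty. The rating is an upper bound, so real drives may well do better.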
    • bobafetthotmail wrote:

      Look at the SMART of the disks, they log it as Reported Uncorrectable Errors, ID 187.


      Source Code

      root@Chap /opt/postprocessscripts # smartctl -a -d 3ware,0 /dev/twa0
      smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.16.0-0.bpo.4-amd64] (local build)
      Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

      === START OF INFORMATION SECTION ===
      Model Family:     Western Digital Red
      Device Model:     WDC WD30EFRX-68AX9N0
      Serial Number:
      LU WWN Device Id:
      Firmware Version: 80.00A80
      User Capacity:    3.000.592.982.016 bytes [3,00 TB]
      Sector Sizes:     512 bytes logical, 4096 bytes physical
      Device is:        In smartctl database [for details use: -P show]
      ATA Version is:   9
      ATA Standard is:  Exact ATA specification draft version not indicated
      Local Time is:    Wed May 13 01:20:28 2015 CEST
      SMART support is: Available - device has SMART capability.
      SMART support is: Enabled

      ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
        1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
        3 Spin_Up_Time            0x0027   180   177   021    Pre-fail  Always       -       6000
        4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1095
        5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
        7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
        9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4644
       10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
       11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
       12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1095
      192 Power-Off_Retract_Count 0x0032   199   199   000    Old_age   Always       -       1094
      193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       0
      194 Temperature_Celsius     0x0022   115   111   000    Old_age   Always       -       35
      196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
      197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
      198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
      199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
      200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0


      It seems WD Reds do not report that attribute (ID 187).
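
      A minimal sketch for checking a single attribute without reading the whole table by eye. The helper name smart_attr_raw is made up for illustration, and it assumes the usual ten-column attribute table that smartctl -A prints, with the raw value in the last column as in the output shown here:

      ```shell
      # Print the raw value of one SMART attribute, reading `smartctl -A` output
      # on stdin. Exits non-zero if the drive does not list that attribute ID.
      smart_attr_raw() {
          awk -v id="$1" '$1 == id { print $10; found = 1 } END { exit !found }'
      }

      # Example (3ware port 0 behind /dev/twa0, as in the post above):
      #   smartctl -A -d 3ware,0 /dev/twa0 | smart_attr_raw 187 \
      #       || echo "ID 187 not reported by this drive"
      ```

      On drives that lack ID 187, the attributes 5, 197 and 198 that do appear in this output are the ones I would watch instead, though how each vendor counts them varies.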

      Greetings
      David
    • Can't execute smartctl like that because I'm using a 3ware controller and the drives are not exposed as /dev/sdx.

      Greetings
      David
    • nicjo814 wrote:

      I don't have a solution for this unfortunately. I would guess that the timeout is specified somewhere in the "core" code of OMV. I'll have to test setting up a scenario similar to the one you describe in a virtualized environment to see if I can reproduce the problem. However, I won't be able to do this in the next couple of days.


      Thank you. I'll wait for your test result. In the meantime, as a stab in the dark, I changed every timeout value I could find in the php.ini files and the nginx / OMV configuration to 300 (seconds). Unfortunately this didn't help :(

      I've got over 300 datasets, and many of them have tens of snapshots. The current timeout is not long enough to build this list.
    • nicjo814 wrote:

      I don't have a solution for this unfortunately. I would guess that the timeout is specified somewhere in the "core" code of OMV.
      While this isn't my field and I can't say I know OMV well, I have run into similar issues elsewhere.

      I suspect that this issue is similar to the one in the thread "Constant WebGUI timeout errors", which I think has something to do with nginx's proxy_pass settings, i.e. the fact that the web UI communicates with other web UIs or local proxies for its various components, windows and so on.
      It's not monolithic.
      An example of what I mean, used there for direct access to the daemons or components, is the thread "Daemon webUI access through prefix/hostname [Nginx] [Proxy_pass]".

      If you look at the logs in that thread, there is a "server" called openmediavault-gui (obvious function) and a client called ::ffff:10.0.0.113, which is a local address. There is a timeout in the communication between them, and that causes the web interface to fail.

      Can someone who has this "too much stuff, loading fails" issue go and fetch their nginx error logs?
      It would be useful to confirm my suspicion.
      Since I have no idea where the main log is, or where any secondary logs are (if there are any),
      do this to find them:

      Source Code

      ps axu | grep nginx

      This should give you the numeric ID (PID) of the nginx process.

      Source Code

      lsof -p NUMERIC_ID | grep log

      Of course, write the numeric ID you found earlier; NUMERIC_ID is just a placeholder.

      This should return all log files opened by nginx, so all log files that may have errors in them.

      Source Code

      cat /path/to/file

      This shows the contents in the terminal.

      The main log file is of course the most likely to contain these errors, but I can't be sure.
      I'm not near an OMV box at the moment, so I cannot look at mine.

      Also check openmediavault-webgui_error.log.
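
      The steps above can be combined into one pass. This is only a sketch: the helper name open_logs is made up, it assumes pgrep and lsof are installed, and it narrows the "grep log" filter to files whose name ends in .log, which may miss logs with other names:

      ```shell
      # Filter `lsof` output on stdin down to the unique paths of open log files.
      # The file name (NAME column) is the last field of each lsof line.
      open_logs() {
          awk '$NF ~ /\.log$/ { print $NF }' | sort -u
      }

      # Example: collect the logs of every nginx process (master and workers).
      #   for pid in $(pgrep nginx); do lsof -p "$pid"; done | open_logs
      ```

      Running it as root is probably needed, since lsof cannot inspect processes owned by other users otherwise.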

      A possible solution is to configure nginx to increase the timeout of the proxy instances, because changing the timeout of the server instance (which is what you did with the settings you changed, and what was done in that thread) does not carry over to the proxy, or the reverse.
      see here beyondlinux.com/2012/01/28/sol…inx-504-gateway-time-out/
      or here howtounix.info/howto/110-connection-timed-out-error-in-nginx
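
      Before changing anything, it may help to see which timeout directives are set at all. A minimal sketch, assuming the configuration lives under /etc/nginx (the helper name list_timeouts is made up). Note that when nginx hands PHP requests to php-fpm over FastCGI, the relevant directive would be fastcgi_read_timeout rather than proxy_read_timeout; which of the two applies on an OMV box is something to verify, not something I know:

      ```shell
      # List every *_timeout directive under an nginx config directory, with
      # file name and line number. The directory defaults to /etc/nginx.
      list_timeouts() {
          grep -rnE '^[[:space:]]*[a-z_]+_timeout[[:space:]]' "${1:-/etc/nginx}"
      }

      # Example:
      #   list_timeouts              # search /etc/nginx
      #   list_timeouts /etc/php5    # hypothetical second location to check
      ```

      A directive that never shows up in the listing is running at nginx's built-in default, which would explain why raising the other values has no effect.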

      While I don't encourage anyone to jump the gun and try these settings blind, if you feel brave or foolish enough (or have backups and the time to restore them), the result would be another piece of evidence for my suspicion.

      I'm probably unable to fix this in the source myself. A proper fix needs logic that checks the number of folders and increases the timeout accordingly; I doubt that giving everything a flat timeout of 9999999999 is good practice, and I don't know OMV well enough to do that at the moment. But if there is some evidence that this is the culprit, I (or anyone) can open a bug report in the bugtracker so Volker can act and fix it.

      And of course maybe sticky a thread with the issue and the manual fix until this gets patched.


    • See attached video. You can see the name scroll across the field. Each time (I tried 5 times), it starts an fpm thread that chews up a lot of CPU until a timeout or even a segfault happens. I assume it happens because each time the name moves, it calls a PHP function. This is a problem with the plugin and OMV 2.x / ExtJS 5.1.

      Also, if I remember correctly, the filesystem backend is a little slow. So the more ZFS nodes (or whatever) you have, the longer it takes, and it can time out if there are enough of them.
      Files
      • zfs.zip

        (755.59 kB)
      omv 4.1.11 arrakis | 64 bit | 4.15 proxmox kernel | omvextrasorg 4.1.11
      omv-extras.org plugins source code and issue tracker - github

      Please read this before posting a question and this and this for docker questions.
      Please don't PM for support... Too many PMs!
    • So I have been following this guide and mistakenly added the drives by path. I'm seeing the disappearing-drives issue after a reboot. When I run zpool export poolname and then zpool import poolname, I do get my drives back, but zpool status shows them as still being referenced by path. Is there a way to "alias" /dev/sdx to a certain UUID every time, or should I back up the data and then recreate the pool?

      Thanks in advance!
    • bobafetthotmail, I'll have a look at your suggestion in detail when I get a chance. I can afford to lose this data, so that is not a concern.

      You can see the name scroll across the field. Each time (I tried 5 times), it starts an fpm thread that chews up a lot of CPU until a timeout or even a segfault happens. I assume it happens because each time the name moves, it calls a PHP function. This is a problem with the plugin and OMV 2.x / ExtJS 5.1.


      You may be onto something there. Sometimes when I try to add a shared folder, the console comes up with the following:

      Source Code

      Message from syslogd@adm-nas-svr at May 15 10:15:35 ...
      kernel:[231479.736595] VERIFY3(nvlist_pack(nvl, &packed, sizep, 0, 0x0000) == 0) failed (14 == 0)
      Message from syslogd@adm-nas-svr at May 15 10:15:35 ...
      kernel:[231479.736635] PANIC at fnvpair.c:81:fnvlist_pack()


      I looked at the nginx logs but could not see anything relevant. The only file with any content is /var/log/nginx/openmediavault-webgui_access.log (all other nginx log files are empty), and it contains the following:

      Source Code

      [20/May/2015:14:53:38 +1000] "GET /js/omv/window/Login.js HTTP/1.1" 200 3987 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0)
      [20/May/2015:14:53:38 +1000] "GET /js/omv/util/i18nDict.js HTTP/1.1" 200 1833495 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37
      [20/May/2015:14:53:38 +1000] "GET /extjs/resources/ext-theme-classic/ext-theme-classic-all.css HTTP/1.1" 200 277860 "http://192.168.x.y/extjs/res
      [20/May/2015:14:53:38 +1000] "GET /extjs/resources/ext-theme-gray/ext-theme-gray-all.css HTTP/1.1" 200 269048 "http://192.168.x.y/extjs/resources
      [20/May/2015:14:53:38 +1000] "GET /extjs/ext-all-debug.js HTTP/1.1" 200 3699363 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.
      [20/May/2015:14:53:38 +1000] "GET /images/header_logo.png HTTP/1.1" 200 8413 "http://192.168.x.y/css/omv.css" "Mozilla/5.0 (Windows NT 6.3; WOW64
      [20/May/2015:14:53:38 +1000] "GET /extjs/resources/ext-theme-gray/images/form/trigger.gif HTTP/1.1" 200 1080 "http://192.168.x.y/extjs/resources/
      [20/May/2015:14:53:38 +1000] "GET /extjs/resources/ext-theme-gray/images/form/exclamation.gif HTTP/1.1" 200 996 "http://192.168.x.y/extjs/resourc
      [20/May/2015:14:53:38 +1000] "GET /extjs/resources/ext-theme-gray/images/form/text-bg.gif HTTP/1.1" 200 819 "http://192.168.x.y/extjs/resources/e
      [20/May/2015:14:53:39 +1000] "GET /extjs/resources/ext-theme-gray/images/grid/invalid_line.gif HTTP/1.1" 200 815 "http://192.168.x.y/extjs/resour
      [20/May/2015:14:53:54 +1000] "GET /extjs/resources/ext-theme-gray/images/grid/loading.gif HTTP/1.1" 200 771 "http://192.168.x.y/extjs/resources/e
      [20/May/2015:14:53:54 +1000] "POST /rpc.php HTTP/1.1" 200 78 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 F
      [20/May/2015:14:53:55 +1000] "GET / HTTP/1.1" 200 2808 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0"
      [20/May/2015:14:53:55 +1000] "GET /extjs/resources/css/ext-all.css HTTP/1.1" 200 57 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv
      [20/May/2015:14:53:55 +1000] "GET /js/omv/util/i18n.js HTTP/1.1" 200 2447 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gec
      [20/May/2015:14:53:55 +1000] "GET /css/omv.css HTTP/1.1" 200 9293 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100
      [20/May/2015:14:53:55 +1000] "GET /extjs/resources/css/ext-all-gray.css HTTP/1.1" 200 51 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW6
      [20/May/2015:14:53:55 +1000] "GET /js/ext-overrides.js HTTP/1.1" 200 28299 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Ge
      [20/May/2015:14:53:55 +1000] "GET /js/omv/globals.js HTTP/1.1" 200 2099 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko
      [20/May/2015:14:53:55 +1000] "GET /js/omv/util/Format.js HTTP/1.1" 200 6859 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) G
      [20/May/2015:14:53:55 +1000] "GET /js/omv/window/MessageBox.js HTTP/1.1" 200 10110 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:
      [20/May/2015:14:53:55 +1000] "GET /js/js-overrides.js HTTP/1.1" 200 6296 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Geck
      [20/May/2015:14:53:55 +1000] "GET /js/omv/window/Window.js HTTP/1.1" 200 1118 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0)
      [20/May/2015:14:53:55 +1000] "GET /js/omv/Rpc.js HTTP/1.1" 200 8808 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/201
      [20/May/2015:14:53:55 +1000] "GET /js/omv/window/Execute.js HTTP/1.1" 200 11802 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.
      [20/May/2015:14:53:55 +1000] "GET /js/omv/data/reader/RpcJson.js HTTP/1.1" 200 1127 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv
      [20/May/2015:14:53:55 +1000] "GET /js/omv/data/proxy/Rpc.js HTTP/1.1" 200 3805 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0
      [20/May/2015:14:53:55 +1000] "GET /js/omv/data/Model.js HTTP/1.1" 200 1218 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Ge
      [20/May/2015:14:53:55 +1000] "GET /js/omv/data/Store.js HTTP/1.1" 200 1711 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Ge
      [20/May/2015:14:53:55 +1000] "GET /js/omv/workspace/window/plugin/ConfigObject.js HTTP/1.1" 200 3374 "http://192.168.x.y/" "Mozilla/5.0 (Windows
      [20/May/2015:14:53:55 +1000] "GET /js/omv/grid/Panel.js HTTP/1.1" 200 5067 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Ge
      [20/May/2015:14:53:55 +1000] "GET /js/omv/workspace/window/Container.js HTTP/1.1" 200 12364 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; W
      [20/May/2015:14:53:55 +1000] "GET /js/omv/workspace/window/Grid.js HTTP/1.1" 200 3209 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64;
      [20/May/2015:14:53:55 +1000] "GET /js/omv/form/Panel.js HTTP/1.1" 200 3604 "http://192.168.x.y/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Ge


      The FileSystem tab seems to be able to list all the ZFS filesystems after about 60 seconds. Is there a way to change the add-share section so that it does not parse the ZFS snapshots? That should leave much less to parse through.

      Another thing: the ZFS tab does not list my zpool! (It is just empty, as if no zpools were available.)


    • Thinking some more about my problem: I'm guessing that the create-share page tries to list all the filesystems using something like "zfs list -t all", which would list all the ZFS filesystems plus the snapshots.

      If that is the case, is it possible for me to temporarily change it so that it only lists the filesystems and not the snapshots? If I can do this, it will get me going with creating the share. I only need to create the share once and won't need to touch it again for quite a while.

      If that is possible, where would I find the bit of code / line to change?
    • orionweb wrote:

      Thinking some more about my problem: I'm guessing that the create-share page tries to list all the filesystems using something like "zfs list -t all", which would list all the ZFS filesystems plus the snapshots.

      If that is the case, is it possible for me to temporarily change it so that it only lists the filesystems and not the snapshots? If I can do this, it will get me going with creating the share. I only need to create the share once and won't need to touch it again for quite a while.

      If that is possible, where would I find the bit of code / line to change?


      I'm working on an answer, but it will be quite long so it will take some time to compose :)

      Update: I've sent an e-mail with some thoughts on the issue.


    • ryecoaaron wrote:

      See attached video. You can see the name scroll across the field. Each time (I tried 5 times), it starts an fpm thread that chews up a lot of CPU until a timeout or even a segfault happens. I assume it happens because each time the name moves, it calls a PHP function. This is a problem with the plugin and OMV 2.x / ExtJS 5.1.

      Also, if I remember correctly, the filesystem backend is a little slow. So the more ZFS nodes (or whatever) you have, the longer it takes, and it can time out if there are enough of them.


      This looks a bit odd :) Someone with ExtJS experience might want to look at it...
    • orionweb wrote:

      Thinking some more about my problem: I'm guessing that the create-share page tries to list all the filesystems using something like "zfs list -t all", which would list all the ZFS filesystems plus the snapshots.

      If that is the case, is it possible for me to temporarily change it so that it only lists the filesystems and not the snapshots? If I can do this, it will get me going with creating the share. I only need to create the share once and won't need to touch it again for quite a while.

      If that is possible, where would I find the bit of code / line to change?


      As I mentioned in my e-mail, I don't think the number of snapshots is the issue, but the number of datasets probably could be.
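
      To see which of the two counts is actually large on a given box, something like the following sketch could help. The helper name count_types is made up. It relies on "zfs list -t all" listing every dataset type, whereas "-t filesystem" and "-t snapshot" each list only that one type:

      ```shell
      # Tally dataset types, reading `zfs list -H -o type -t all` output on stdin.
      # Prints one "type: count" line per type, sorted by type name.
      count_types() {
          sort | uniq -c | awk '{ print $2 ": " $1 }'
      }

      # Example:
      #   zfs list -H -o type -t all | count_types
      #   time zfs list -H -o name -t filesystem >/dev/null   # filesystems only
      ```

      Timing the two list commands separately gives a first hint of whether listing itself is slow, or whether the time goes elsewhere in the plugin.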
    • rbw wrote:

      So I have been following this guide and mistakenly added the drives by path. I'm seeing the disappearing-drives issue after a reboot. When I run zpool export poolname and then zpool import poolname, I do get my drives back, but zpool status shows them as still being referenced by path. Is there a way to "alias" /dev/sdx to a certain UUID every time, or should I back up the data and then recreate the pool?

      Thanks in advance!


      Have you tested with the proper path, as outlined here?

      zfsonlinux.org/faq.html#HowDoIChangeNamesOnAnExistingPool
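
      As far as I understand that FAQ entry, the idea is to export the pool and re-import it while pointing zpool at the directory whose device names you want it to use, so recreating the pool should not be necessary. A minimal sketch, where the pool name tank, the helper name reimport_by_id and the DRY_RUN switch are all made up for illustration, and /dev/disk/by-id is one possible choice of directory:

      ```shell
      # Re-import a pool so that its disks are referenced by stable names from
      # /dev/disk/by-id instead of /dev/sdX. By default this only prints the
      # commands; set DRY_RUN=0 to actually run them (as root).
      reimport_by_id() {
          pool="$1"
          if [ "${DRY_RUN:-1}" = "1" ]; then
              echo "zpool export $pool"
              echo "zpool import -d /dev/disk/by-id $pool"
          else
              zpool export "$pool" && zpool import -d /dev/disk/by-id "$pool"
          fi
      }

      # Example:
      #   reimport_by_id tank             # show what would be run
      #   DRY_RUN=0 reimport_by_id tank   # do it
      ```

      Anything using the pool (shares, running services) has to be stopped first, otherwise the export will fail because the pool is busy.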