[HOWTO] Install ZFS-Plugin & use ZFS on OMV

  • He is talking about hardware specs, not rumors.


    I wasn't saying that they were rumors.


    Every 12 TB read you get something that is corrupted.


    I have to apologize, I was thinking in gigabytes earlier, not terabytes. But don't forget that you read from multiple disks, not a single one.


    Good controllers stop and tell you of the issue so you can move the data off it, bad controllers ignore it and corrupt the whole array.


    The question is, does that happen on every 12TB? That would technically mean that on nearly every scrub I would destroy a sector on my array.


    Greetings
    David

    "Well... lately this forum has become support for everything except omv" [...] "And is like someone is banning Google from their browsers"


    Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.

    Upload Logfile via WebGUI/CLI
    #openmediavault on freenode IRC | German & English | GMT+1
    Absolutely no Support via PM!

  • The question is, does that happen on every 12TB? That would technically mean that on nearly every scrub I would destroy a sector on my array.

    That's usually just data loss, not hardware damage (although a failed sector is also data loss of course).
    Look at the SMART of the disks, they log it as Reported Uncorrectable Errors, ID 187.


    Like all probabilistic events it is wildly mercurial, you might get 10 in a row within a minute or none whatsoever for years.
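
    To put rough numbers on this: consumer drives are commonly specified at no more than one unrecoverable read error per 10^14 bits read, which works out to about 12.5 TB. The sketch below treats errors as independent events at exactly that rate (an assumption; the rate is an upper bound from typical datasheets, and `ure_probability` is a made-up name) and prints the chance of hitting at least one error for a given amount of data read.

```shell
# Chance of at least one unrecoverable read error (URE) when reading
# a given amount of data, assuming independent errors at a fixed rate.
# $1 = terabytes read (1 TB = 10^12 bytes)
# $2 = bits read per expected error (assumed default: 1e14)
ure_probability() {
    awk -v tb="$1" -v bits_per_error="${2:-1e14}" 'BEGIN {
        expected = tb * 1e12 * 8 / bits_per_error   # expected number of UREs
        printf "%.3f\n", 1 - exp(-expected)          # Poisson: P(at least one)
    }'
}

ure_probability 12.5   # one full read of 12.5 TB at the assumed rate
ure_probability 1      # a 1 TB scrub
```

    Under these assumptions a 12.5 TB read comes out at roughly a 63% chance of at least one error rather than a certainty, which fits the point above: you may see several in a row or none for years.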

  • Look at the SMART of the disks, they log it as Reported Uncorrectable Errors, ID 187.



    Seems like WD Reds do not have that ID active.


    Greetings
    David
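
    When attribute 187 is missing, other counters in the same table can tell a similar story, for example 5 (Reallocated_Sector_Ct), 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable). A minimal sketch, assuming the usual ten-column table printed by `smartctl -A` (ID in column 1, raw value in column 10); `smart_raw` is a made-up helper name, and which attributes a drive reports varies by vendor:

```shell
# Print the raw value of one SMART attribute, or "n/a" if the drive
# does not report it. Reads `smartctl -A` output on stdin.
# $1 = attribute ID (e.g. 187 = Reported_Uncorrect)
smart_raw() {
    awk -v id="$1" '
        $1 == id && NF >= 10 { print $10; found = 1 }
        END { if (!found) print "n/a" }
    '
}

# Usage sketch (the device name is a placeholder):
#   for id in 5 187 197 198; do
#       printf '%s: ' "$id"; smartctl -A /dev/sdX | smart_raw "$id"
#   done
```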

  • Can't execute smartctl like that because I'm using a 3ware controller and the drives are not exposed as /dev/sdx.


    Greetings
    David
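
    smartctl can usually reach drives behind a 3ware card through the controller's own device node, using `-d 3ware,N` where N is the port number. A sketch, assuming the controller shows up as /dev/twa0 (older cards use /dev/twe0, SAS models /dev/twl0); `smart_3ware` and the `SMARTCTL` override are made-up names, the latter only there so the command can be dry-run:

```shell
# Show the SMART attribute table for one drive behind a 3ware controller.
# $1 = port number on the controller
# $2 = controller device node (assumed default: /dev/twa0)
smart_3ware() {
    "${SMARTCTL:-smartctl}" -A -d "3ware,$1" "${2:-/dev/twa0}"
}

# Usage sketch:
#   smart_3ware 0              # drive on port 0 of /dev/twa0
#   smart_3ware 2 /dev/twe0    # drive on port 2 of an older controller
# Setting SMARTCTL=echo beforehand prints the arguments instead of running smartctl.
```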

  • I don't have a solution for this unfortunately. I would guess that the timeout is specified somewhere in the "core" code of OMV. I'll have to test setting up a scenario similar to the one you describe in a virtualized environment to see if I can reproduce the problem. However, I won't be able to do this in the next couple of days.


    Thank you. I'll wait for your test result. In the meantime I took a stab in the dark and changed every timeout value I could find in the *php.ini files and the nginx / OMV configuration to 300 (seconds). Unfortunately this didn't help :(


    I've got over 300 datasets and many of them have tens of snapshots. The current timeout is not long enough to build this list.

  • I don't have a solution for this unfortunately. I would guess that the timeout is specified somewhere in the "core" code of OMV.

    While this isn't my field and I can't say I know OMV well, I have run face-first into similar issues elsewhere.


    I suspect that his issue is similar to the one in the thread "Constant WebGUI timeout errors",
    which I suspect has something to do with nginx's proxy_pass settings, i.e. the fact that the web UI talks to other web UIs or local proxies to run its various components, windows and so on.
    It's not monolithic.
    An example of what I mean, used for direct access to the daemons or components, is the thread "Daemon webUI access through prefix/hostname [Nginx] [Proxy_pass]".


    If you look at the logs in that thread, there is a "server" called openmediavault-gui (obvious function) and a client called ::ffff:10.0.0.113, which is a local address. A timeout in the communication between them causes the web interface to fail.


    Can someone who has this "too much stuff, loading fails" issue fetch their nginx error logs?
    It would be useful for confirming my suspicions.
    Since I have no idea where the main log is, or where any secondary logs are (if there are any),
    do this to find them:

    Code
    ps axu | grep nginx

    This should give you the numeric process ID of the nginx process (the second column of the output).

    Code
    lsof -p NUMERIC_ID | grep log

    Of course, substitute the numeric ID you found earlier for NUMERIC_ID, which is just a placeholder.


    This should return all log files opened by nginx, i.e. all log files that may have errors in them.

    Code
    cat /path/to/file

    to show contents in terminal.
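
    The three steps above can be folded into one pipeline. A sketch, assuming `pgrep` and `lsof` are installed and that the log files end in `.log`; `open_logs` is a made-up helper name:

```shell
# Print the unique log files held open by a set of processes.
# Reads `lsof` output on stdin; the file name is the last column.
open_logs() {
    awk '$NF ~ /\.log$/ { print $NF }' | sort -u
}

# Usage sketch (run as root to see files opened by other users):
#   lsof -p "$(pgrep -d, nginx)" | open_logs
#   tail -n 50 /path/to/file      # then look at the end of each log
```

    `pgrep -d,` joins all nginx process IDs with commas, which is the list form `lsof -p` accepts, so master and worker processes are covered in one call.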


    The main log file is of course the most likely to contain these errors, but I don't know for sure.
    I'm not near an OMV box at the moment, so I cannot look at mine.


    Also check openmediavault-webgui_error.log.


    Possible solutions involve configuring nginx to increase the timeout of the proxy instances, because changing the timeout of the server instance (which is what you did with the settings you changed, and what was done in that thread) does not carry over to the proxy, or vice versa.
    see here http://www.beyondlinux.com/201…inx-504-gateway-time-out/
    or here http://howtounix.info/howto/11…-timed-out-error-in-nginx
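
    Before changing anything, it helps to see which timeout directives are set at all, and in which file. A read-only sketch, assuming the configuration lives under /etc/nginx; whether the OMV web UI goes through `proxy_pass` or `fastcgi_pass` is not confirmed in this thread, so it looks for both families of directives. `nginx_timeouts` is a made-up helper name:

```shell
# List every proxy_*/fastcgi_* timeout directive under an nginx config tree,
# with file name and line number. Makes no changes.
# $1 = config directory (assumed default: /etc/nginx)
nginx_timeouts() {
    grep -rnE '(proxy|fastcgi)_(connect|send|read)_timeout' "${1:-/etc/nginx}"
}

# Usage sketch:
#   nginx_timeouts                 # search /etc/nginx
#   nginx_timeouts /some/other/dir # search a copy of the config
```

    If nothing is printed, none of these directives is set and the nginx defaults apply (60 seconds for the read timeouts).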


    While I don't encourage anyone to jump the gun and try these settings blind, if you feel brave or foolish enough (or have backups and the time to restore them), the result would be another piece of evidence for my suspicion.

    I'm probably unable to fix this in the source myself. A proper fix would need logic that checks the number of folders and increases the timeout accordingly; I doubt that giving everyone a flat timeout of 9999999999 for these things is good practice, and I don't know OMV well enough to do that at the moment. But if there is some evidence that this is the culprit, I (or anyone) can open a bug report in the bugtracker so Volker can act and fix it.


    And of course we could sticky a thread with the issue and the manual fix until this gets patched.

  • See attached video. You can see the name scroll across the field. Each time (I tried 5 times), it starts an fpm thread that chews up a lot of CPU until a timeout or even a segfault happens. I assume it happens because each time the name moves, it calls a PHP function. This is a problem with the plugin and OMV 2.x/ExtJS 5.1.


    Also, if I remember correctly, the filesystem backend is a little slow. So the more ZFS nodes (or whatever) you have, the longer it takes, and it can time out if there are enough of them.

    Files

    • zfs.zip

      (755.59 kB)

    omv 5.5.12 usul | 64 bit | 5.4 proxmox kernel | omvextrasorg 5.4.2
    omv-extras.org plugins source code and issue tracker - github


    Please read this before posting a question.
    Please don't PM for support... Too many PMs!

  • So I have been following this guide and mistakenly added the drives by path. I'm seeing the disappearing-drives issue after a reboot. When I run zpool export poolname and then zpool import poolname, I do get my drives back, but the status output still shows them as referenced by path. Is there a way to "alias" /dev/sdx to a certain UUID every time, or should I back up the data and then recreate the pool?


    Thanks in advance!

  • bobafetthotmail, I'll have a look at your suggestion in detail when I get a chance. I can afford to lose this data, so that is not a concern.


    Quote

    You can see the name scroll across the field. Each time (I tried 5 times), it starts an fpm thread that chews up a lot of CPU until a timeout or even a segfault happens. I assume it happens because each time the name moves, it calls a PHP function. This is a problem with the plugin and OMV 2.x/ExtJS 5.1.


    You may be onto something there. Sometimes when I try to add a shared folder, the console comes up with the following:

    Code
    Message from syslogd@adm-nas-svr at May 15 10:15:35 ...
    kernel:[231479.736595] VERIFY3(nvlist_pack(nvl, &packed, sizep, 0, 0x0000) == 0) failed (14 == 0)
    Message from syslogd@adm-nas-svr at May 15 10:15:35 ...
    kernel:[231479.736635] PANIC at fnvpair.c:81:fnvlist_pack()


    I looked at the nginx logs but could not see anything relevant. All I can see is in the file /var/log/nginx/openmediavault-webgui_access.log (all other nginx log files are empty); see below.


    The FileSystem tab does seem to be able to list all the ZFS filesystems after about 60 seconds. Is there a way to change the add share section so that it does not parse the ZFS snapshots? That should leave much less to parse through.


    Another thing: the ZFS tab does not list my zpool! (It is just empty, as if no zpools were available.)

  • Now I'm just thinking about my problem. I'm guessing that the create share page tries to list all the filesystems using something like "zfs list -t snapshot", which would list all the ZFS filesystems plus the snapshots.


    If that is the case, is it possible for me to temporarily change it so that it only lists the filesystems and not the snapshots? If I can do this, it will get me going with creating the share. I only need to create the share once and won't need to touch it again for quite a while.


    If it is possible to do the above, where would I find the bit of code / line to change?

  • I'm working on an answer, but it will be quite long so it will take some time to compose :-)


    Update: I've sent an e-mail with some thoughts on the issue.

  • See attached video. You can see the name scroll across the field. Each time (I tried 5 times), it starts an fpm thread that chews up a lot of CPU until a timeout or even a segfault happens. I assume it happens because each time the name moves, it calls a PHP function. This is a problem with the plugin and OMV 2.x/ExtJS 5.1.


    Also, if I remember correctly, the filesystem backend is a little slow. So the more ZFS nodes (or whatever) you have, the longer it takes, and it can time out if there are enough of them.


    This looks a bit odd :-) Someone with ExtJS experience might want to look at it...

  • Now I'm just thinking about my problem. I'm guessing that the create share page tries to list all the filesystems using something like "zfs list -t snapshot", which would list all the ZFS filesystems plus the snapshots.


    If that is the case, is it possible for me to temporarily change it so that it only lists the filesystems and not the snapshots? If I can do this, it will get me going with creating the share. I only need to create the share once and won't need to touch it again for quite a while.


    If it is possible to do the above, where would I find the bit of code / line to change?


    As I mentioned in my e-mail I don't think that the number of Snapshots is an issue, but the number of Datasets could probably be.
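
    A quick way to see which of the two is large on a given system is to count them separately. A sketch using `zfs list` with `-H` (no header) and `-t` (type filter); note that `-t snapshot` on its own returns only snapshots, and `-t all` is needed to get everything. `zfs_count` and the `ZFS_BIN` override are made-up names for illustration:

```shell
# Count ZFS objects of one type: filesystem, volume or snapshot.
# ZFS_BIN is a hypothetical override so the function can be tried
# without a real pool; by default the real `zfs` command is used.
zfs_count() {
    "${ZFS_BIN:-zfs}" list -H -o name -t "$1" | wc -l | tr -d ' '
}

# Usage sketch:
#   echo "filesystems: $(zfs_count filesystem)"
#   echo "snapshots:   $(zfs_count snapshot)"
#   time zfs list -H -o name -t all > /dev/null   # how long a full listing takes
```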

  • So I have been following this guide and mistakenly added the drives by path. I'm seeing the disappearing-drives issue after a reboot. When I run zpool export poolname and then zpool import poolname, I do get my drives back, but the status output still shows them as referenced by path. Is there a way to "alias" /dev/sdx to a certain UUID every time, or should I back up the data and then recreate the pool?


    Thanks in advance!


    Have you tested with the proper path, as outlined here?


    http://zfsonlinux.org/faq.html…angeNamesOnAnExistingPool
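
    The approach in that FAQ entry does not require recreating the pool: export it, then import it again while pointing ZFS at a directory with persistent device names. A sketch, assuming the pool is not in use at the time; `reimport_by_id` and the `ZPOOL_BIN` override are made-up names, and `poolname` is a placeholder:

```shell
# Re-import a pool so its member disks are referenced by persistent IDs.
# $1 = pool name
# ZPOOL_BIN is a hypothetical override for dry runs (e.g. ZPOOL_BIN=echo).
reimport_by_id() {
    zp="${ZPOOL_BIN:-zpool}"
    "$zp" export "$1" &&
    "$zp" import -d /dev/disk/by-id "$1"
}

# Usage sketch:
#   reimport_by_id poolname && zpool status poolname
```

    The import only runs if the export succeeded, so a busy pool is left as it was.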
