ZFS scrub job uses a lot of system resources -> email notifications -> monit alert -- Resource limit matched

  • Hi guys,


    tonight my openmediavault server alerted me that the load average had matched the resource limit. I got roughly 30 emails.


    So I googled around and found this thread: "monit alert -- Resource limit matched/succeeded localhost". But that thread doesn't really help, because I have a high load average, not high space usage.


    I suspected that a scrub job might be running. zpool status gives me the following output:
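
    (The actual output was posted as an image; for illustration only, a scrub in progress reports something like this on the "scan:" line -- the pool name and numbers here are made up:)

    Code
    root@omv4:~# zpool status
      pool: tank
     state: ONLINE
      scan: scrub in progress since Sun Nov  4 00:24:01 2018
            1.21T scanned out of 40.2T at 450M/s, 25h12m to go
            0B repaired, 3.01% done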


    BINGO! top gives me the following output:


    I did not use "top" very often in the past, because I never had resource problems. I did not change the applications I use my openmediavault for. A little bit of home automation (fhem server), docker, tvheadend, unifi and emby.


    But I did update the kernel, ZFS and openmediavault. I use the latest kernel from backports. Maybe something changed there.



    Code
    root@omv4:~# uname -a
    Linux omv4 4.18.0-0.bpo.1-amd64 #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13) x86_64 GNU/Linux
    Code
    root@omv4:~# dpkg -l | grep zfs
    ii  libzfs2linux                        0.7.11-1~bpo9+1                amd64        OpenZFS filesystem library for Linux
    ii  openmediavault-zfs                  4.0.4                          amd64        OpenMediaVault plugin for ZFS
    ii  zfs-dkms                            0.7.11-1~bpo9+1                all          OpenZFS filesystem kernel modules for Linux
    ii  zfs-zed                             0.7.11-1~bpo9+1                amd64        OpenZFS Event Daemon
    ii  zfsutils-linux                      0.7.11-1~bpo9+1                amd64        command-line tools to manage OpenZFS filesystems

    Here is a grep of the monit messages in my syslog:



    Here are some screenshots of the load average:




    So, the high load average started tonight at midnight. In the "by year" graph you can see that there is a high load average every month, because a scrub job is scheduled once a month. In June 2018 I replaced all the disks of my ZFS pool with larger ones, which also took a lot of system resources.
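
    (For context: on Debian the monthly scrub is typically scheduled by the cron job that ships with zfsutils-linux; the exact contents vary between versions, but it looks roughly like this:)

    Code
    root@omv4:~# cat /etc/cron.d/zfsutils-linux
    PATH=/usr/bin:/bin:/usr/sbin:/sbin
    # Scrub the second Sunday of every month.
    24 0 8-14 * * root [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ] && /usr/lib/zfs-linux/scrub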


    What do you think I should do?


    • Increase the load average threshold to reduce the notification emails. Where can I do this?
    • Decrease the resources the scrub job is allowed to use. Where can I do this? (See the sketch below for both.)
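
    (A sketch of both options, as far as I understand them -- the monit syntax below is standard, but OMV generates its monit config, so manual edits may be overwritten; the scrub throttle is the zfs_scrub_delay module parameter that comes up again further down:)

    Code
    # 1) monit: raise the load-average threshold (standard monit syntax;
    #    note that OMV auto-generates its monit config files)
    check system $HOST
        if loadavg (5min) > 8 for 3 cycles then alert

    # 2) ZoL 0.7.x: throttle scrub I/O by raising the per-I/O delay (in ticks)
    echo 8 > /sys/module/zfs/parameters/zfs_scrub_delay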


    Thanks for helping!


    Regards Hoppel

    ----------------------------------------------------------------------------------
    openmediavault 6 | proxmox kernel | zfs | docker | kvm
    supermicro x11ssh-ctf | xeon E3-1240L-v5 | 64gb ecc | 8x10tb wd red | digital devices max s8
    ---------------------------------------------------------------------------------------------------------------------------------------


  • Some more information about my system and its usage:



    My Xeon is not the most powerful, but it should be enough for my use cases. ;)




    There is also a lot of free memory:



    In my opinion, the CPU and the memory are not the problem.


    Regards Hoppel

    ----------------------------------------------------------------------------------
    openmediavault 6 | proxmox kernel | zfs | docker | kvm
    supermicro x11ssh-ctf | xeon E3-1240L-v5 | 64gb ecc | 8x10tb wd red | digital devices max s8
    ---------------------------------------------------------------------------------------------------------------------------------------

  • There are a lot of tunables for ZoL (ZFS on Linux):


    I found this issue from 2012 on GitHub, where the developer @behlendorf listed some of the relevant tunables for scrub I/O performance:


    There are tunables for this, however we haven't gone to any great lengths to tune each to the exact right value. The current settings were brought over from OpenSolaris and may not be exactly right for Linux. Any feedback you can provide on what the defaults should be would be helpful.

    Code
    int zfs_top_maxinflight = 32;        /* maximum I/Os per top-level */
    int zfs_resilver_delay = 2;          /* number of ticks to delay resilver */
    int zfs_scrub_delay = 4;             /* number of ticks to delay scrub */
    int zfs_scan_idle = 50;              /* idle window in clock ticks */
    int zfs_scan_min_time_ms = 1000;     /* min millisecs to scrub per txg */
    int zfs_free_min_time_ms = 1000;     /* min millisecs to free per txg */
    int zfs_resilver_min_time_ms = 3000; /* min millisecs to resilver per txg */
    int zfs_no_scrub_io = B_FALSE;       /* set to disable scrub i/o */
    int zfs_no_scrub_prefetch = B_FALSE; /* set to disable scrub prefetching */


    But I do not understand what these tunables are for. So, first I had a look at my configuration:
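
    (A command-line way to check them -- on ZoL the module parameters are exposed under /sys/module/zfs/parameters. The values shown are the defaults from the snippet above, which is exactly what my system reports:)

    Code
    root@omv4:~# cd /sys/module/zfs/parameters
    root@omv4:/sys/module/zfs/parameters# grep . zfs_top_maxinflight zfs_resilver_delay zfs_scrub_delay zfs_scan_idle zfs_scan_min_time_ms zfs_resilver_min_time_ms
    zfs_top_maxinflight:32
    zfs_resilver_delay:2
    zfs_scrub_delay:4
    zfs_scan_idle:50
    zfs_scan_min_time_ms:1000
    zfs_resilver_min_time_ms:3000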




    Since behlendorf commented on the issue in 2012, nothing has changed in the default configuration values. My configuration uses the defaults for all of these tunables.



    Do you know of any documentation where these tunables are described in more detail?


    How did you tune your zpool?



    Regards Hoppel

    ----------------------------------------------------------------------------------
    openmediavault 6 | proxmox kernel | zfs | docker | kvm
    supermicro x11ssh-ctf | xeon E3-1240L-v5 | 64gb ecc | 8x10tb wd red | digital devices max s8
    ---------------------------------------------------------------------------------------------------------------------------------------

    • Official post

    I haven't used ZFS for some time now, not since I used nas4free. However, I did find some info based on your title, and the problem could be related to the ZFS ARC cache.


    But I did find this site, which has further links within the article that may be of some use.
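
    (If the ARC does turn out to be the culprit, the usual knob on ZoL is the zfs_arc_max module parameter -- just a sketch, the 16 GiB value is only an example:)

    Code
    # cap the ARC at 16 GiB (value in bytes) at runtime
    echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max
    # make the cap persistent across reboots
    echo "options zfs zfs_arc_max=17179869184" > /etc/modprobe.d/zfs.conf
    update-initramfs -u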

    • Official post

    That's a huge pool, so I guess you wouldn't want to hear anything about restoring from backup? (Just kidding :) )
    ___________________________________________________________________


    On the e-mails, I simply unchecked the boxes under System, Notification, Notifications tab, System. The rationale behind that decision was the sporadic nuisance e-mails, along with what was being monitored. If something in the system itself (software or hardware) is critical, it's unlikely that an e-mail would give enough of a heads-up to prevent an actual failure.
    I set notifications for storage only, where noting hard drive SMART errors might be useful before a drive fails completely.
    __________________________________________________________________


    What I find notable is the amount of memory in use on your server. In my experience, ZFS runs a large page cache, but actual memory usage is not high. (I'm not running dedup.)
    But, having no experience with a pool the size of yours, I don't know. Perhaps memory usage scales with the size of the pool, or you have other functions running.
    ___________________________________________________________________


    I just set up a ZFS mirror (with 1.5TB of data) a few weeks back, on a box with a 64-bit Atom processor and a mere 4GB of RAM. It's fully up to date with OMV4, kernel 4.18.0 and the latest ZFS plugin (ZoL v0.7.11-1).
    I'm running a scrub manually, and will leave it up for a couple of weeks or so to collect some performance stats for comparison.
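
    (For anyone following along, kicking off and watching a manual scrub is straightforward; "tank" here is a placeholder pool name:)

    Code
    zpool scrub tank       # start the scrub
    zpool status tank      # check progress on the "scan:" line
    zpool scrub -s tank    # stop it, should it get in the way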


    BTW: I don't think your CPU has anything to do with this either.
    When I ran the first scrub on the Atom box, it actually ran faster than on the i3 of my primary server, which has 3 times the RAM. Since the pool size is identical between the two, I attributed the obvious difference in speed to the speed of the drives involved. The Atom box has 7200 RPM drives, versus the i3 with 5400 RPM drives.
    While this is just a single side-by-side comparison, CPU speed and quantity of RAM don't have the impact one might think they would. (I could show you performance stats from the i3 but, since it's still running OMV3, they probably wouldn't apply.)
    ____________________________________


    Edit: the Atom processor box (OMV4), with 4GB RAM, completed a scrub of 1.46TB in 3h16m. The same scrub was done by an i3 (OMV3) with 12GB RAM in 5h00m. Disk speed appears to be more of a factor than CPU or RAM.

  • In my opinion, the CPU and the memory are not the problem


    Sure, it's the concept of average load in general on Linux: http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html


    In other words, there is no problem: simply ignore the notifications, and if you want to find out why the 'average load' is high, I suggest installing the sysstat package and then running 'iostat 60' in parallel the next time the scrub runs.
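
    (Concretely, something like this; add -x if you also want per-device utilization:)

    Code
    apt-get install sysstat
    iostat 60        # one report every 60 seconds while the scrub runs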

    • Official post

    With your pool being raidz, your parallel read/write throughput is much higher than that of my little mirror (effectively one disk).
    While it's obviously not an apples-to-apples comparison, this is what my scrubs look like.


    OMV4 (kernel 4.18.0), 64-bit Atom, 4GB of RAM, ZoL v0.7.11-1, 1.5TB zmirror.






    While not directly applicable, the following weeklies are from OMV 3.0.99, running on an i3 with 12GB RAM and the same mirror (the drives are a bit slower). The 15th was when the last scrub kicked off.



  • Hi guys,


    sorry for the late response. I didn't have much time in the last few weeks. Anyway, I want to thank you for your answers!


    That's a huge pool, so I guess you wouldn't want to hear anything about restoring from backup? (Just kidding )

    Harharrrr ;)

    On the e-mails, I simply unchecked the boxes under System, Notification, Notifications tab, System. The rationale behind that decision was the sporadic nuisance e-mails, along with what was being monitored. If something in the system itself (software or hardware) is critical, it's unlikely that an e-mail would give enough of a heads-up to prevent an actual failure.
    I set notifications for storage only, where noting hard drive SMART errors might be useful before a drive fails completely.

    OK, but I like the idea that my system informs me about any problem. I don't like the idea of unchecking the boxes under the notifications tab.


    What I find notable is the amount of memory in use on your server. In my experience, ZFS runs a large page cache, but actual memory usage is not high. (I'm not running dedup.)
    But, having no experience with a pool the size of yours, I don't know. Perhaps memory usage scales with the size of the pool, or you have other functions running.

    Yeah, I also noticed that my server uses a large amount of memory, and I am also not running dedup. But this was already the case before I replaced the 8x4TB WD Reds with 8x10TB WD Reds in my raid-z2 in June of this year. On the other hand, it's only about 50% of my RAM:



    Code
    root@omv4:~# free -h
                  total        used        free      shared  buff/cache   available
    Mem:            62G         29G         30G         43M        3,5G         33G
    Swap:           63G          0B         63G

    Do you know a command to check what exactly is using the RAM?
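
    (The best candidate I can think of is the ARC itself: on ZoL it is counted as "used" rather than "buff/cache" by free. Assuming the standard kstat interface, its current size and limit can be read like this:)

    Code
    # current ARC size and limit, in bytes (columns: name, type, value)
    grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats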



    With your pool being raidz, your parallel read/write throughput is much higher than that of my little mirror (effectively one disk).
    While it's obviously not an apples-to-apples comparison, this is what my scrubs look like.

    Again, I want to thank you for all the work you invested in this.


    My WD Reds are also 5400 RPM disks, but I have 8 of them. So, yes... my parallel read/write speed is much higher than with a mirror pool, and it's not really comparable.


    Sure, it's the concept of average load in general on Linux: brendangregg.com/blog/2017-08-08/linux-load-averages.html


    In other words, there is no problem: simply ignore the notifications, and if you want to find out why the 'average load' is high, I suggest installing the sysstat package and then running 'iostat 60' in parallel the next time the scrub runs.

    I will have a look at the link you posted and at sysstat. I also understand that there is not really a problem. But my OMV informed me about the high load average with 30 emails, which is simply too much. As I understand it now, I have to disable the checkbox, after which I won't see any emails about this again, but it's not possible to just reduce the number of emails.


    It would be great if anybody has answers to both of the questions from my first post:



    Thanks and regards Hoppel

    ----------------------------------------------------------------------------------
    openmediavault 6 | proxmox kernel | zfs | docker | kvm
    supermicro x11ssh-ctf | xeon E3-1240L-v5 | 64gb ecc | 8x10tb wd red | digital devices max s8
    ---------------------------------------------------------------------------------------------------------------------------------------

  • Hi guys,


    this issue solved itself. The only thing that changed was a kernel update from 4.18 to 4.19. I no longer get any emails from monit regarding "resource limit succeeded". Two scrubs have already run with kernel 4.19 installed.


    Thanks for all your suggestions and your help.


    Regards Hoppel

    ----------------------------------------------------------------------------------
    openmediavault 6 | proxmox kernel | zfs | docker | kvm
    supermicro x11ssh-ctf | xeon E3-1240L-v5 | 64gb ecc | 8x10tb wd red | digital devices max s8
    ---------------------------------------------------------------------------------------------------------------------------------------
