Upgraded CPU, load average monitoring parameters not updated

  • I recently upgraded the CPU in my Precision Tower 5810 from a Xeon E5-1603 v3 (4 cores, no HT) to an E5-2690 v4 (14 cores + HT). It's amazing what you can get for $25 on eBay.


    The upgrade went great and it solved a performance issue I was having. Before upgrading, I was occasionally receiving loadavg (1min) > 8 and loadavg (5min) > 4 monitoring alerts.


    However, after upgrading I still occasionally receive loadavg monitoring alerts. The puzzling part is that the alert thresholds are still loadavg (1min) > 8 and loadavg (5min) > 4, whereas I would expect loadavg (1min) > 56 and loadavg (5min) > 28, because /proc/cpuinfo now shows 28 processor units.


    The monitoring code (link) is:

    Code
        if loadavg (1min) > {{ grains['num_cpus'] * loadavg_1min_mult | float(1.0) | round(1) }} for {{ loadavg_1min_cycles }} cycles then alert
        if loadavg (5min) > {{ grains['num_cpus'] * loadavg_5min_mult | float(1.0) | round(1) }} for {{ loadavg_5min_cycles }} cycles then alert
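
To make the expected numbers concrete, here is a minimal Python sketch of the arithmetic the template appears to perform. It assumes that Jinja applies the filters to the multiplier before the multiplication, and that the multipliers are 2 and 1; those values are only inferred from the old 8 / 4 thresholds on the 4-core CPU, not read from the actual configuration. The function name loadavg_threshold is made up for illustration.

```python
def loadavg_threshold(num_cpus: int, multiplier: float) -> float:
    """Mirror the template expression: num_cpus * round(float(multiplier), 1)."""
    return num_cpus * round(float(multiplier), 1)


# Assumed multipliers (2 for 1min, 1 for 5min), inferred from the old thresholds.
old_thresholds = (loadavg_threshold(4, 2), loadavg_threshold(4, 1))
new_thresholds = (loadavg_threshold(28, 2), loadavg_threshold(28, 1))

print("4 processor units: ", old_thresholds)
print("28 processor units:", new_thresholds)
```

If the rendered configuration still shows the first pair after the upgrade, the template was most likely rendered with the old num_cpus value.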


    I suspect what is happening is that grains['num_cpus'] is cached.
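
One way to test this suspicion is to count the processor entries the kernel reports and compare the result with whatever value the grain currently holds. The sketch below does only the counting and comparison; the helper names count_processors and grain_is_stale are hypothetical, the sample text is an abbreviated stand-in for /proc/cpuinfo, and the cached grain value has to be looked up separately and passed in by hand.

```python
def count_processors(cpuinfo_text: str) -> int:
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def grain_is_stale(cached_num_cpus: int, cpuinfo_text: str) -> bool:
    """Return True when the cached CPU count differs from the kernel's view."""
    return cached_num_cpus != count_processors(cpuinfo_text)


# Abbreviated sample input; on the real machine, read /proc/cpuinfo instead.
sample = "processor\t: 0\nmodel name\t: X\n\nprocessor\t: 1\nmodel name\t: X\n"
print(count_processors(sample))
print(grain_is_stale(4, sample))
```

On the upgraded machine the text would come from open("/proc/cpuinfo").read(), and a cached value of 4 against a count of 28 would confirm that the grain is out of date.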


    I was wondering if I can invoke this code (link) to update the grains info, but I could not find a way to execute it after reading the developer section of the website and searching on the forum.

    Code
    refresh_grains:
      module.run:
        - saltutil.refresh_grains:
          - refresh_pillar: True


    This obviously isn't critical, but my understanding of the code is that the stale 'num_cpus' info is causing false positive monitoring alerts for me.


    Any advice on how to correct this is appreciated!

  • cwlucas41

    Added the label "solved".
