[HowTo] Log disk power states with collectd and Grafana

  • I created a little script to keep track of the disk power states (spindown/spinup) using collectd, InfluxDB and Grafana.
    It is based on this script: https://github.com/collectd/co…ter/contrib/exec-smartctl
    Now I can see in a nice graph which drive was spun up or down, and when.



    Requirements:


    • collectd (to collect the data; is already part of OMV)
    • InfluxDB (to store the data)
    • Grafana (to show the data)

    I will not explain how to install InfluxDB and Grafana. There are plenty of tutorials available.


    First, configure collectd to send its data to your InfluxDB server.
    Create the following file for this, and don't forget to change the IP address.

    EDIT: I forgot to mention that you need to copy https://github.com/collectd/co…/blob/master/src/types.db to /usr/share/collectd/types.db.
    Also see this guide on preparing InfluxDB to receive data from collectd: https://anomaly.io/collectd-metrics-to-influxdb/




    Create a user named 'smart' that has the permission to execute 'smartctl' with sudo.


    Code: /etc/sudoers.d/smart
    Cmnd_Alias SMARTCTL = /usr/sbin/smartctl
    smart ALL = (root) NOPASSWD: SMARTCTL


    Create the following file which will read the power state of your drives.
    Change the list of drives ("[..] sda sdb sdc ...") according to your system.

    If you execute the script manually it will take 60 seconds before the first output is printed.
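The script body is not reproduced in this post. As a rough sketch of what such an exec script could look like: it assumes the check is done with `smartctl -n standby` (so a sleeping disk is not woken up), that 1 means spun up and 0 means standby, and that the loop only starts when `--loop` is passed. The helper names, the drive list, the 1/0 encoding and the `--loop` switch are assumptions, not the original code, and the exact wording smartctl prints may differ between versions. The PUTVAL line follows the output format quoted further down in this thread.

```shell
#!/bin/bash
# Sketch only: helper names, drive list and the 1/0 encoding are assumptions.

DISKS="sda sdb sdc"                                   # adjust to your system
HOST="${COLLECTD_HOSTNAME:-$(hostname 2>/dev/null || echo localhost)}"
INTERVAL="${COLLECTD_INTERVAL:-60}"                   # set by collectd's exec plugin
INTERVAL="${INTERVAL%.*}"                             # drop a fractional part, if any

# Translate smartctl output into a gauge value: 0 = standby, 1 = spun up, U = unknown.
state_from_smartctl() {
  case "$1" in
    *STANDBY*|*SLEEP*) echo 0 ;;
    *ACTIVE*|*IDLE*)   echo 1 ;;
    *)                 echo U ;;
  esac
}

# Print one value in collectd's plain-text protocol.
putval() {  # args: host disk value interval
  echo "PUTVAL $1/exec-$2/gauge-disk_state interval=$4 N:$3"
}

main() {
  # Sleeping first means the first output appears after one full interval.
  while sleep "$INTERVAL"; do
    for disk in $DISKS; do
      # -n standby makes smartctl skip the disk instead of spinning it up.
      out="$(sudo smartctl -n standby -i "/dev/$disk" 2>&1)"
      putval "$HOST" "$disk" "$(state_from_smartctl "$out")" "$INTERVAL"
    done
  done
}

if [ "${1:-}" = "--loop" ]; then
  main
fi
```

Keeping the parsing in small functions makes it possible to check them by hand, for example `state_from_smartctl 'Device is in STANDBY mode, exit(2)'`, without waiting for a full interval.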


    As a bonus, here is the slightly modified version of the original exec-smartctl script.
    This will read the disk temperature only when the disks are not in standby.
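The modified script is not reproduced here either. One way to get the described behaviour, shown as a hedged sketch: pass `-n standby` to `smartctl` so that it refuses to query a disk that is spun down, and report `U` (collectd's "undefined") in that case. The function names are hypothetical, and the sketch assumes the drive exposes a `Temperature_Celsius` attribute with the reading in the tenth column, which not every drive does.

```shell
#!/bin/bash
# Sketch only: function names are assumptions, not the original script.

# Extract the raw value (10th column) of the Temperature_Celsius attribute
# from `smartctl -A` output read on stdin. Prints nothing if it is absent.
temp_from_attributes() {
  awk '$2 == "Temperature_Celsius" { print $10; exit }'
}

# Read the temperature of one disk without waking it up.
# Prints the temperature, or U when the disk is in standby or the
# attribute is not reported.
disk_temp() {  # arg: device name, e.g. sda
  local out temp
  out="$(sudo smartctl -n standby -A "/dev/$1" 2>/dev/null)" || true
  temp="$(printf '%s\n' "$out" | temp_from_attributes)"
  echo "${temp:-U}"
}
```

Calling `disk_temp sda` would then print either a number or `U`, which can be fed into a PUTVAL line.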


    Don't forget to make the scripts executable.
    Now we need to tell collectd to use these scripts.




    Finally, run the following commands to create and activate the new collectd config.


    Bash
    # create config
    sudo omv-mkconf collectd
    # Restart collectd (not sure if necessary)
    sudo systemctl restart collectd.service



    At this point the data should be arriving in your InfluxDB.
    The next step is to create a nice graph for this data.
    To display the graph with discrete values, install this plugin: https://grafana.com/plugins/natel-discrete-panel


    I used this dashboard template as a starting point: https://grafana.com/dashboards/554
    You can import my graphs using this json data.



    Or create the graphs manually according to these screenshots.



    I hope this will be of use for someone.
    And feel free to give feedback in any way. :D

  • Hello,


    I followed your guide. It is well written and it was easy to follow. Thanks!
    It all seems to work; when running
    /usr/share/collectd/exec-disk_state.sh
    I get values after 60s:
    PUTVAL homeserver.fritz.box/exec-sdi/gauge-disk_state interval=60 N:U


    But the dashboards I imported remain empty.
    Also, if I create a new dashboard, I cannot find the data in the dropdown box under 'select measurement'


    I would now like to check whether the data is arriving in InfluxDB. Do you have a
    SELECT * FROM xyz
    like query for me that would help?


    Greetings,
    Hendrik

  • Sorry for the delay.


    I don't know the commands off the top of my head,
    but just get familiar with the CLI of InfluxDB.


    You can list all available measurements (like SQL tables) and their content (values).
    If you do not see your measurements there, there is a problem with writing them.
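For reference, a small sketch of that check with the InfluxDB 1.x `influx` command-line client. The database name `collectd` and the measurement name `exec_value` are assumptions; use whatever `SHOW MEASUREMENTS` actually lists on your system.

```shell
#!/bin/bash
# Sketch only: database and measurement names are assumptions.
DB="collectd"             # hypothetical database name
MEASUREMENT="exec_value"  # assumed name for values sent through the exec plugin

# Build an InfluxQL query returning the newest rows of a measurement.
latest_rows_query() {  # args: measurement limit
  printf 'SELECT * FROM "%s" ORDER BY time DESC LIMIT %d' "$1" "$2"
}

if command -v influx >/dev/null 2>&1; then
  # List all measurements (comparable to SQL tables) in the database.
  influx -database "$DB" -execute 'SHOW MEASUREMENTS' || echo "query failed" >&2
  # Show the five most recent points of one measurement.
  influx -database "$DB" -execute "$(latest_rows_query "$MEASUREMENT" 5)" \
    || echo "query failed" >&2
else
  echo "influx CLI not found; run this on the InfluxDB host" >&2
fi
```

If the measurement list is empty, the data is not reaching InfluxDB at all, and the collectd side is the place to look.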

  • Hello,


    I found a simpler solution, based on yours.


    No need for any InfluxDB plugins.
    Just put this script in the crontab:

    and execute it every five minutes.
    Thanks for the inspiration!
    One more improvement would be to use disk-by-uuid, as the sd* names can change on reboot.
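The cron script itself is missing from this post. The following is only a guess at the building blocks such a script might use: formatting a point in InfluxDB line protocol, sending it to the InfluxDB 1.x HTTP `/write` endpoint with `curl`, and resolving a stable path below `/dev/disk/` to the current sd* name. The URL, database, measurement and tag names, and all function names are assumptions.

```shell
#!/bin/bash
# Sketch only: URL, database, measurement and tag names are assumptions.
INFLUX_URL="${INFLUX_URL:-http://localhost:8086}"
DB="${DB:-collectd}"

# Format one point in InfluxDB line protocol (timestamp left to the server).
line_protocol() {  # args: host disk value
  printf 'disk_state,host=%s,disk=%s value=%s' "$1" "$2" "$3"
}

# Resolve a stable symlink below /dev/disk/ to the current kernel name (e.g. sda).
kernel_name() {  # arg: symlink path
  basename "$(readlink -f "$1")"
}

# Send one point to the InfluxDB 1.x HTTP write endpoint.
write_point() {  # arg: line-protocol string
  curl -s -XPOST "$INFLUX_URL/write?db=$DB" --data-binary "$1"
}
```

A crontab entry such as `*/5 * * * * /usr/local/bin/disk_state_cron.sh` (hypothetical path) would then run a script built from these pieces every five minutes.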


    Greetings,
    Hendrik

  • Hi, does your solution still apply in OMV4, and up to which step did you follow the guide?


    Thanks a lot

  • Currently no, I don't believe Linux has that option, so I cannot access it either.

    Here is an interesting thread about finding out what spins up disks. No idea if you can work that into your plugin, but it would be awesome:
    https://forum.openmediavault.o…y-HDD-waking-up/?pageNo=1


    I installed your plugin yesterday and already saw that both data disks wake up at midnight for a few minutes. That sounds a lot like a daily cron job. I don't have any configured, but I will try to find out more.


    Plugin works great so far! Amazing job!!!

  • Which example is correct (if any)?
    "/dev/sda" or "/sharedfolders/tvshows" or "/dev/disk-by-label/WDRed1"

    /dev/sda is the device
    /sharedfolders/... is a bind mount to the mountpoint
    /dev/disk-by-label... is the mount point.


  • Here I explain how to find the cause, i.e. the process accessing the disks:


    ständiger Plattenzugriff (German thread title: "constant disk access")

    This is an interesting option. I can't think of a way of integrating this into the current functionality right away, but it's worth looking into.
    Maybe I will add an option to log certain drives and view the logs (separate from the current graphs), since I think running this logging permanently would create quite a large log.
    For now, doing this manually is the best option, I think. It also probably shouldn't be something you have running permanently (although I'm no expert).

  • Agreed, this looks like something interesting to add to the plugin.
    I might look at it sometime in the future. I guess that fatrace runs in the foreground (as in, the process has to keep running)?
    That would mean I'd have to find a way to run it in the background and stop it after x amount of time.