Insane CPU usage, or not?

  • Hello folks.

    Last time I was making a backup from my desktop PC to my NAS, and suddenly, after roughly 10 GB had been copied, Samba froze and locked the filesystem. I thought it was a Samba issue, so I switched to SCP (WinSCP) and the same thing happened: the filesystem locked (can't ls, can't stat any file, can't do anything in the folder that was the target of the copy). Due to lack of time, and since I had a spare 1 TB external drive, I ended up rsyncing everything from the external drive to my NAS. That process finished successfully after copying 200 GB.

    Anyway, today I reviewed some stats on the server and found that the I/O wait times are quite high. I don't know if this is common or just a bug in the monitoring software, but it obviously got my attention. I ran iotop and didn't find anything suspicious.

    Maybe this issue is related to the filesystem locks, maybe not, but the NAS doesn't feel so stable.

    NAS specs:

    - Intel Core i5-4440
    - 8 GB of RAM
    - 1TB software RAID1
    - OMV running from USB stick with OMV Flash Memory plugin installed
    - Ethernet connection using Powerline.

  • Well I have the Flash Memory plugin installed. Once I click on "Reset" inside the plugin options, the IO wait drops and returns to normal. Any idea why? Should I keep using this plugin?

    I'm currently uninstalling OMV-Flashmemory to test whether the issue goes away.

  • Well I have the Flash Memory plugin installed

    With active monitoring this is of limited use, since the RRD files need to be written constantly (and on storage with very low random IO performance this will result in a high %iowait). I would test the following:

    sudo apt install sysstat
    iostat 60

    This prints the kernel's stats every 60 seconds and should show the same values/percentages as OMV's monitoring. You will see which devices are affected by IO (most probably your thumb drive, constantly) and can then check whether things improve if you disable OMV's monitoring (which is a great idea anyway with USB thumb drives or other low-end flash, since that storage wears out pretty fast).
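
If installing sysstat is not an option, the same %iowait figure can be approximated straight from /proc/stat. This is only a sketch: the function names `iowait_pct` and `sample_iowait` are made up for illustration, and it assumes the usual Linux layout of the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
#!/bin/sh
# iowait_pct BEFORE AFTER
# Takes two aggregate "cpu ..." lines from /proc/stat and prints the share of
# CPU time spent in iowait between them, as a percentage.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0; for (i = 2; i <= 9; i++) total += $i }   # user..steal
        NR == 1 { t0 = total; w0 = $6 }                       # $6 is iowait
        NR == 2 {
            dt = total - t0
            if (dt > 0) printf "%.1f\n", 100 * ($6 - w0) / dt
            else print "0.0"
        }'
}

# sample_iowait [SECONDS]
# Samples /proc/stat twice, SECONDS apart (default 60), and prints %iowait.
sample_iowait() {
    interval="${1:-60}"
    before=$(grep '^cpu ' /proc/stat)
    sleep "$interval"
    after=$(grep '^cpu ' /proc/stat)
    iowait_pct "$before" "$after"
}
```

Calling `sample_iowait 60` gives one number per minute for the whole machine. Unlike iostat, it does not show which device is responsible.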

  • Found them, I think. Sonarr and Radarr were constantly spiking up to 30% CPU usage even though they are not configured and have empty databases. I stopped their containers and so far so good: after 10 hours it's working just fine and the cron jobs are not missing their deadlines. Besides, I searched on GitHub and found this:

    I'll try to look for a solution. I have Sonarr and Radarr on my VPS too, but I'm not observing this behaviour there. It's probably happening anyway and I just can't detect it, since I don't have any monitoring app. I will probably install Prometheus on my VPS so I can check whether this is happening there too.
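
For a quick check without a full monitoring stack, something like the following can list containers that are busy at a given moment. It is a rough sketch that assumes the containers run under Docker; the helper name `busy_containers` and the 20% threshold are made up for illustration.

```shell
#!/bin/sh
# busy_containers [LIMIT]
# Reads "name cpu%" lines on stdin and prints the names of the containers
# whose CPU percentage is above LIMIT (default 20).
busy_containers() {
    awk -v limit="${1:-20}" '{
        cpu = $2
        sub(/%$/, "", cpu)                 # strip the trailing percent sign
        if (cpu + 0 > limit + 0) print $1
    }'
}

# Intended use (needs a running Docker daemon):
#   docker stats --no-stream --format "{{.Name}} {{.CPUPerc}}" | busy_containers 20
```

A single `--no-stream` snapshot can easily miss short spikes, so it is worth running this a few times or from a loop.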

    If the issue doesn't come back, I will reinstall the OMV-Flashmemory plugin, as it looks like it reduces the idle IO wait a lot.

    Thanks for your help tkaiser.

  • +Flash memory plugin reinstalled
    +Replaced OMV monit with NetData image.
    +CPU average below 5% and IO Wait below 10% when idling.

    Tweaking and deleting Sonarr solved the issue. Now it's time to look into why Sonarr was causing this.

  • Besides, I searched on GitHub and found this:

    BTW: The mechanism to keep the databases in RAM is already present in OMV. You can tweak the flashmemory plugin to provide another path that will then remain in tmpfs. Then you only need to set up a cron job that regularly lets the plugin flush the contents to disk.
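
A minimal sketch of such a cron job follows. It assumes the plugin is backed by folder2ram and that `folder2ram -syncall` is the command that writes the tmpfs contents back to disk, so verify both on your own install first. The file name and the six-hour schedule are hypothetical.

```
# /etc/cron.d/flashmemory-sync   (hypothetical file name)
# Assumption: the flashmemory plugin uses folder2ram, and
# "folder2ram -syncall" flushes the tmpfs contents to disk.
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

# m  h    dom mon dow  user  command
0    */6  *   *   *    root  folder2ram -syncall
```

A shorter interval means less data lost on a power cut but more writes to the stick, which is the trade-off the plugin exists to manage.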

    IO Wait below 10% when idling

    Way too much, and an indication that your USB thumb drive might be really slow at random IO. I would run a quick iozone test, like we do on the ARM OMV images to identify SD cards that really suck performance-wise:

    cd /root
    iozone -e -I -a -s 1M -r 4k -i 0 -i 1 -i 2
