OMV 4 hangs randomly with high load > 100

    • Hi,

      I have been reading here for a while; now I have run into a problem that I cannot solve.

      My system randomly hangs and becomes unresponsive (all Docker containers/services are unreachable). SSH login still works, but if I then run, for example, "top", it freezes as well.


      I think (I haven't verified whether this is always the case) I can make the system responsive again by logging in to the web interface (OMV GUI). Once logged in, the system returns to an idle state and stays at low load.

      So far I couldn't find a single cause for this; I checked the process logs, iotop, etc.

      Is it possible that the GUI will "lock" the system somehow? How can I get logs if the system hangs?

      I currently run an OMV backup with fsarchiver, which utilizes the CPU quite a bit but runs smoothly at a load of 2.2-2.4. I don't know whether this is the cause of the initial high load and the freeze of the other processes.

      My system:

      8GB RAM
      Intel Celeron 1.6 GHz Quad-Core
      2x6TB Data (SATA3)
      1x128GB SSD System (SATA3)

      OS

      No LSB modules are available.
      Distributor ID: Debian
      Description: Debian GNU/Linux 9.6 (stretch)
      Release: 9.6
      Codename: stretch

      OMV

      Release: 4.1.17-1
      Codename: Arrakis

      System

      Linux aries 4.18.0-0.bpo.1-amd64 #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13) x86_64 GNU/Linux

      ps auxf (with normal load): pastebin.com/raw/PvWR8v80

    • Yeah, all Docker services are unavailable when the system "locks", but SSH is accessible (with the issues described above), and logging in to the web UI seems to clear the lock from the system.

      The file system on all disks (storage and system) is ext4.
      The machine is running 24/7, with no power saving on the machine itself.
      The disks have the following modes:
      Storage RAID1 Disk 1: spindown
      Storage RAID1 Disk 2: disabled (I don't know why; maybe I should set this to spindown, too?)
      System disk: disabled (SSD)

      The drives seem fine; what would you describe as "weird"? Temperature is okay, SMART is okay.
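
      For reference, drive health and the current power state can be double-checked from the shell like this (the device names are just examples, adjust them to your setup):

      Source Code

      smartctl -H /dev/sda     # overall SMART health verdict
      smartctl -A /dev/sda     # full attribute table (reallocated/pending sectors etc.)
      hdparm -C /dev/sdb       # current power state: active/idle or standby (spun down)
      hdparm -B /dev/sdb       # APM level, if the drive supports it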

      EDIT: Added a list of my Docker containers (attachment: docker.jpg).

    • Just a quick hint: the concept of 'load average' in Linux is mostly misunderstood because it is confusing. It is not 'CPU utilization'; load also increases once the system is stuck on I/O. Full details: brendangregg.com/blog/2017-08-08/linux-load-averages.html

      If a system is in such a state (waiting for I/O requests to finish), it usually behaves as if it were almost frozen. I would check the logs and the health of the storage.
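
      A rough way to see whether the load comes from runnable tasks or from tasks blocked on I/O (the exact columns depend on the vmstat version):

      Source Code

      # the fourth field of loadavg: currently runnable entities / total entities
      cat /proc/loadavg

      # 'r' = runnable tasks, 'b' = tasks blocked on I/O, 'wa' = % of CPU time spent waiting for I/O
      vmstat 5 3
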
    • Thanks for the hint. I think I understand the concept. I wrote a little check script that runs whenever high load is detected and gives me the iostat output, free memory and process list of my system.
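
      A minimal sketch of such a check script (the threshold, script path and log file are just examples), run from cron every minute:

      Source Code

      #!/bin/bash
      # /usr/local/bin/loadcheck.sh -- capture diagnostics when the 1-minute load is high
      THRESHOLD=10
      LOG=/var/log/loadcheck.log

      LOAD=$(cut -d ' ' -f1 /proc/loadavg)

      # floating-point comparison via awk
      if awk -v l="$LOAD" -v t="$THRESHOLD" 'BEGIN { exit !(l > t) }'; then
          {
              echo "===== $(date) load: $LOAD ====="
              iostat -x 1 3     # extended per-device I/O statistics
              free -m           # memory usage
              ps auxf           # full process tree
          } >> "$LOG"
      fi

      # crontab entry: * * * * * /usr/local/bin/loadcheck.sh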

      I am confused by the spike to a load > 100, which is why I mentioned it.

      Both disks are brand new, the SSD is also new; nothing is older than one month. But I will investigate whether the disks are somehow causing errors.
    • Okay, today I got something new. I tried to access my Nextcloud remotely and got the following logs, plus a lot of emails telling me that nginx (which serves the OMV GUI) crashed, etc.

      Source Code

      Jan 10 11:45:08 aries collectd[1044]: plugin_read_thread: read-function of the `memory' plugin took 111.053 seconds, which is above its read interval (10.000 seconds). You might want to adjust the `Interval' or `ReadThreads' settings.
      Jan 10 12:01:06 aries systemd[1]: Starting Clean php session files...
      Jan 10 12:01:06 aries collectd[1044]: Not sleeping because the next interval is 96.991 seconds in the past!
      Jan 10 12:20:41 aries systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
      Jan 10 12:20:41 aries collectd[1044]: plugin_read_thread: read-function of the `df' plugin took 117.851 seconds, which is above its read interval (10.000 seconds). You might want to adjust the `Interval' or `ReadThreads' settings.
      Jan 10 12:31:01 aries systemd[1]: systemd-journald.service: Killing process 335 (systemd-journal) with signal SIGABRT.
      Jan 10 12:31:01 aries collectd[1044]: plugin_read_thread: read-function of the `interface' plugin took 119.776 seconds, which is above its read interval (10.000 seconds). You might want to adjust the `Interval' or `ReadThreads' settings.
      Jan 10 12:31:01 aries systemd[1]: Starting Flush Journal to Persistent Storage...
      Jan 10 12:31:01 aries postfix/smtpd[7150]: disconnect from localhost[127.0.0.1] ehlo=1 mail=1 rcpt=1 quit=1 commands=4
      Jan 10 12:31:01 aries systemd[1]: Started Flush Journal to Persistent Storage.
      Jan 10 12:31:01 aries monit[1024]: Mail: Error receiving data from the mailserver -- Resource temporarily unavailable
      Jan 10 12:31:01 aries systemd[1]: Started Run anacron jobs.
      Jan 10 12:31:01 aries monit[1024]: Alert handler failed, retry scheduled for next cycle
      Jan 10 12:31:01 aries systemd[1]: anacron.timer: Adding 4min 8.991355s random time.
      Jan 10 12:31:01 aries postfix/smtpd[7150]: connect from localhost[127.0.0.1]
      Jan 10 12:31:01 aries systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
      Jan 10 12:31:01 aries monit[1024]: 'aries' mem usage of 96.9% matches resource limit [mem usage>90.0%]

      And after that I get around 400 lines like this:

      Source Code

      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/memory/memory-used.rrd, [1547116150:7440388096.000000], 1) failed: rrdcached: illegal attempt to update using time 1547116150.000000 when last update time is 1547116161.000000 (minimum one second step) (status=-1)
      Jan 10 12:33:57 aries collectd[1044]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/interface-enp3s0/if_errors.rrd, [1547116151:0:0], 1) failed: rrdcached: illegal attempt to update using time 1547116151.000000 when last update time is 1547116160.000000 (minimum one second step) (status=-1)
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/interface-enp3s0/if_octets.rrd, [1547116151:4157840337:193606983271], 1) failed: rrdcached: illegal attempt to update using time 1547116151.000000 when last update time is 1547116160.000000 (minimum one second step) (status=-1)
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/memory/memory-buffered.rrd, [1547116150:3317760.000000], 1) failed: rrdcached: illegal attempt to update using time 1547116150.000000 when last update time is 1547116161.000000 (minimum one second step) (status=-1)
      Jan 10 12:33:57 aries collectd[1044]: Filter subsystem: Built-in target `write': Some write plugin is back to normal operation. `write' succeeded.
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: Successfully reconnected to RRDCacheD at unix:/var/run/rrdcached.sock
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/cpu-3/cpu-softirq.rrd, [1547116183:63586], 1) failed: rrdcached: illegal attempt to update using time 1547116183.000000 when last update time is 1547116191.000000 (minimum one second step) (status=-1)
      Jan 10 12:33:57 aries collectd[1044]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: Successfully reconnected to RRDCacheD at unix:/var/run/rrdcached.sock
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/cpu-2/cpu-softirq.rrd, [1547116183:149442], 1) failed: rrdcached: illegal attempt to update using time 1547116183.000000 when last update time is 1547116191.000000 (minimum one second step) (status=-1)
      Jan 10 12:33:57 aries collectd[1044]: Filter subsystem: Built-in target `write': Some write plugin is back to normal operation. `write' succeeded.
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: Successfully reconnected to RRDCacheD at unix:/var/run/rrdcached.sock
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/cpu-1/cpu-softirq.rrd, [1547116191:81684], 1) failed: rrdcached: illegal attempt to update using time 1547116191.000000 when last update time is 1547116204.000000 (minimum one second step) (status=-1)
      Jan 10 12:33:57 aries collectd[1044]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: Successfully reconnected to RRDCacheD at unix:/var/run/rrdcached.sock
      Jan 10 12:33:57 aries collectd[1044]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/cpu-1/cpu-steal.rrd, [1547116183:0], 1) failed: rrdcached: illegal attempt to update using time 1547116183.000000 when last update time is 1547116204.000000 (minimum one second step) (status=-1)

      And after that mysqld complains:

      Source Code

      Jan 10 12:34:12 aries mysqld: 2019-01-10 12:34:12 0 [Warning] InnoDB: A long semaphore wait:
      Jan 10 12:34:12 aries mysqld: --Thread 139666301327104 has waited at row0upd.cc line 3109 for 3550.00 seconds the semaphore:
      Jan 10 12:34:12 aries mysqld: X-lock on RW-latch at 0x7f069fdca050 created in file buf0buf.cc line 1638
      Jan 10 12:34:12 aries mysqld: a writer (thread id 0) has reserved it in mode SX
      Jan 10 12:34:12 aries mysqld: number of readers 0, waiters flag 1, lock_word: 10000000
      Jan 10 12:34:12 aries mysqld: Last time write locked in file buf0flu.cc line 1227
      Jan 10 12:34:12 aries mysqld: 2019-01-10 12:34:12 0 [Warning] InnoDB: A long semaphore wait:
      Jan 10 12:34:12 aries mysqld: --Thread 139666634364672 has waited at row0ins.cc line 2626 for 3550.00 seconds the semaphore:
      Jan 10 12:34:12 aries mysqld: SX-lock on RW-latch at 0x7f069ffc25a0 created in file buf0buf.cc line 1638
      Jan 10 12:34:12 aries mysqld: a writer (thread id 0) has reserved it in mode SX
      Jan 10 12:34:12 aries mysqld: number of readers 2, waiters flag 1, lock_word: ffffffe
      Jan 10 12:34:12 aries mysqld: Last time write locked in file buf0flu.cc line 1227
      Jan 10 12:34:12 aries mysqld: 2019-01-10 12:34:12 0 [Warning] InnoDB: A long semaphore wait:
      Jan 10 12:34:12 aries mysqld: --Thread 139666633160448 has waited at row0ins.cc line 2626 for 3548.00 seconds the semaphore:
      Jan 10 12:34:12 aries mysqld: SX-lock on RW-latch at 0x7f069ffc25a0 created in file buf0buf.cc line 1638
      Jan 10 12:34:12 aries mysqld: a writer (thread id 0) has reserved it in mode SX
      Jan 10 12:34:12 aries mysqld: number of readers 2, waiters flag 1, lock_word: ffffffe
      Jan 10 12:34:12 aries mysqld: Last time write locked in file buf0flu.cc line 1227
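
      For reference, the InnoDB side of such a stall can be inspected with the standard status command (run as the database root user, inside the container if MySQL/MariaDB runs in Docker); the SEMAPHORES and FILE I/O sections show which threads are blocked and on what:

      Source Code

      mysql -u root -p -e "SHOW ENGINE INNODB STATUS\G" | less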

      This is my `iostat -x` now with everything running:

      Source Code

      $ iostat -x
      Linux 4.18.0-0.bpo.1-amd64 (aries) 01/10/2019 _x86_64_ (4 CPU)
      avg-cpu: %user %nice %system %iowait %steal %idle
      2.05 0.00 3.03 3.34 0.00 91.57
      Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
      sda 1.88 5.85 10.03 5.74 150.94 64.38 27.31 0.19 14.48 17.79 8.70 1.34 2.11
      sdc 9.63 2.10 129.30 5.73 8936.85 120.34 134.15 0.42 2.33 2.32 2.59 0.39 5.24
      sdb 9.62 2.10 129.72 5.74 8946.45 120.34 133.87 1.80 13.07 11.96 38.16 0.53 7.13
      md0 0.00 0.00 2.94 7.54 261.86 119.60 72.84 0.00 0.00 0.00 0.00 0.00 0.00
      I now suspect Docker is the bottleneck. Since I have all containers running on the default bridge, could that cause delays? Any recommended write-up I can check for such errors?
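
      As a first check on the Docker side, a one-shot snapshot of per-container resource usage and the network each container uses can be taken with the standard commands:

      Source Code

      docker stats --no-stream                                                  # per-container CPU, memory, network and block I/O
      docker network ls                                                         # list the existing Docker networks
      docker inspect -f '{{.Name}} {{.HostConfig.NetworkMode}}' $(docker ps -q) # network mode per running container
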
    • raws99 wrote:

      One logging problem seems to be that, if the system is clogged, the logs are not generated

      In other words: Looking into the provided logs makes no sense?

      Anyway: if this is really the output of iostat 120, then what's going on with your storage? Two disks show constant read activity of 8 MB/s with 130+ transactions per second, while the md0 device shows far less but also constant utilization. Disclaimer: I have no idea what 'normal behavior' should look like, since I don't use mdraid's RAID1 (I consider it close to useless).
    • This is correct: iostat 120 gives me the output posted (I added a new iostat, too).

      This is the output of iotop -oPa -d 2 (it has been running for 30 min or so):

      Source Code

      Total DISK READ : 0.00 B/s | Total DISK WRITE : 85.66 K/s
      Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 124.76 K/s
      PID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
      289 be/3 root 0.00 B 19.85 M 0.00 % 0.34 % [jbd2/sda1-8]
      544 be/3 root 0.00 B 492.00 K 0.00 % 0.07 % [jbd2/md0-8]
      1024 be/4 root 0.00 B 464.00 K 0.00 % 0.04 % monit -c /etc/monit/monitrc
      24437 be/4 openmedi 0.00 B 436.00 K 0.00 % 0.03 % php-fpm: pool openmediavault-webgui
      11225 be/4 root 12.00 K 12.00 K 0.00 % 0.01 % smartd -n --quit=never --interval=1800
      32234 be/4 root 0.00 B 0.00 B 0.00 % 0.03 % [kworker/u8:2-events_unbound]
      6930 be/4 www-data 4.00 K 92.00 K 0.00 % 0.01 % apache2 -DFOREGROUND
      12521 be/4 mysql 124.00 K 46.36 M 0.00 % 0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir~/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
      7082 be/4 www-data 4.00 K 96.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      30853 be/4 root 0.00 B 0.00 B 0.00 % 0.01 % [kworker/u8:1-events_unbound]
      32733 be/4 openmedi 0.00 B 96.00 K 0.00 % 0.01 % php-fpm: pool openmediavault-webgui
      883 be/4 root 60.00 K 2.06 M 0.00 % 0.00 % dockerd -H unix:///var/run/docker.sock
      21738 be/4 root 4.00 K 11.91 M 0.00 % 0.00 % rrdcached -B -F -f 3600 -w 900 -b /var/lib/rrdcached/db/ -j~l/ -p /var/run/rrdcached.pid -l unix:/var/run/rrdcached.sock
      7159 be/4 www-data 40.00 K 88.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      7026 be/4 www-data 4.00 K 88.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      382 be/4 root 224.00 K 0.00 B 0.00 % 0.00 % lvmetad -f
      892 be/4 nobody 0.00 B 936.00 K 0.00 % 0.00 % openvpn --daemon ovpn-server --status /run/openvpn/server.s~ /etc/openvpn/server.conf --writepid /run/openvpn/server.pid
      27181 be/4 root 0.00 B 16.64 M 0.00 % 0.00 % node /opt/pimatic-docker/node_modules/pimatic/pimatic.js restart
      9272 be/4 www-data 0.00 B 816.00 K 0.00 % 0.00 % nginx: worker process
      1378 be/4 www-data 16.00 K 0.00 B 0.00 % 0.00 % php-fpm: pool openmediavault-mysql
      6871 be/4 www-data 0.00 B 88.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      32489 be/4 root 96.00 K 0.00 B 0.00 % 0.00 % docker-gen -watch -notify /app/signal_le_service -wait 15s:~/letsencrypt_service_data.tmpl /app/letsencrypt_service_data
      1435 be/4 root 0.00 B 816.00 K 0.00 % 0.00 % python3 /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -b
      17613 be/4 root 0.00 B 628.00 K 0.00 % 0.00 % rsyslogd -n
      9273 be/4 www-data 8.00 K 424.00 K 0.01 % 0.00 % nginx: worker process
      9274 be/4 www-data 0.00 B 664.00 K 0.00 % 0.00 % nginx: worker process
      9275 be/4 www-data 8.00 K 156.00 K 0.00 % 0.00 % nginx: worker process
      3150 be/4 root 20.00 K 0.00 B 0.00 % 0.00 % containerd-shim -namespace moby -workdir /var/lib/container~sr/bin/containerd -runtime-root /var/run/docker/runtime-runc
      4227 be/4 root 8.00 K 0.00 B 0.00 % 0.00 % docker-gen -watch -notify nginx -s reload /app/nginx.tmpl /etc/nginx/conf.d/default.conf
      6878 be/4 www-data 0.00 B 100.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      3365 be/4 systemd- 0.00 B 8.00 K 0.00 % 0.00 % mosquitto -c /mosquitto/config/mosquitto.conf
      3298 be/4 postfix 0.00 B 12.00 K 0.00 % 0.00 % tlsmgr -l -t unix -u -c
      862 be/4 root 4.00 K 0.00 B 0.00 % 0.00 % containerd
      7025 be/4 www-data 0.00 B 92.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      7037 be/4 www-data 0.00 B 84.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      7084 be/4 www-data 0.00 B 76.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      7155 be/4 www-data 0.00 B 88.00 K 0.00 % 0.00 % apache2 -DFOREGROUND
      1021 be/4 ntp 0.00 B 4.00 K 0.00 % 0.00 % ntpd -p /var/run/ntpd.pid -g -u 10

      It shows me a lot of writing coming just from the ext4 journal thread (jbd2). Could this be the reason for the constant traffic?
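
      The jbd2 threads are only the ext4 journal committing writes issued by other processes, so to see which processes and files actually generate the writes, a tool like fatrace can help (it has to be installed first; the options may differ slightly between versions):

      Source Code

      apt-get install fatrace   # logs file access events system-wide via fanotify
      fatrace -f W -t           # show write events only, with timestamps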


      iostat 120 from the last hour or so: pastebin.com/g885FkAc


      UPDATE:

      Following the high journaling I/O led me to MySQL as the cause. If I follow the guide here: medium.com/@n3d4ti/i-o-wait-at-mysql-import-data-a06d017a2ba I get it down to 0.x% and almost no traffic. But as pointed out in the post, this is not always a good setting for production. So what do you think? I'll leave it this way for 1-2 days to see if it stops the spikes.
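
      For context, guides like that one usually come down to relaxing InnoDB's flush-on-every-commit behaviour. Assuming that is the setting in question, it looks roughly like this; the trade-off is that up to about a second of committed transactions can be lost on a crash or power failure:

      Source Code

      # my.cnf snippet -- assumption: this is the setting the linked guide changes
      [mysqld]
      # 1 = flush and sync the redo log on every commit (default, safest)
      # 2 = write on commit, sync roughly once per second (much less I/O)
      innodb_flush_log_at_trx_commit = 2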

      I was also thinking about my Docker containers: some services log into SQLite databases that are stored on my RAID. Is it good practice to add another SSD to the system for the appdata folder? Since my RAID is constantly being written to, it never spins down.

      UPDATE 2:
      I've found that scrolling through a bunch of images in the Nextcloud app on iOS causes very high load on my system; mostly the Apache processes spike. I run the official Docker image and will investigate further.
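
      If the spikes come from on-the-fly preview/thumbnail generation (an assumption, not verified), the maximum preview size can be capped via occ; "nextcloud" is a placeholder for the actual container name and the values are examples:

      Source Code

      docker exec -u www-data nextcloud php occ config:system:set preview_max_x --value="1024" --type=integer
      docker exec -u www-data nextcloud php occ config:system:set preview_max_y --value="1024" --type=integer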

    • Hello,

      First post here but thought I would share my experience.

      I upgraded from OMV3 to OMV4 yesterday and have been randomly experiencing the same issues as the OP.

      After doing some research, reading the post here, and getting the hint from post #2, I came up with the same result.

      PHP5 was still installed after the OMV4 upgrade.

      To check whether PHP5 is still installed:

      Source Code

      dpkg -l | grep php
      To remove leftovers of PHP5:

      Source Code

      apt-get purge 'php5*'


      Use at your own risk!
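
      To verify afterwards that only the PHP 7.0 packages OMV4 actually uses are left, and that the web UI's PHP-FPM is still running (assuming the stock PHP 7.0 from stretch):

      Source Code

      dpkg -l | grep php            # should only list php7.0-* packages now
      systemctl status php7.0-fpm   # the PHP-FPM service used by the OMV web UI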

      All kudos go to posters in the mentioned post.

      My system has been running smoothly since then: no errors, no high CPU (100% and crashes before), no huge load average (20+ before, 0.03 currently).

      Fingers crossed that everything is back to normal, at least for my system.
    • Great to see you have similar issues. Well, not great, but good to know there's someone else ;) Since my system is currently clogged, I started investigating again.

      No high CPU usage and no processes hogging RAM or CPU, BUT the CPU I/O wait is high at around 24% (the "wa" value in top).
      Now, after about 10 minutes of being clogged (the Docker containers were not processing anything), I noticed my light going off (which is controlled by my smart home), so the system is "free" again and the load drops immediately. I had done nothing but open top and iotop.

      top (clogged)

      Source Code

      top - 22:19:09 up 13 days, 1:44, 2 users, load average: 21,82, 18,59, 12,38
      Tasks: 254 total, 1 running, 204 sleeping, 0 stopped, 1 zombie
      %Cpu(s): 2,4 us, 1,2 sy, 0,0 ni, 71,5 id, 24,6 wa, 0,0 hi, 0,2 si, 0,0 st
      KiB Mem : 7643708 total, 3573692 free, 1095592 used, 2974424 buff/cache
      KiB Swap: 7849980 total, 7440556 free, 409424 used. 6162440 avail Mem
      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      25186 root 20 0 51940 15924 6508 S 8,9 0,2 0:21.98 /usr/bin/python3 /usr/sbin/iotop -oPa -d 2
      25548 root 20 0 46628 3908 3208 R 1,3 0,1 0:00.23 top
      1044 root 20 0 971472 2196 1372 S 0,7 0,0 42:33.04 /usr/sbin/collectd
      20760 root 20 0 16376 3312 1920 S 0,7 0,0 23:40.97 docker-gen -watch -notify nginx -s reload /app/nginx.tmpl /etc/nginx/conf.d/default.conf
      21073 kris 20 0 30912 8136 684 S 0,7 0,1 13:03.13 redis-server
      21111 root 20 0 17192 5804 4240 S 0,7 0,1 19:12.24 docker-gen -watch -notify /app/signal_le_service -wait 15s:60s /app/letsencrypt_service+
      10 root 20 0 0 0 0 I 0,3 0,0 28:59.16 [rcu_sched]
      3189 mysql 20 0 2792216 271180 9148 S 0,3 3,5 543:14.88 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/pl+
      3365 systemd+ 20 0 9420 1000 820 S 0,3 0,0 22:33.65 /usr/sbin/mosquitto -c /mosquitto/config/mosquitto.conf
      31351 root 20 0 10744 1284 804 S 0,3 0,0 6:24.22 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.l+
      31436 kris 20 0 1266008 65756 8864 S 0,3 0,9 33:59.11 node-red
      31614 root 20 0 0 0 0 S 0,3 0,0 21:31.91 [kdvb-ad-1-fe-0]
      31616 root 20 0 0 0 0 S 0,3 0,0 21:34.00 [kdvb-ad-0-fe-0]
      1 root 20 0 205144 5948 4096 S 0,0 0,1 4:15.22 /sbin/init
      2 root 20 0 0 0 0 S 0,0 0,0 0:00.86 [kthreadd]
      3 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [rcu_gp]
      4 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [rcu_par_gp]
      6 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [kworker/0:0H-kb]
      8 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [mm_percpu_wq]
      9 root 20 0 0 0 0 S 0,0 0,0 2:33.62 [ksoftirqd/0]
      11 root 20 0 0 0 0 I 0,0 0,0 0:00.00 [rcu_bh]
      12 root rt 0 0 0 0 S 0,0 0,0 0:01.94 [migration/0]
      13 root rt 0 0 0 0 S 0,0 0,0 0:09.35 [watchdog/0]
      14 root 20 0 0 0 0 S 0,0 0,0 0:00.00 [cpuhp/0]
      15 root 20 0 0 0 0 S 0,0 0,0 0:00.00 [cpuhp/1]
      16 root rt 0 0 0 0 S 0,0 0,0 0:08.65 [watchdog/1]
      ....

      Healthy top:

      Source Code

      top - 22:23:29 up 13 days, 1:48, 2 users, load average: 0,91, 9,94, 10,51
      Tasks: 234 total, 1 running, 184 sleeping, 0 stopped, 1 zombie
      %Cpu(s): 1,5 us, 1,0 sy, 0,0 ni, 97,4 id, 0,1 wa, 0,0 hi, 0,0 si, 0,0 st
      KiB Mem : 7643708 total, 3621904 free, 1072232 used, 2949572 buff/cache
      KiB Swap: 7849980 total, 7440556 free, 409424 used. 6211944 avail Mem
      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      25186 root 20 0 51940 15948 6508 S 3,9 0,2 0:38.21 /usr/bin/python3 /usr/sbin/iotop -oPa -d 2
      26148 root 20 0 46628 3856 3100 R 1,6 0,1 0:00.18 top
      27181 root 20 0 1373624 177056 8588 S 1,3 2,3 76:33.29 /usr/bin/node /opt/pimatic-docker/node_modules/pimatic/pimatic.js restart
      20760 root 20 0 16376 3576 1948 S 0,7 0,0 23:43.07 docker-gen -watch -notify nginx -s reload /app/nginx.tmpl /etc/nginx/conf.d/default.conf
      21111 root 20 0 17192 5804 4240 S 0,7 0,1 19:13.83 docker-gen -watch -notify /app/signal_le_service -wait 15s:60s /app/letsencrypt_service+
      10 root 20 0 0 0 0 I 0,3 0,0 29:00.03 [rcu_sched]
      52 root 20 0 0 0 0 I 0,3 0,0 7:31.98 [kworker/3:1-eve]
      883 root 20 0 1336100 41188 12124 S 0,3 0,5 51:00.64 /usr/bin/dockerd -H unix:///var/run/docker.sock
      1044 root 20 0 971472 2196 1372 S 0,3 0,0 42:33.81 /usr/sbin/collectd
      2683 root 20 0 327268 3324 2844 S 0,3 0,0 2:21.47 omv-engined
      3189 mysql 20 0 2792216 271180 9148 S 0,3 3,5 543:15.55 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/pl+
      8503 root 20 0 70688 7700 7492 S 0,3 0,1 1:26.15 /lib/systemd/systemd-journald
      17613 root 20 0 254244 2292 1932 S 0,3 0,0 0:35.65 /usr/sbin/rsyslogd -n
      21073 kris 20 0 30912 8136 684 S 0,3 0,1 13:04.31 redis-server
      31585 kris 20 0 730952 20300 184 S 0,3 0,3 159:26.74 /usr/bin/tvheadend -C -c /config
      31614 root 20 0 0 0 0 S 0,3 0,0 21:32.35 [kdvb-ad-1-fe-0]
      31616 root 20 0 0 0 0 S 0,3 0,0 21:34.50 [kdvb-ad-0-fe-0]
      1 root 20 0 205144 5948 4096 S 0,0 0,1 4:15.26 /sbin/init
      2 root 20 0 0 0 0 S 0,0 0,0 0:00.86 [kthreadd]
      3 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [rcu_gp]
      4 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [rcu_par_gp]
      6 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [kworker/0:0H-kb]
      8 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 [mm_percpu_wq]
      9 root 20 0 0 0 0 S 0,0 0,0 2:33.63 [ksoftirqd/0]
      ....

      How can the waiting be investigated further? I checked iotop (nothing special, not much writing). I'll let iostat 120 run overnight to see if there is anything useful in it.
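
      One way to dig deeper the next time it clogs is to look at which tasks are sitting in uninterruptible sleep and to dump their kernel stacks (the sysrq interface must be enabled; the output goes to the kernel log):

      Source Code

      # tasks in "D" state (uninterruptible sleep) and the kernel function they are waiting in
      ps -eo state,pid,wchan:32,cmd | awk '$1 ~ /^D/'

      # dump stack traces of all blocked tasks to the kernel log via magic sysrq
      echo 1 > /proc/sys/kernel/sysrq
      echo w > /proc/sysrq-trigger
      dmesg | tail -n 100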