Cannot access Omv after some uptime, every service is unreachable

    • Cannot access Omv after some uptime, every service is unreachable

      Hi everyone,

      For a couple of months now, I've been having some weird issues with OMV 4.

      Right after I boot the machine everything works fine: I can reach the SMB shares, SSH into it, log in through the web interface, check Transmission, everything's good.
      BUT once the machine has been up for some time, it becomes "unreachable".

      I put it in quotes because the machine is in fact online: I can ping it and I can see it listed as connected on the modem, but I cannot log in or use any of its services.
      For example:

      1. I try to log in through the web interface -> Firefox doesn't complain about being unable to reach it (so the login site "is present"), but the page is blank: no banner, nothing.
      2. I try to SSH into it, and it just answers that the server needs a passphrase to log in (which it does, but it's as if it doesn't see the private key I sent, so it won't let me log in).
      3. The Samba shares are reported as unreachable.
      The only fix I've found is to hard reset the machine. Everything boots up again and works as it should for some time, and then it's back to square one.

      Unfortunately I cannot investigate the problem any further with logs, because a hard reset basically erases any traces, or at least any traces I can think of.

      Has any of you experienced the same issue and know how to solve it? Or can you give me an idea of what to look for?


      Some specs:
      OS: Omv 4.2.22-1 Arrakis
      kernel: 4.19.0-0.bpo.5-amd64
    • Wek wrote:

      Or can you give me an idea of what to look for?
      I would start by monitoring the behavior. I don't get why you have nothing in the logs: are you using the flashmemory plugin? If yes, create an hourly cron job executing /sbin/folder2ram -syncall. You could run something like /usr/bin/iostat 600 >/root/iostat.log to get an idea of the overall system behavior (iostat monitors disk activity and overall system utilization; on x86 you need to install the sysstat package for it to work).
      No more contributions to this project until 'alternative facts' (AKA ignorance/stupidity) are gone
      No, I'm not using the flashmemory plugin. I just assumed the logs had been deleted, because the logs in the openmediavault web interface don't show anything strange. Everything seems fine, except that most of the logs, like syslog, are gone because I had to hard reset the machine to get back into it.

      I also looked at dmesg for errors. Nothing strange except some warnings about ACPI errors, caused by the BIOS's poor ACPI implementation, but those have been there since the beginning. There is also an error from the Emby Docker container saying "emby-daemon.service: Failed at step USER spawning /usr/bin/mono-sgen: No such process". Again, I don't know why it says that when the daemon starts, but everything works fine (once I've rebooted, of course).

      Syslog, dmesg and messages seem pretty normal, so I didn't know what to look for. I'm running the command you suggested, so let's see what happens. But I guess I have to set it up as a cron job, am I right? Otherwise it will be killed as soon as my shell is cut off by the server's strange behaviour.
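
      A cron job is not the only way to keep the logger alive after the SSH session drops. One alternative is to start it with nohup, so it ignores the hangup signal sent when the shell goes away. Below is a minimal sketch under that assumption; run_detached is a hypothetical helper name, not something that ships with OMV or sysstat.

```shell
#!/bin/sh
# Start a command detached from the current terminal so that it keeps
# running after the SSH session is cut off.
# run_detached is a hypothetical helper name used for illustration only.
run_detached() {
    log_file=$1
    shift
    # nohup makes the command ignore SIGHUP. stdin comes from /dev/null and
    # stdout/stderr are appended to the log file, so the job never touches
    # the terminal.
    nohup "$@" >>"$log_file" 2>&1 </dev/null &
    # Print the PID of the background job so it can be stopped later.
    echo $!
}

# Example (not executed here): the iostat call suggested above.
# run_detached /root/iostat.log /usr/bin/iostat 600
```

      The printed PID can be passed to kill once enough data has been collected.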
    • Anyway, I just did a test run of it, but I'm a bit confused by the suggestion. Here's the output of the command:

      Linux 4.19.0-0.bpo.5-amd64 (wai)   06/13/2019   _x86_64_   (4 CPU)

      avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                 1.94    2.20    1.43    5.64    0.00   88.78

      Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
      sda             105.77      1195.04       587.14    1173971     576788
      sdc               0.30        11.54         0.49      11337        486
      sde               0.35        14.40         1.03      14144       1014
      sdb               2.73        38.57         3.04      37889       2989
      sdd              34.48       283.68       189.21     278681     185870


      How is a hard disk/CPU monitor supposed to help me understand why, after some uptime, the services seem to be "cut off" from the LAN? What should I find in it? It just spits out statistics about reads/writes and CPU usage... am I missing something?
    • Wek wrote:

      am I missing something?
      Sure: the 600 in /usr/bin/iostat 600 >/root/iostat.log. It makes iostat print a new report every 600 seconds. If you then see %iowait above 50% and one device busy, you know that device is stuck in I/O. Or you see %user at 100% and know to search somewhere else.
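
      After a hard reset, that check can be run over the collected log instead of reading it by eye. The following is a rough sketch, assuming the log has the avg-cpu layout shown earlier in the thread. flag_iostat_log is a hypothetical helper name, and the 95% limit for %user is an assumed stand-in for "at 100%".

```shell
#!/bin/sh
# Scan an iostat log and print one line for every report whose %iowait
# or %user is above a threshold.
# flag_iostat_log is a hypothetical helper; the thresholds are assumptions.
flag_iostat_log() {
    log_file=$1
    iowait_max=${2:-50}   # "%iowait above 50%" from the post
    user_max=${3:-95}     # assumed stand-in for "%user at 100%"
    awk -v iowait_max="$iowait_max" -v user_max="$user_max" '
        # The first non-empty line after an "avg-cpu:" header holds the values
        # in the order %user %nice %system %iowait %steal %idle.
        /^[[:space:]]*avg-cpu:/ { want_values = 1; next }
        want_values && NF >= 6 {
            want_values = 0
            report++
            user = $1 + 0
            iowait = $4 + 0
            if (iowait > iowait_max)
                printf "report %d: high iowait %.2f%% (check the disks)\n", report, iowait
            else if (user > user_max)
                printf "report %d: high user CPU %.2f%% (check processes)\n", report, user
        }
    ' "$log_file"
}

# Example (not executed here):
# flag_iostat_log /root/iostat.log
```

      Note that the first report iostat prints covers the time since boot, so only the later reports reflect individual 600-second intervals.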