HELP: My web GUI, SMB mounts and SSH all suddenly stop working, while Docker containers keep working fine. After a restart everything works again. What could be the issue? (Logs attached)

  • I am using a USB stick with the Flashmemory plugin, and under SMART the status of the USB stick shows up as unknown.

  • I have moved the Docker directory to a different disk. Docker now reports:
    Docker Root Dir: /srv/dev-disk-by-uuid-ofotherdisk/docker
    I also tested the OS USB stick for bad blocks with badblocks -v /dev/sdc2:


    Code
    Checking blocks 0 to 13489151
    Checking for bad blocks (read-only test): done
    Pass completed, 0 bad blocks found. (0/0/0 errors)


    So yes, all the Docker containers keep working throughout. Only the OMV web GUI, the SMB shares and SSH stop working, and they come back normally after I manually reboot.

  • Other than a failing USB stick or a failing memory DIMM, I don't see any other possibility (you didn't give your specs, so I am just throwing darts here).

  • Ahh, I see! So does that mean the badblocks check can't tell whether my USB stick is failing?

    As for specs, I have 8 GB of RAM, a 16 GB USB stick with the Flashmemory plugin, and a 4th-gen i7 processor. For data I have a separate 2 TB HDD, which is where Docker lives.

    And if my USB stick were failing, wouldn't it affect Docker as well? The Docker data is stored on the other HDD, but as far as I know /var/run/docker.sock still runs from the USB stick (OS drive). That should affect Portainer, yet it works fine while the web GUI, SMB and SSH are down.

    Thank you so much for the replies.

  • "/var/run/docker.sock still runs from USB Stick (OS Drive)"

    The system runs mostly from memory. The OS drive only holds the files and folders until they are loaded into memory on boot/reboot.

    Of course, once modifications are made on the system, they are written to the drive.
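    If you want to see this for yourself, here is a small sketch that reports which filesystem type backs a given path. The helper name fs_type_of is made up for illustration. On a typical Debian-based install I would expect /var/run to point at /run, which is a tmpfs held in RAM, but please treat that as an assumption and check it on your own box.

```shell
# Print the filesystem type that backs a given path.
# fs_type_of is a hypothetical helper name, not part of OMV or Docker.
fs_type_of() {
    # -n: no header, -o FSTYPE: only the type column, -T: resolve the path to its mount
    findmnt -n -o FSTYPE -T "$1"
}

# Example (run on the affected machine):
#   fs_type_of /var/run/docker.sock
#   fs_type_of /
```

    If the first call prints tmpfs, the socket is not stored on the USB stick, which would fit with Portainer staying up while the OS drive throws errors.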


    The fact that you are constantly seeing errors on the OS drive:

    Code
    [179447.175067] EXT4-fs error (device sdc2)

    means something is wrong with it.


    What looks odd is that the log says _ext4_find_entry:1682: inode #2: comm (ntainerd):

    I would expect it to show (containerd)


    Perhaps this is a sign that parts of your FS are corrupted (just wondering).
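    To get a feel for how often these errors occur and on which device, something like the following could help. It is only a sketch: count_ext4_errors is a name I made up, and it just counts lines matching the "EXT4-fs error (device ...)" pattern from the log excerpt above.

```shell
# Count "EXT4-fs error" lines per device from kernel log text read on stdin.
# count_ext4_errors is a hypothetical helper name used for illustration only.
count_ext4_errors() {
    # -o keeps only the matching part, so identical devices collapse under uniq -c
    grep -o 'EXT4-fs error (device [^)]*)' | sort | uniq -c
}

# Typical use, as root:
#   dmesg | count_ext4_errors
```

    If only sdc2 shows up and the count keeps growing between reboots, that points at the OS stick rather than the data disk.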

  • "Is there any way to fix it just wondering."

    You can try to fsck the FS. Since it's the OS drive, you can't do it live.

    To force it on next boot, do as root:

    touch /forcefsck


    Reboot and the system will run fsck.

    See if it corrects itself.
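    After the reboot you can check what the filesystem thinks of itself. A rough sketch, assuming the OS partition is still /dev/sdc2 as in your badblocks run (the device name can change between boots); fs_state is a made-up helper that pulls the "Filesystem state" line out of tune2fs -l output.

```shell
# Extract the "Filesystem state" value from `tune2fs -l` output read on stdin.
# fs_state is a hypothetical helper name used for illustration only.
fs_state() {
    awk -F': *' '/^Filesystem state/ { print $2 }'
}

# Typical use, as root (device name is an assumption, adjust to your system):
#   tune2fs -l /dev/sdc2 | fs_state
```

    A result of "clean" is what you want to see. Anything mentioning errors means the check did not fully fix things.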


    Or maybe you can use omv-regen to move the system to a new USB drive.

    Cloning the old one will probably carry over any errors from it.

  • Great! Thank you for all the help, it means a lot. I tried fsck earlier, but it didn't work because it is the OS drive.

    I wasn't aware of touch /forcefsck. I will do it right away. Thank you so much.

    I will also look further into omv-regen.
