Posts by tiranor

    Maybe there's a configuration error in the containers?

    If you post the configuration, someone might be able to help.

    Thanks for your answer. Weirdly enough, I didn't get a notification for it.


    I troubleshot this with Copilot365 (really good at reading thousands of lines of logs, searching the internet, and giving me links related to my errors). The most probable culprit seems to be the iGPU split I use to get multiple vGPUs that I can pass through to my VMs, and the drivers it requires:

    1. this setup requires compiling the drivers with DKMS
    2. instability can occur when the Proxmox VE kernel and the VM kernel are too far apart => I now use the Proxmox kernel for OMV, installed with omvextras
    3. the GPU can be split into up to 7 vGPUs, so guides usually have us create all 7, but for a fixed total load, hangs seem more likely with more vGPUs => I reduced it to 3, which is enough
    4. after testing, I didn't have to restore the OMV VM; I just had to restart it (usually multiple times)


    Still, with 2. and 3., those hangs are (for the moment) a thing of the past.
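
    For point 2, a quick way to see how far apart the two kernels are is to compare the output of uname -r on the Proxmox host and inside the OMV VM. The sketch below is only an illustration: the helper names kernel_series and same_series are made up, the release strings in the comments are example values, and treating "same major.minor series" as "close enough" is my own assumption, not a documented rule.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Proxmox or OMV).

# Print the "major.minor" part of a kernel release string,
# e.g. "6.8.12-4-pve" -> "6.8".
kernel_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed when two release strings share the same major.minor series.
same_series() {
    [ "$(kernel_series "$1")" = "$(kernel_series "$2")" ]
}

# Usage: pass the host's `uname -r` output as the first argument;
# it is compared with the kernel running where this script executes.
if [ -n "${1:-}" ]; then
    if same_series "$1" "$(uname -r)"; then
        echo "same kernel series: $(kernel_series "$1")"
    else
        echo "different series: host $(kernel_series "$1"), here $(kernel_series "$(uname -r)")"
    fi
fi
```

    Running dkms status inside the VM afterwards shows whether the driver module was actually built for the kernel that is booted.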

    I just rediscovered my old post. A lot has happened since then:

    • to reduce wear on the four 10TB HDDs, I used the eSATA port for a 1TB SSD, where I stored all my containers and all the data that was read and written often.
    • at some point, I went from 2x4GB to 2x8GB of RAM


    Since last year, it had been getting harder and harder to keep it working properly:

    • I used GPU acceleration in some containers, but if I really used the GPU (instead of the CPU), the server crashed after about 1 day, and I couldn't even restart it with the connected plug (I had to physically reset it with the power button, otherwise it hung on restart)
    • sometimes it just crashed, and I had the same issue on reboot, so I wasn't confident at all when I went on vacation for several weeks.


    I eyed all those Chinese NUCs and N100/N305 motherboards, and this year I finally bought a used OptiPlex 5000 Micro with the 12500T CPU, 32GB of RAM, and a 500GB NVMe drive, for 250€. I reused my old Acer case, with an ASM1166 NVMe->6 SATA card.


    "Traumatized" by the crashes my OMV had, I virtualized it on Proxmox, and I haven't looked back. The CPU is miles ahead, and now with GPU acceleration, I even managed to add Frigate and AI detection for my local IP cams.

    I still get the occasional GPU crash (oddly, it only hangs my GPU-accelerated containers), which weirdly enough requires me to restore my latest VM backup, otherwise /dev/dri no longer exists at startup.

    I'm sure those crashes now come from the GPU split and passthrough from Proxmox to OMV, and from the locally built drivers needed to make it work (I even had to stop the OMV Linux kernel from updating, as the new kernel didn't recognize the driver and I couldn't even rebuild the driver for it), but I feel a lot more secure with it running virtualized instead of on bare metal (even though that feeling is totally unfounded).
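
    Since the visible symptom after a GPU crash is that /dev/dri is missing at startup, a small check like the one below could be run in the VM after boot, before the GPU-accelerated containers start. It's a sketch under assumptions: has_render_node is a made-up helper name, and it only checks that a renderD* entry exists, not that the driver actually works.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory (default /dev/dri)
# contains at least one render node (renderD128, renderD129, ...).
has_render_node() {
    dir="${1:-/dev/dri}"
    for node in "$dir"/renderD*; do
        # If the glob matched nothing, $node is the literal pattern
        # and this existence test fails.
        [ -e "$node" ] && return 0
    done
    return 1
}

if has_render_node; then
    echo "render node present"
else
    echo "no render node under /dev/dri: the vGPU driver is probably not loaded"
fi
```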

    I doubt tmpfs is using 4 GB of page cache. And some things won't give up their page cache easily. Based on your usage in that graph, I would say you need more RAM and/or fewer containers.

    Thank you for the answer.

    When I first searched, everything I read said that the cache can be flushed easily, but in my case it isn't, which is why I was wondering whether my use of the flashmemory plugin changes that.

    As you say, it doesn't, and I guess I have to look harder at my container management.

    The flashmemory plugin puts things in tmpfs, which does not take precedence in RAM allocation. tmpfs will gladly give up memory and use swap if the RAM is needed. Without seeing what is using RAM on your system, it is hard to say why it is increasing.

    Thx for the reply. The RAM increase is due to some containers, one of them in particular.

    That isn't what bugs me. What bugs me is that my page cache stays at 4+ GB, and that I get warnings and even crashes even though I still have gigabytes of cache that could be flushed.


    To illustrate: the picture I pasted doesn't look dramatic, but the dashboard was telling me I was at 93% usage, well past the 90% warning threshold.
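
    For anyone comparing these numbers: the kernel exposes them in /proc/meminfo. There, Cached includes tmpfs pages, which are also counted separately as Shmem and cannot simply be dropped like ordinary file cache (they can only go to swap), while MemAvailable is the kernel's own estimate of what can be handed to applications without swapping. Below is a minimal sketch that prints these side by side. mem_report is a made-up name, and I don't know which formula the OMV dashboard uses, so the percentage here is simply "total minus available".

```shell
#!/bin/sh
# Hypothetical helper: summarise a meminfo-style file (default /proc/meminfo).
# All values in that file are in kB.
mem_report() {
    awk '
        /^MemTotal:/     { total  = $2 }
        /^MemAvailable:/ { avail  = $2 }
        /^Cached:/       { cached = $2 }
        /^Shmem:/        { shmem  = $2 }
        END {
            if (total == 0) { print "MemTotal not found"; exit 1 }
            # Share of RAM that is NOT available to new allocations.
            printf "used_pct=%d\n", (total - avail) * 100 / total
            # Page cache as reported, and the tmpfs/shared part of it.
            printf "cached_mib=%d\n", cached / 1024
            printf "shmem_mib=%d\n", shmem / 1024
        }
    ' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    mem_report
fi
```

    If shmem_mib accounts for a large part of cached_mib, that part of the "cache" is tmpfs content rather than cache that can be flushed on demand.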

    Hello there, I have another issue on the same RAM usage topic.


    I have 16GB of RAM and a bunch of containers that take their share of it, so my RAM usage is usually around 8-10GB.


    The issue is that at around 9-9.5GB used, I get the 90% usage warning, and I get spammed with it.

    When it climbs higher (11GB for example), the dashboard tells me I'm using 95+%, and usually one of my containers crashes, even though I still have more than 3GB of cache remaining.


    Is it because the flashmemory plugin takes precedence over other RAM allocation?

    I have the feeling that the root file system or the device is dying because of the very strange behaviour that files disappear or seem to be corrupted.

    Thx for the input. Unfortunately for me, it's a very likely explanation: the USB stick used for the system is getting old now. I guess I'll have to replace the stick and do either a clean install or a clone of the existing one, which I would then check and repair.

    Make sure the latest package of openmediavault-keyring is installed.


    Bash
    # apt-get install --reinstall --allow-unauthenticated openmediavault-keyring

    Then try to reinstall openmediavault


    Bash
    # apt-get install --reinstall openmediavault

    Thank you for the quick answer, but I get the same error as with apt-get update:

    Code
    : apt-get install --reinstall --allow-unauthenticated openmediavault-keyring
    Reading package lists... Error!
    E: Could not read from /var/lib/apt/lists/packages.openmediavault.org_public_dists_shaitan_InRelease - getline (5: Input/output error)
    E: The package lists or status file could not be parsed or opened.

    But then I remembered the nice command "omv-aptclean", which resolved the repository issue.


    Now the errors are different when I update or reinstall OMV:

    Hello there,


    I just noticed this morning that when I try to connect to the GUI, I reach the login page successfully, but after I enter my credentials, it keeps loading until I get a 504 Gateway Timeout error.


    I looked at my syslog, which is full of entries (20 lines per minute on average) from my virtual networks, but there is nothing about the GUI.

    I then looked at engined:

    I then tried updating the packages to see whether that would resolve the issue, but I get another error from "apt-get update":


    What could be wrong with the shaitan repo and my GUI ?

    You solve one problem, you create two more.. :)


    Story of life.

    So true...

    I wanted to be proactive and work out what I would use as a replacement (and maybe replace it early, as the SSD connector is quite twitchy), but after many hours lost looking for the impossible/unfindable, I now just tell myself I'll look into it when the day comes.

    I love the 304. My main issue is it only takes ITX boards... which meant only 1 PCIE slot. Unfortunately the board I had in my 304, had a very wonky onboard NIC, so I installed an Intel NIC to resolve that issue. Then I ran out of sata ports and at that point I was stuck. This was no fault of the chassis of course... I could have bought another board with more sata ports and an onboard intel NIC. With most newer boards having mini-pcie slots and it being pretty easy to find a card for them that will hold 2-3 sata ports, that would have been an option. If I'd had a board with 6 Sata ports, I would have been OK.


    Beyond that, fantastic case. I'd probably just go with an 804 however, so I could use a MicroATX board.

    Yeah, I have a microserver (Acer) with an ITX motherboard, and if I have to replace it, it'll be mATX, but those cases are usually too big for the cabinet I use.

    you have a procedure here https://wiki.omv-extras.org/do…openmediavault-compose_67

    and a guide here Guide: Using the new docker plugin

    that explain what you need. Have you read them?

    I looked at the thread. I didn't read those specific pages (which are easier to read), but the information is the same.


    What I wrote in my previous post is what I understood of the update; I just wanted to confirm I understood correctly.

    Please correct me if I'm wrong:

    1- I updated this morning, and all my containers still run smoothly

    2- I installed the docker compose plugin

    3- In there, I see the docker path is the same as I had before (that's why everything still runs)

    4- All the containers show up correctly in the stats tab

    5- If I want to remove Portainer, I have to redeploy all the stacks and containers using the plugin

    6- If I don't do that, I still need to use Portainer to manage everything, or the usual CLI commands.


    Thx in advance.

    It will preserve the configuration of OMV.

    No idea if it will fix your issue, though.

    You were right, it didn't fix anything, as it only reinstalls the openmediavault package (I should have known...)


    Anyway, the issue with top is "unknown terminal type" (both over SSH with xterm and in the GUI with dumb)


    Code
    Failed to execute command 'export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin; export LANG=C.UTF-8; export LANGUAGE=; top -b -n 1 2>&1' with exit code '1': 'dumb': unknown terminal type.

    or


    Code
    'xterm': unknown terminal type.


    I looked around: the various folders that should contain the terminal types are all empty... (I don't know what they would normally contain in a standard OMV installation)
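
    For reference, "unknown terminal type" means the ncurses library could not find a terminfo entry for the value of TERM. On a Debian-based system these entries normally live in /lib/terminfo and /usr/share/terminfo, in a subdirectory named after the first letter of the terminal type (for example x/xterm, d/dumb). Here is a small sketch to check for them. has_terminfo is a made-up helper name, and the default directory list is an assumption about the standard layout.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a terminfo entry for terminal type $1
# exists in one of the directories given as remaining arguments
# (default: the usual Debian locations).
has_terminfo() {
    term="$1"
    shift
    [ "$#" -gt 0 ] || set -- /lib/terminfo /usr/share/terminfo /etc/terminfo
    # Entries are stored under a directory named after their first letter.
    first=$(printf '%s' "$term" | cut -c1)
    for dir in "$@"; do
        [ -e "$dir/$first/$term" ] && return 0
    done
    return 1
}

for t in dumb xterm; do
    if has_terminfo "$t"; then
        echo "$t: terminfo entry found"
    else
        echo "$t: terminfo entry missing"
    fi
done
```

    If the entries are missing, reinstalling the package that ships them (ncurses-base on Debian, as far as I know) with apt-get install --reinstall would be the first thing to try.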

    I come back with news.


    It seems the repair broke more things:

    - the top command is unavailable (xterm issue)

    - omv-firstaid doesn't show up (I guess xterm is the issue there too)


    It feels like I would need a reinstallation to get everything back (I have many old system backups, but my OMV6 is an upgrade from OMV5, so I'm not sure it's really clean)


    I'm wondering if

    Code
    apt-get install --reinstall openmediavault

    would be able to "clean" the installation without my having to reconfigure everything (lots of shares, symlinks (mainly to keep all docker containers on an SSD instead of the system's USB drive), packages added via apt, etc.)

    I managed to solve it.

    When I discovered that /usr/lib/apt didn't exist, I guessed the fsck repair had deleted it.


    I decided to reinstall the apt package completely by downloading it from the main Debian server and installing it with dpkg.


    After that, omv-aptclean worked like a charm.
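
    For anyone in the same situation, the manual reinstall can be sketched roughly as below. Assumptions to check before using it: the URL pattern is the Debian package pool on deb.debian.org, the version 2.2.4 is only an example and must match your Debian release, apt_deb_url and print_reinstall_steps are made-up helper names, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helpers; nothing here is executed against the system.

# Build the pool URL of the apt .deb for a given version and architecture.
apt_deb_url() {
    printf 'https://deb.debian.org/debian/pool/main/a/apt/apt_%s_%s.deb\n' "$1" "$2"
}

# Print (do not run) the manual reinstall steps.
print_reinstall_steps() {
    url=$(apt_deb_url "$1" "$2")
    echo "wget $url"
    # ${url##*/} strips everything up to the last slash, leaving the file name.
    echo "dpkg -i ${url##*/}"
    echo "omv-aptclean"
}

# Example values only: adjust the version and architecture to your system.
print_reinstall_steps 2.2.4 amd64
```

    Printing the steps first makes it easy to check the file name against what the mirror actually offers before touching dpkg.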

    Hello everyone,


    A couple of weeks ago, all my server services were running smoothly, but I couldn't access the GUI (it kept returning to the login screen).

    Thinking it was a bug caused by an earlier RAM issue (usage briefly hit 94% of 8GB), I rebooted the server, and it got stuck.


    After plugging in a screen, I discovered it was stuck at boot, asking me to run an fsck repair on my boot partition. I did, and the server rebooted fine and has been running fine since.


    BUT, since then, the CRON APT job gives me an error, and when I try to update manually, I get the following error (this is with omv-aptclean):

    I can't reinstall those packages, because I get the same error: apt can't read any repository.


    Any idea how to reinstall apt-transport-http (and https)?


    Thx in advance.