Random reboots on OMV 4

  • Hello again everybody. My OMV 4 installation runs on an Orange Pi PC2. Up to last night it was working, with random slowdowns every few hours (maybe 2-5 minutes of massive slowdown, then it started to move again). I hadn't updated the system for about 10 days; last night I updated, and now the system works a LOT faster, but I get random reboots every few hours (mostly every 2-4 hours). I'm not very tech savvy, at most an advanced user, so I don't know where to look or what to check to see what's wrong. I looked at /var/log/messages and I see a lot of lines like this:

    Code
    Jan 31 11:17:29 omv kernel: [ 1260.105731] BTRFS warning (device mmcblk0p2): csum failed root 306 ino 4407 off 2137825280 csum 0xdc2523f0 expected csum 0x1b9ce9
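
    To see how often such warnings recur and whether they always hit the same inode, the matching lines can be tallied. This is only a sketch, assuming a syslog-style file in the format quoted above; the function name csum_tally is made up for illustration.

```shell
# Tally BTRFS checksum failures per device and inode in a syslog-style file.
# Usage (path is an assumption): csum_tally /var/log/messages
csum_tally() {
    grep 'BTRFS warning' "$1" | grep 'csum failed' |
        # Reduce each line to "<device> ino <inode>".
        sed -E 's/.*\(device ([^)]+)\).* ino ([0-9]+) .*/\1 ino \2/' |
        # Count identical device/inode pairs, most frequent first.
        sort | uniq -c | sort -rn
}
```

    A single device/inode pair repeated many times would suggest one damaged file being read over and over, rather than damage spread across the card.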


    Is it really just a warning, or should I be worried about the SD card's health?
    This is the part where I think something happens that starts the restart, but I see nothing different compared with, for example, 15 minutes or half an hour before. The network device messages look like some Docker thing happening in the background, but they also repeat, so I guess they are not critical, and the BTRFS warning also seems non-critical since it repeats in several places. I might be wrong:




    Any ideas? Thanks in advance for any help.


    IMPORTANT NOTE: I'm not complaining at all, don't get me wrong. If anything, I want to discover what causes this to help the developers (if it's a bug) or somebody else in the future with the same problem (if it's a setup/hardware problem).
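
    One way to locate the moment of each reboot in the log is to watch the kernel uptime stamp in square brackets: it counts seconds since boot, so a sudden drop means the board restarted. The following is a rough sketch under that assumption; find_reboots is a hypothetical helper name, and small out-of-order stamps could produce false hits.

```shell
# Print the last kernel line logged before each reboot, detected by the
# bracketed kernel uptime stamp going down instead of up.
# Usage (path is an assumption): find_reboots /var/log/messages
find_reboots() {
    awk '
        match($0, /kernel: \[ *[0-9]+\.[0-9]+\]/) {
            stamp = substr($0, RSTART, RLENGTH)
            gsub(/[^0-9.]/, "", stamp)       # keep only the seconds value
            if (seen && stamp + 0 < prev) {
                print "reboot after: " last
            }
            prev = stamp + 0; last = $0; seen = 1
        }
    ' "$1"
}
```

    The lines just before each reported position are the ones worth posting, since they are the last thing the kernel managed to write.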

  • Also, I have this earlier in the log, but there were no reboots for another half hour after it, so I'm not sure if it's critical:


    I can upload the complete log somewhere, but since I recently updated the system and also corrected the timezone in the meantime, it's kind of messy. Tell me how you guys handle log uploads or which logs you need and I'll do it ASAP. Thanks in advance for any help.

  • Is it really just a warning, or should I be worried about the SD card's health?

    This is filesystem corruption, maybe related to a broken or counterfeit SD card. Anyway, the first thing is to see how severely your rootfs is corrupted:


    Code
    sudo btrfs scrub /

    If you see more than 10 entries here, you need to start over from scratch anyway, since a corrupted rootfs leads to all sorts of strange problems that only increase (kernel panics followed by reboots included; any further diagnosis is a waste of time once the rootfs is corrupted).


    I hope you carefully followed these steps (checking for counterfeit SD cards and only using Etcher to write the image)?

  • maybe some typo?



    I didn't do the counterfeit check since the package and everything about the card seemed legit, but I guess I'll do it anyway now. I did do a complete format and then wrote the image with Etcher, though. Only one question: does either of those apps overwrite the system?


    No flash plugin. I have the option to add it; I thought it came installed on SBC images by default, but apparently not...

    Edited 2 times, last by Trash_Can_Man, for the following reason: I was wrong, I don't have the flashmemory plugin; for some reason I thought it came by default on these images

  • maybe some typo?

    No, I just didn't remember it correctly, but the error output already pointed you in the right direction:

    Code
    btrfs scrub -B start /


    (the '-B' is explained in the manual page as well). If you create a separate user and log in as this user, you could also check the SD card on the OMV images for ARM in a running system with:


    Code
    armbianmonitor -c $HOME

  • Hmm, again? Does the / simply replace the device?


    After a nasty trip through Linux permissions hell (I created a user, but armbianmonitor asked for the home folder; I created the folder as root, but then the user had no permission to write to it. Anyway, I'll clean up the mess afterwards, lol), I got this after a long wait:



    So I guess I do have a fake one after all, or a very bad one (it's a Kingston C10 with no guaranteed speed marks on the blister; if it's not fake, it must be barely C10).
    But apparently it has no errors. I'm going to replace it anyway with some SanDisk 80 MB/s card; I'm not sure if I can get a Samsung EVO+ locally, probably not. For now I'm going to put this down to kernel panics caused by a slow card, I guess, and I'll report back in a few days once I've reinstalled everything from scratch on a new card. Thanks for the help :D

  • OK, that last one worked:


    Code
    root@omv:~# btrfs scrub start -B /
    scrub done for 6d0d4c98-6bbf-418c-9501-7667708a6fe7
            scrub started at Wed Jan 31 16:38:51 2018 and finished after 00:03:48
            total bytes scrubbed: 2.21GiB with 1 errors
            error details: csum=1
            corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
    ERROR: there are uncorrectable errors

    Anyway, I'm going to replace the card, but I wanted to post the result just in case someone else wanders around here in the future. Again, thanks for all the help.
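
    For anyone who wants to check this result from a script instead of reading it by eye, the summary line can be picked out of the scrub output. This is a sketch that assumes the output format shown in this post; other btrfs-progs versions may word the summary differently, and scrub_uncorrectable is a made-up helper name.

```shell
# Read the output of a foreground scrub on stdin and print the number of
# uncorrectable errors (prints nothing if no such summary line is found).
# Usage: btrfs scrub start -B / | scrub_uncorrectable
scrub_uncorrectable() {
    sed -n -E 's/.*uncorrectable errors: ([0-9]+).*/\1/p'
}
```

    Any value above 0 means at least one block failed its checksum and could not be repaired from a second copy.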

  • I'm back to report that changing the SD card did solve the problem from my original post: it's been 6 solid hours running overnight. I need to keep an eye on it just to be sure, but apparently it's solved.

    But I have a question now. I was unable to install Docker on 3.x (it gave some error about no arm64 version of Docker being available in the repos, and I really need Docker), so I went to 4.x once again. First I tried to upgrade from the 3.x image and got weird repo errors after running omv-release-upgrade. Using the UI, I installed a few packages at a time until everything was installed, so I have the feeling the problem is some weird package ordering issue. Anyway, it felt like a dirty installation, so I started all over again using the armbian-config-softy-omv method (burn Armbian server, run armbian-config). The script is kind of weird but runs without errors; the install went smoothly, I installed the Docker plugin and my Docker containers, and everything was fantastic. I have only one issue with this method: for some reason it doesn't install zram, and it left me with a 128 MB swap partition. I'm not sure if you guys set the script up that way now or not (and IMHO these toy boards, as tkaiser calls them, need zram), but I did some Google-fu and did this:


    Code
    wget https://mirrors.kernel.org/ubuntu/pool/universe/z/zram-config/zram-config_0.5_all.deb
    dpkg -i zram-config_0.5_all.deb

    I restarted twice and now apparently have double swap: the normal 128 MB (with priority -5) and zram (with priority 0). I have no problem with having both; maybe it would help in extreme situations (and yes, after 6 hours swapon reports 0 MB used on the swap partition and 90 MB on zram (22 MB x 4)). My question is: do I need to set up zram in any particular way, or is leaving it this way the most efficient way to use it? Does it have the best compression method set by default? And another one: after 6 hours the UI reports 65% memory usage; is it safe to push it to about 80% by adding another Docker container?
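
    Which compression method a zram device currently uses can be read from sysfs, where the kernel lists the available algorithms and marks the active one with square brackets. Below is a small sketch; the device name zram0 is an assumption (there may be one device per CPU core), and active_comp_algorithm is a hypothetical helper name.

```shell
# Print the active algorithm from the contents of a zram comp_algorithm
# file, where the one in use is wrapped in square brackets.
# Usage: active_comp_algorithm < /sys/block/zram0/comp_algorithm
active_comp_algorithm() {
    sed -n -E 's/.*\[([^]]+)\].*/\1/p'
}
```

    The devices in use, their sizes and their priorities are listed in /proc/swaps.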


    News: I just got a massive slowdown. I'm letting it sit to see if it's temporary or permanent; reporting back in a few minutes.


    Follow-up: the base OS apparently kept running all night, as uptime reports 6:50, but there are massive slowdowns and 2 of my Docker containers stopped. I'm not sure if the problem is in the OS, in the Docker engine, or in one of my containers. /var/log/messages doesn't show anything worth mentioning around the time of the last slowdown, though I keep getting those weird macvlan <info> lines, and there are no more BTRFS csum warnings. I was opening the OMV web UI when this slowdown happened (I had logged in a second before and the dashboard screen was loading), and I think I was browsing or doing stuff in the UI the last few times I had problems, so I wonder if I'm seeing some super random bug with the UI. Also, since it's a slowdown: can the UI cause a memory spike big enough to put the board in swap hell for a couple of minutes? It would have to be a very big spike, since the board in my setup stays at about 60% memory used...

  • So I left htop open in a terminal window to watch now and then while I work and do other stuff, and I caught the exact moment a massive slowdown happened (why in hell does every weird problem happen to me? Makes me feel like Bad Luck Brian). The culprit seems to be SickRage: at the exact moment the problem started, SickRage's memory usage peaked at crazy levels and put the board in some form of swap hell (though it still never touches disk swap). I'm going to let it run a few hours without SickRage to see if something else triggers it. I also found this, which appears to be my problem. I'll report back in a few hours. Sorry for wasting everyone's time; my luck with this board really stinks, apparently. Maybe it's destiny telling me to buy an ODROID XU4, lol.
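
    Watching htop by hand works, but the same check can be left running unattended. Here is a sketch, assuming the procps version of ps; top_rss is a made-up helper name, and process names containing spaces are cut at the first word.

```shell
# Read "RSS_in_KiB COMMAND" lines on stdin and print the process with the
# largest resident memory, converted to MiB (rounded down).
# Usage: ps -eo rss=,comm= | top_rss
top_rss() {
    sort -rn | head -n 1 |
        awk '{ printf "%s %d MiB\n", $2, $1 / 1024 }'
}
```

    Run in a loop with a sleep and appended to a file together with the output of date, this leaves a record of which process was largest when a slowdown hit. If one container turns out to be the culprit, Docker's --memory option on docker run can put a cap on it.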

  • Maybe it's destiny telling me to buy an ODROID XU4

    How should exchanging hardware fix software (whatever SickRage is)? If you're running low on memory, then switching to a device with more DRAM is a good idea if the software cannot be fixed. But then I would rather look for boards that allow a bit more than just 2 GB (e.g. ROCK64).

  • It was just a joke, but yeah: the OPi PC2 has 1 GB and the XU4 has 2 GB, so my joke does apply in this context. Also, if I understand the Armbian lists correctly, the XU4 is in a stable state while the ROCK64 is a work in progress. Maybe I'm wrong, but doesn't "work in progress" mean that it still has nasty bugs?


    PS: I've been looking at reviews, the hardware, distros, and even checked local retailers (offering the 4 GB version, of course), and I'm REALLY liking the idea. In all honesty, the ROCK64 is even cheaper than the XU4. How stable is OMV on this board? Any nasty bugs (or any at all) I should know of? What about Docker? Does it work on it?
    PS2: Answering my own question: Docker GUI plugin for armhf on OMV 3. And @tkaiser, dude, you are a genius, seriously tyvvvvm. Now I need to get myself a ROCK64; it seems super solid for what I need...
