omv-engined segfault error 4 in php7.0

  • Hi OMV enthusiasts,


    I upgraded to OMV 4 some weeks ago (a fresh installation, since I came from OMV 2), and while ironing out the remaining quirks I'm still struggling with one issue. I noticed the problem when looking at the syslog:


    Code
    kernel: [340459.633054] omv-engined[16216]: segfault at 7faddedfd8c5 ip 00005568810133af sp 00007fff3c080c70 error 4 in php7.0[556880da6000+3aa000]
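
    For readers unfamiliar with this kind of kernel line: on x86, the number after "error" is the page-fault error code, printed in hex. The sketch below decodes its three lowest bits. The function name `decode_fault` is made up for illustration, and the bit meanings assume an x86/x86-64 kernel.

```python
import re

# x86 page-fault error-code bits (assumption: x86/x86-64 kernel).
# Each entry maps bit number -> (meaning if 0, meaning if 1).
FAULT_BITS = {
    0: ("page not present", "protection violation"),
    1: ("read access", "write access"),
    2: ("kernel mode", "user mode"),
}

def decode_fault(line):
    """Return (fault address, list of bit meanings) for a kernel segfault line."""
    m = re.search(r"segfault at ([0-9a-f]+) .* error ([0-9a-f]+)", line)
    if m is None:
        raise ValueError("not a segfault line")
    addr = int(m.group(1), 16)
    code = int(m.group(2), 16)  # the kernel prints the error code in hex
    meaning = [names[(code >> bit) & 1] for bit, names in FAULT_BITS.items()]
    return addr, meaning

line = ("omv-engined[16216]: segfault at 7faddedfd8c5 ip 00005568810133af "
        "sp 00007fff3c080c70 error 4 in php7.0[556880da6000+3aa000]")
addr, meaning = decode_fault(line)
print(hex(addr), meaning)
# prints: 0x7faddedfd8c5 ['page not present', 'read access', 'user mode']
```

    In other words, "error 4" means user-space code (here inside the php7.0 binary) tried to read an address that was not mapped.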


    I searched the forum and found several other people reporting the same message, e.g.
    VirtualBox causing omv-engined[10089]: segfault
    Omv-engined segfault at webui login
    But I didn't find a solution or any idea of how to track down the problem.


    I performed some tests and can confirm the observations in the above-mentioned forum threads: the issue is definitely connected to the VirtualBox plugin. In addition, I noticed that the syslog entry can be reproduced reliably by either of the following actions:

    • Logging into the OMV Web GUI
    • Clicking on the "Virtual Machines" tab in the Services/VirtualBox configuration page

    I tried to use strace to find the root cause of the segfault, but this is the point where I need some help from more experienced hackers. I attached strace to all running php-fpm instances and tried to correlate the output with the occurrence of the syslog message. However, I was not able to clearly identify any problematic system calls. So either I attached to the wrong processes or I'm not experienced enough with this kind of analysis (most likely both :rolleyes: ).
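
    One thing worth noting here: the kernel line names omv-engined, not php-fpm, as the faulting process, so attaching strace to the omv-engined processes may be more informative. A minimal sketch of how that could be done, assuming a Linux /proc filesystem; the helper names and the output file name are made up for illustration, and only the standard strace options -f, -tt, -o and -p are used:

```python
import os

def pids_by_comm(name, proc="/proc"):
    """Return the PIDs whose /proc/<pid>/comm equals `name`."""
    pids = []
    try:
        entries = os.listdir(proc)
    except OSError:
        return pids  # no /proc available (not Linux)
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(pids)

def strace_command(pids, outfile="omv-engined.strace"):
    """Build an strace command line that attaches to every PID in `pids`."""
    cmd = ["strace", "-f", "-tt", "-o", outfile]  # follow forks, timestamps
    for pid in pids:
        cmd += ["-p", str(pid)]
    return cmd

if __name__ == "__main__":
    # "omv-engined" is the process name shown in the kernel message.
    print(" ".join(strace_command(pids_by_comm("omv-engined"))))
```

    The printed command has to be run as root. With -f, child processes forked after attaching are traced as well.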


    For example, this is the entry in the syslog:


    Code
    Aug 10 17:43:49 hp-microserver kernel: [331988.455725] omv-engined[28852]: segfault at 7f1f797fd865 ip 0000561f514a33af sp 00007ffcfb84b4a0 error 4 in php7.0[561f51236000+3aa000]


    And this is what the strace output of the php-fpm processes looks like: strace.txt


    At timestamp 17:43:49.178916, I can see the system react to my click on the "Virtual Machines" tab in the Services/VirtualBox configuration page. But I'm not able to identify any failing system calls in this context.
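
    A segfault usually does not show up as a failing system call. In strace output, a traced process that crashes is marked by a signal line ("--- SIGSEGV {...} ---") followed by "+++ killed by SIGSEGV +++". The sketch below scans a trace file for those markers and prints a few lines of context; the function name and the context size are arbitrary choices for illustration. If it finds nothing, the traced processes were probably not the ones that crashed.

```python
import re
import sys
from collections import deque

# Markers strace prints when a traced process receives / dies from SIGSEGV.
SIGNAL_LINE = re.compile(r"(--- SIGSEGV \{.*\} ---|\+\+\+ killed by SIGSEGV)")

def find_segv(lines, context=5):
    """Yield (line_number, preceding_lines, line) for each SIGSEGV marker."""
    recent = deque(maxlen=context)  # sliding window of the previous lines
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if SIGNAL_LINE.search(line):
            yield number, list(recent), line
        recent.append(line)

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_segv.py strace.txt
    with open(sys.argv[1], errors="replace") as fh:
        for number, before, hit in find_segv(fh):
            print("\n".join(before))
            print(f"{number}: {hit}")
```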


    I would be very glad if someone with more experience could have a look at this trace. I am happy to run additional traces if necessary.


    In addition, I was wondering whether it would be an option to instrument the relevant PHP code to locate the origin of the issue. I assume that this file is involved in some way:
    /usr/share/openmediavault/engined/rpc/virtualbox.inc


    But I'm neither familiar with the OMV internals (omv-engined) nor PHP scripting. So help in this direction is also very welcome.


    Regards,


    André

    • Official post

    I only see that error once, when logging into the web interface for the first time. Because the plugin's tabs are loaded at that time, it is possible that the code does not have access to information it needs. If something wasn't working, I would be more concerned, but I just don't have time to track this down.


  • I have the same problem:


    segfault at 7f9fc582f205 ip 000055d7822ff3af sp 00007ffe6a3120b0 error 4 in php7.0[55d782092000+3aa000]
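
    A side note on the three segfault lines quoted in this thread: the absolute addresses differ on every system, but subtracting the mapping base (the first number in the brackets) from the instruction pointer gives the offset inside the php7.0 binary, and that offset can be compared across reports. A small sketch; the function name is made up, and reading equal offsets as "same crash site" assumes all systems run the same php7.0 build:

```python
import re

# ip <addr> ... in <module>[<base>+<size>], all numbers in hex
PATTERN = re.compile(r"ip ([0-9a-f]+) .* in ([^\[\s]+)\[([0-9a-f]+)\+([0-9a-f]+)\]")

def module_offset(line):
    """Return (module name, offset of the faulting instruction in the module)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip, base, size = (int(m.group(i), 16) for i in (1, 3, 4))
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("instruction pointer lies outside the mapping")
    return m.group(2), offset

# The three lines reported in this thread.
reports = [
    "segfault at 7faddedfd8c5 ip 00005568810133af sp 00007fff3c080c70 error 4 in php7.0[556880da6000+3aa000]",
    "segfault at 7f1f797fd865 ip 0000561f514a33af sp 00007ffcfb84b4a0 error 4 in php7.0[561f51236000+3aa000]",
    "segfault at 7f9fc582f205 ip 000055d7822ff3af sp 00007ffe6a3120b0 error 4 in php7.0[55d782092000+3aa000]",
]
for line in reports:
    name, offset = module_offset(line)
    print(name, hex(offset))
# prints "php7.0 0x26d3af" three times
```

    All three reports resolve to the same offset, which suggests that they hit the same instruction in php7.0 rather than failing at random locations.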


    After this error, the following shows up in the kernel log:


    [11720.161056] perf: interrupt took too long (2518 > 2500), lowering kernel.perf_event_max_sample_rate to 79250
    [17992.531497] perf: interrupt took too long (3157 > 3147), lowering kernel.perf_event_max_sample_rate to 63250
    [26021.303590] perf: interrupt took too long (3951 > 3946), lowering kernel.perf_event_max_sample_rate to 50500
    [50051.330942] perf: interrupt took too long (4940 > 4938), lowering kernel.perf_event_max_sample_rate to 40250
    [161518.448728] perf: interrupt took too long (6200 > 6175), lowering kernel.perf_event_max_sample_rate to 32250
    [267330.224079] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
    [267330.226232] ata2.00: failed command: FLUSH CACHE EXT
    [267330.228420] ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 26
    res 40/00:01:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
    [267330.232688] ata2.00: status: { DRDY }
    [267330.234603] ata2: hard resetting link
    [267330.547440] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
    [267330.564774] ata2.00: configured for UDMA/133
    [267330.564782] ata2.00: retrying FLUSH 0xea Emask 0x4
    [267330.564918] ata2: EH complete


    And the virtual machines are crashing.



    My motherboard is a Supermicro X10SDV-4C-TLN2F.

  • Some poking around led me to believe that this is a problem with systemd that is resolved with a later version.


    I don't know where the error was arising, but here's how I got a later version.


    Update (15 hours later): I am still seeing the segfault upon GUI login.

  • I'm not using a virtual machine, but I came across this error as well. I'm pretty sure that it's caused by a hard disk failure (at least in my case).
    I can hear my hard drive making grinding noises while it tries to write data.
