Boot from S3 state failing

  • Hi forum!


    It's been a while since I posted something, in part because my OMV has been working flawlessly. However, I am having some issues currently with rtcwake.


    My server goes into S3 suspend daily via a cron job using rtcwake, because I am away from home the whole day and keeping it running costs me about 120€ of electricity a year. Suspending it during the day cuts the running costs to ~30€/year. However, since I moved the OMV installation to a new rig I have been having issues. The system suspends as expected, but it randomly fails to wake up properly. The system wakes up, but it does not boot back online. The system becomes completely unresponsive and I have to pull the plug of the server to hard-reset it.


    The first thing I checked was the logs, but there is nothing in there. I can clearly see when the system goes into suspension, but there is nothing else after that. Therefore, I have some questions to try to narrow the problem:


    1. I have only 2 GB of RAM and no swap (I am using an SSD as the OS drive and have the flash memory plugin installed). RAM usage is usually around 30% and the OMV installation size is ~12.5 GB. Could it be that I do not have enough RAM to suspend?


    2. I moved the OS drive from my old rig to the new one. The problem was not present on the old one. The main difference is that on the old one I had a swap partition (I was not using the flash memory plugin). Could the lack of swap be the source of my problem?


    Thanks for the help!

    Custom mini-ITX build
    Coolcube Mini, Intel Desktop Board DQ77KB, Intel Core i7-3770S, 8 GB DDR3 RAM, 64 GB Transcend mSATA SSD (OS), 3x 1TB HDD pooled + parity

    Dell Optiplex 960 sff (deprecated) - link


    Dell Optiplex FX160 (repurposed) - link


    "If you can't find it in Google, it simply doesn't exist!" - The Internetz


  • I notice the problem when I come back home and try to use Plex. Then I check other plugins and none work. I am guessing that the problem happens when the system comes back online (or tries to), because the logs do not register any system startup or wake-up. The cron job wakes the server at 4 pm every day, but when the failure happens nothing is registered in the logs, leading me to think that the server wakes up but somehow OMV fails to boot.


    I've not connected a monitor because I can't reproduce the error myself, it happens randomly. I know that none of the OMV plugins or services run (everything from nginx to Plex stops working). On top of that, the power button does not respond either.





  • Just out of curiosity, have you tried putting OMV to sleep and waking it up via a different mechanism, just to rule out a problem with the motherboard?


    I'm using the Autoshutdown & wakealarm plugins on my setup, and they work perfectly...


    I'm a little confused by these comments:

    The system wakes up, but it does not boot back online. The system becomes completely unresponsive and I have to pull the plug of the server to hard-reset it.


    Then I check other plugins and none work


    I've not connected a monitor because I can't reproduce the error myself, it happens randomly


    So, are you saying that the web interface is working, but the rest of the NAS is dead?


    That sounds more like an issue with the HDDs not spinning back up... perhaps a test with something like hdparm -C /dev/sdX??

  • @subzero79


    I am using the backports kernel (3.16, I think). I will try rolling back to the standard one.


    @BitHoarder
    No, the interface is also dead. Nothing responds: SSH is down, as are nginx (and therefore the web interface), Plex, Samba, Owncloud, etc. Nothing runs when the problem happens. The system wakes up from suspend (the computer is on and running) but the OS does not come back.


    I re-enabled the swap and set the system to hibernate instead. I will see whether the failure still happens. After work I will also check the kernel and roll it back to the standard one.





    • Official post

    I would still recommend leaving a monitor plugged in. You're fighting blind here... unless you have an IPMI or vPro console, you'll never know what's going on.


    Also consider that if you're using flash memory, the stick may be dying...


    With the flashmemory plugin enabled, a hard reset means you will never see any backtrace in the logs that would indicate a possible kernel hang, because the system is not shutting down properly and the logs are never copied to disk. Consider disabling the plugin, and also stats monitoring, for a day so that the logs are kept in case of a failure.
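
    A minimal sketch of one way to keep a trace that survives a hard reset: wrap the rtcwake call and write a marker to persistent storage before suspending and after resuming. The names suspend_with_trace and log_event, the RTCWAKE override and the log path are assumptions made for illustration; none of them come from OMV or the flashmemory plugin. If the "resumed" line is missing after a failed wake-up, userspace never came back.

```shell
#!/bin/sh
# Hypothetical wrapper around rtcwake. The log path is an assumption:
# pick any location that lives on persistent storage, not on a tmpfs.
LOG_FILE="${LOG_FILE:-/var/local/suspend-trace.log}"

log_event() {
    # Append a timestamped line and flush it to disk right away.
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" >> "$LOG_FILE"
    sync
}

suspend_with_trace() {
    # $1 = rtcwake mode (mem, disk, ...), $2 = seconds until wake-up.
    # RTCWAKE can be overridden (e.g. RTCWAKE=true) to try the wrapper
    # without actually suspending the machine.
    log_event "suspending: mode=$1 seconds=$2"
    ${RTCWAKE:-rtcwake} -m "$1" -s "$2"
    log_event "resumed: rtcwake exit status $?"
}
```

    With a call such as suspend_with_trace mem 28800 appended at the end, the script could stand in for the bare rtcwake command in the cron job.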

  • S3 keeps feeding power to the RAM slots while everything else is off. Swap would be a problem in S4.
    Is the unresponsiveness after coming back online? Or have you actually connected a monitor to find out that the server doesn't work?


    I don't mean to hijack this thread.
    I think I have a similar issue...
    I just migrated to an SSD and enabled the flashmemory plugin, and my eth0 hangs when rtcwake shuts the system down.
    I didn't have this issue often with my old HDD.
    I notice that there's no swap partition on the SSD.
    I'm using S4 (suspend to disk).
    If what you said is true, then how can I use the plugin with an SSD?
    Use a different state or disable the flashmemory plugin?


    If I enable the swap in /etc/fstab, would it defeat the purpose of the plugin?


    OMV v5.0
    Asus Z97-A/3.1; i3-4370
    32GB RAM Corsair Vengeance Pro


  • @tinh_x7


    I have to ask: have you explicitly configured your system to hibernate (suspend to disk) without a swap partition? In my case, I had to re-enable the swap in order to be able to hibernate. I know that it is possible to reach S4 without a swap partition, but doing so requires another procedure. In your case the issue might be a hardware limitation, because AFAIK rtcwake -m off (S5, power off) is not officially supported by ACPI and not all motherboards support it (mine does not, for example). Are you sure that rtcwake shutdown works? From what I've tested on my rigs and from the rtcwake documentation, shutdown is not a recognized mode. Does eth0 hang only on shutdown, or also when going into S3 or S4?


    @subzero79


    The system is installed on an SSD and I installed the flash memory plugin to reduce the number of writes to it. I doubt that the SSD is damaged: I checked the SMART values and everything seems OK. As you recommended, I disabled the plugin and re-enabled the swap partition. I changed the suspend mode from S3 to S4 (hibernate) to see if that helps.


    As tinh_x7 said, does enabling the swap defeat the purpose of the flash memory plugin?
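
    Before relying on rtcwake -m disk, it may help to confirm two things discussed above: that the kernel offers suspend-to-disk at all, and that some swap is active. The sketch below reads /sys/power/state and /proc/meminfo. The helper names swap_total_kb and sleep_state_supported are made up for this example, and both take an optional file argument so they can be tried on sample data.

```shell
#!/bin/sh

swap_total_kb() {
    # Print SwapTotal in kB from a meminfo-style file (default: /proc/meminfo).
    awk '/^SwapTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

sleep_state_supported() {
    # $1 = state name as used by rtcwake -m (mem = S3, disk = S4)
    # $2 = state file (default: /sys/power/state)
    grep -qw "$1" "${2:-/sys/power/state}"
}

if sleep_state_supported disk 2>/dev/null; then
    echo "kernel lists disk in /sys/power/state"
else
    echo "kernel does not list disk in /sys/power/state"
fi

swap_kb=$(swap_total_kb 2>/dev/null)
if [ "${swap_kb:-0}" -gt 0 ]; then
    echo "active swap: ${swap_kb} kB"
else
    echo "no active swap"
fi
```

    This only shows whether swap exists, not whether it is large enough for the hibernation image.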



  • I'm going to enable the swap partition tonight to see if it will shut down.
    I've created a cron job for rtcwake to shut it down on a daily basis, but the NIC hangs on shutdown.
    Therefore, I have to shut it down manually and turn it on the next day.


  • What command are you using to shut your system down? So the NIC hangs before the computer shuts down? Does it also happen if you shut down the server manually (or if you send poweroff through the CLI)?



  • This is the thread that I've created: Detected Hardware Unit Hang.
    At first I thought the NIC was the culprit because that's what I saw on the shutdown screen.
    Then I updated the NIC driver and tried several kernels, but no luck.
    Therefore, I think the missing swap partition is probably the culprit.


    I usually run shutdown -h now when it doesn't shut down.
    This is an example of my cron job in /etc/crontab:

    Code
    # Monday-Thursday: sleep @ 10:40PM, wake up @ 4PM.
    40 22 * * 1-4 root /usr/sbin/rtcwake -m disk -s 62400
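
    As a sanity check on the -s value: 22:40 to 16:00 the next day is 17 h 20 min, i.e. 62400 seconds. A small sketch for recomputing it when the schedule changes (the helper names to_minutes and seconds_between are made up for this example):

```shell
#!/bin/sh

to_minutes() {
    # Convert HH:MM to minutes since midnight.
    h=${1%%:*}
    m=${1##*:}
    # Strip one leading zero so that 08 and 09 are not parsed as octal.
    h=${h#0}
    m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

seconds_between() {
    # Seconds from time $1 to the next occurrence of time $2 (both HH:MM).
    start=$(to_minutes "$1")
    end=$(to_minutes "$2")
    echo $(( ((end - start + 1440) % 1440) * 60 ))
}

seconds_between 22:40 16:00   # prints 62400
```

    Note that identical start and end times yield 0, not 24 hours.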


  • Code
    00 22 0 3 root .... rtcwake -m disk -s 34000


    This does not shut down the system, it makes it hibernate. You can't hibernate without swap space to hold the memory image. If you enable swap and run the command again, chances are that the problem will be gone.


    If you want to shut down the computer using rtcwake, you need to use:


    Code
    rtcwake -m off -s 34000



  • From the rtcwake documentation: "(off is) Not officially supported by ACPI, but usually working". Whether off works depends on which motherboard/BIOS you have. Some support it, some don't. My computer, for example, does not: using rtcwake -m off on my systems prompts me to use shutdown instead.




  • There you go! If you want to shut down, you have to use another command. Try enabling swap and running the cron job. Perhaps that solves your issue.


    Have you tried the workaround method yet?
    wiki.debian.org/Hibernation/Hibernate_Without_Swap_Partition


    Nope, I haven't had the time yet. First I want to see if the problem I am having also happens with hibernation.



  • I'm thinking that whether we enable swap on the SSD or use the workaround, the swap still has to write to something, unless we create the swap partition on a dedicated hard drive.


    • Official post

    As tinh_x7 said, does enabling the swap defeat the purpose of the flash memory plugin?


    I think it does: think about dumping all RAM to disk (writes).


    If you think networking is an issue, try using the pm/sleep.d hook scripts to remove and re-insert the kernel module (.ko) and restart networking. I do that, for example, to avoid the Realtek shutdown/reboot bug when WoL is enabled.
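
    A rough sketch of what such a hook could look like. pm-utils runs executable scripts from /etc/pm/sleep.d with suspend, hibernate, resume or thaw as the first argument. The file name, the module name e1000e (a common Intel driver, check yours with lsmod) and the RUN dry-run switch are assumptions for illustration. As far as I know, rtcwake -m mem/disk puts the machine to sleep directly and does not go through pm-utils, so these hooks would only fire when sleeping via pm-suspend or pm-hibernate (for example after setting the alarm with rtcwake -m no).

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/pm/sleep.d/50-nic-reload (must be executable).
NIC_MODULE="${NIC_MODULE:-e1000e}"   # assumed driver name, adjust to your NIC
RUN="${RUN:-}"                       # set RUN=echo to print instead of execute

nic_hook() {
    case "$1" in
        suspend|hibernate)
            # Unload the NIC driver before the system goes to sleep.
            $RUN modprobe -r "$NIC_MODULE"
            ;;
        resume|thaw)
            # Reload the driver and bring networking back up afterwards.
            $RUN modprobe "$NIC_MODULE"
            $RUN service networking restart
            ;;
    esac
}

nic_hook "$1"
```

    Running the script by hand with RUN=echo and hibernate as its argument prints the command it would execute without touching the driver.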

  • I don't think it would use all RAM, only half, for the swap partition.
    But swap only activates if the system runs out of RAM.


    FYI: I'm using Intel NICs.
    How do you use the pm/sleep.d scripts?


