RAID Error when rebooting/starting up?

  • Hey guys,


    I seem to have an issue with OMV. I had three 3TB drives in RAID 5 and added another 3TB drive. I went into RAID Management and grew the RAID. Now the system is unresponsive: I can't log in to the GUI and there's no command prompt to run any commands. I left the server running all day, and these messages keep showing up on the screen, looping every so often:

    Code
    INFO: task mount:1545 blocked for more than 120 seconds
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message


    Code
    INFO: task md127_raid5:320 blocked for more than 120 seconds
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message


    Is the RAID damaged? I don't want to reboot the system in case it's doing something, but I noticed very little drive activity from the light on the front of the computer case.
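The two kernel messages above come from the hung-task watchdog and share one format, so the blocked task name and PID can be pulled out of saved console or dmesg text. This is only a minimal sketch: the function name `blocked_tasks` is made up for illustration, and the sample lines are taken from the messages quoted in this post.

```shell
#!/bin/sh
# Hypothetical helper: print "name pid seconds" for each hung-task line on stdin.
blocked_tasks() {
    sed -n 's/^.*INFO: task \(.*\):[[:space:]]*\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*$/\1 \2 \3/p'
}

# Sample input taken from the messages in this post.
sample='INFO: task mount:1545 blocked for more than 120 seconds
INFO: task md127_raid5:320 blocked for more than 120 seconds'

printf '%s\n' "$sample" | blocked_tasks
```

Here both the `mount` process and the `md127_raid5` kernel thread are reported as blocked, which would fit a mount waiting on an array that is not making progress.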


    Any ideas on how to get it to fully boot so I can at least get to a command line? The GUI also doesn't respond.

    • Official post

    If it won't boot, I would boot systemrescuecd on the system using a usb stick and post the output of the commands in this thread

    omv 7.0.4-2 sandworm | 64 bit | 6.5 proxmox kernel

    plugins :: omvextrasorg 7.0 | kvm 7.0.10 | compose 7.1.2 | k8s 7.0-6 | cputemp 7.0 | mergerfs 7.0.3


    omv-extras.org plugins source code and issue tracker - github


    Please try ctrl-shift-R and read this before posting a question.

    Please put your OMV system details in your signature.
    Please don't PM for support... Too many PMs!

  • So the system boots up and it finds 4 drives in RAID 5. The messages seem to appear when it tries to load the file system.


    It doesn't get to a command line and just hangs.


    Since the system drive is not part of the raid, what would happen if I were to disconnect all the raid drives and load OMV? Would this help with getting the logs?

    • Official post

    That would probably help it boot but the logs won't help. I need the output from mdadm and other utilities while the drives are plugged in.


  • I tried booting from the systemrescuecd USB stick that I created, but when I attempt to boot from it I get an unauthorized-device error from the UEFI BIOS screen. I have clonezilla installed from before and can access the command line by booting it, so this is what I get when I run the commands from the other thread:

    Code
    cat /proc/mdstat
    Personalities :
    md127 : inactive sde[2](S) sda[4](S) sdb[3](S) sdd[1](S)
               11720542048 blocks super 1.2
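In the mdstat output above the array is listed as `inactive` and every member carries the `(S)` flag, meaning the members were found but the array was not started. A small sketch for spotting that state in mdstat-style text follows; the helper name `inactive_arrays` is hypothetical, and on a live system the input would be /proc/mdstat.

```shell
#!/bin/sh
# Hypothetical helper: list md arrays marked "inactive" in mdstat-style text.
# Reads the files given as arguments, or stdin when none are given.
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "$@"
}

# Sample input copied from the output above.
sample='Personalities :
md127 : inactive sde[2](S) sda[4](S) sdb[3](S) sdd[1](S)
      11720542048 blocks super 1.2'

printf '%s\n' "$sample" | inactive_arrays
```

On the server itself this would be called as `inactive_arrays /proc/mdstat`.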


    Code
    blkid
    /dev/sdb: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="1bf14cbb-0e9d-ad8d-ea46-5ec63f0dff96" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc1: UUID="f9e1a568-40a8-481f-bbb9-eb408345f58c" TYPE="ext4" PARTUUID="00049aa5-01"
    /dev/loop0: TYPE="squashfs"


    Code
    fdisk -l
    fdisk: cannot open /dev/sde: Permission denied
    fdisk: cannot open /dev/sdb: Permission denied
    fdisk: cannot open /dev/sdc: Permission denied
    fdisk: cannot open /dev/sdd: Permission denied
    fdisk: cannot open /dev/sda: Permission denied
    fdisk: cannot open /dev/loop0: Permission denied
    fdisk: cannot open /dev/sdf: Permission denied


    I will continue to try and boot using the systemrescuecd software, but in the meantime, if there's something else you can think of that I can use, I'm all ears.


    As you can see nothing is easy when I have to fix something. There's always an issue... Story of my life! ;)

    • Official post

    I guess clonezilla will work. You need to turn off secureboot (yuck!) to boot systemrescuecd.


    In clonezilla, you could try (use sudo or as root):
    mdadm --stop /dev/md127
    mdadm --assemble /dev/md127 /dev/[abde] --verbose --force


  • Oh I'm dumb...


    Code
    md: md127 stopped
    md: unbind<sda>
    md: export_rdev(sda)
    md: unbind<sdd>
    md: export_rdev(sdd)
    md: unbind<sde>
    md: export_rdev(sde)
    md: unbind<sdb>
    md: export_rdev(sdb)
    mdadm: stopped /dev/md127


    Code
    sudo mdadm --assemble /dev/md127 /dev/[abde] --verbose --force
    mdadm: looking for devices for /dev/md127
    mdadm: cannot open device /dev/[abde]: No such file or directory
    mdadm: /dev/[abde] has no superblock - assembly aborted
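The "No such file or directory" error is what an unmatched shell glob looks like: `[abde]` matches a single character, so `/dev/[abde]` looks for devices named `/dev/a`, `/dev/b` and so on, finds none, and is handed to mdadm literally. The members listed in /proc/mdstat are sda, sdb, sdd and sde, so the pattern presumably needed the `sd` prefix, i.e. `/dev/sd[abde]`. The sketch below shows the difference using a throwaway directory as a stand-in for /dev; no real devices are touched.

```shell
#!/bin/sh
# Build a temporary directory that stands in for /dev (assumption: the file
# names simply mirror the disk names seen in this thread).
fake_dev=$(mktemp -d)
touch "$fake_dev/sda" "$fake_dev/sdb" "$fake_dev/sdc" "$fake_dev/sdd" "$fake_dev/sde"

# "[abde]" matches exactly one character, so it looks for files named a, b, d, e.
# Nothing matches, and the shell passes the pattern through unchanged.
unmatched=$(cd "$fake_dev" && echo [abde])

# "sd[abde]" matches sda, sdb, sdd and sde, but not sdc.
matched=$(cd "$fake_dev" && echo sd[abde])

echo "without prefix: $unmatched"
echo "with prefix:    $matched"
rm -rf "$fake_dev"
```

Running `echo /dev/sd[abde]` before handing the pattern to mdadm is a cheap way to check what the shell will actually expand it to.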



    Thanks for your help with this issue...

    • Official post

    Using sudo, post the output of the commands from this thread now


  • Okay here we go:

    Code
    sudo cat /proc/mdstat
    unused devices: <none>


    Code
    sudo blkid
    /dev/sdb: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="1bf14cbb-0e9d-ad8d-ea46-5ec63f0dff96" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc1: UUID="f9e1a568-40a8-481f-bbb9-eb408345f58c" TYPE="ext4" PARTUUID="00049aa5-01"
    /dev/loop0: TYPE="squashfs"
    /dev/sdd: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="bdcc2091-4d65-2715-0001-350c78f8931c" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sde: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="21780219-e063-06f7-1ed8-838372a04351" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sda: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="8b1fd939-c527-78b6-7e40-2467f5ddab6b" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc5: UUID="0275bbc2-ca86-4c54-921f-9ed87f3d4155" TYPE="swap" PARTUUID="00049aa5-05"
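With sudo, blkid now lists all four disks as `linux_raid_member` with the same array UUID, which suggests every member still carries a superblock for the same array. A sketch for counting members per array UUID in blkid-style text follows; the helper name `raid_members_by_uuid` is hypothetical, and the sample lines are trimmed to the UUID and TYPE fields from the output above.

```shell
#!/bin/sh
# Hypothetical helper: count linux_raid_member devices per array UUID
# in blkid-style text read from stdin. UUID_SUB fields are ignored.
raid_members_by_uuid() {
    awk '/TYPE="linux_raid_member"/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^UUID=/) { gsub(/UUID=|"/, "", $i); count[$i]++ }
    }
    END { for (u in count) print u, count[u] }'
}

# Sample input: lines from the blkid output above, trimmed to UUID and TYPE.
sample='/dev/sdb: UUID="bea4cede-6989-d555-c003-7b765d3428cb" TYPE="linux_raid_member"
/dev/sdc1: UUID="f9e1a568-40a8-481f-bbb9-eb408345f58c" TYPE="ext4"
/dev/sdd: UUID="bea4cede-6989-d555-c003-7b765d3428cb" TYPE="linux_raid_member"
/dev/sde: UUID="bea4cede-6989-d555-c003-7b765d3428cb" TYPE="linux_raid_member"
/dev/sda: UUID="bea4cede-6989-d555-c003-7b765d3428cb" TYPE="linux_raid_member"'

printf '%s\n' "$sample" | raid_members_by_uuid
```

On the server the same helper could be fed directly with `sudo blkid | raid_members_by_uuid`.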



    Code
    sudo mdadm --detail --scan -verbose


    Nothing happens after entering this last command.


    I have copied everything line by line, since this is being done straight on the server. Please bear with me, as I typed it all out by hand and was not able to copy and paste.


    I assume the code above was supposed to create a log. How do I access it? What's the next step?

    • Official post

    None of them create a log. They should provide output if they find something. You only have one dash on the verbose flag. Is that how you typed it?


    • Official post

    Probably going to need systemrescuecd instead of clonezilla then. Can you disable secureboot (temporarily?)


  • Okay here we go, got systemrescuecd to work:


    Code
    cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md127 : active raid5 sdb[3] sda[4] sde[2] sdd[1]
          5860270080 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
          [>...........................] reshape = 0.0% (714244/2930135040) finish=1712.5min speed=28508K/sec
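The `finish=` estimate in the reshape line can be sanity-checked from the other numbers on that line: remaining blocks divided by the current speed. This is a rough sketch that assumes the block counts are in 1 KiB units, the same unit as the K/sec speed; the variable names are made up for illustration.

```shell
#!/bin/sh
# Figures copied from the reshape line above.
done_blocks=714244
total_blocks=2930135040
speed_k_per_sec=28508

# Remaining work divided by current speed, converted to whole minutes.
# Assumption: "blocks" are 1 KiB units, matching the K/sec speed figure.
remaining=$((total_blocks - done_blocks))
eta_min=$((remaining / speed_k_per_sec / 60))
echo "estimated minutes remaining: $eta_min"
```

That works out to about 1712 minutes, in line with the kernel's own `finish=1712.5min`, or roughly 28.5 hours if the speed holds.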


    Code
    blkid
    /dev/loop0: TYPE="squashfs"
    /dev/sdb: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="1bf14cbb-0e9d-ad8d-ea46-5ec63f0dff96" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sda: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="8b1fd939-c527-78b6-7e40-2467f5ddab6b" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sde: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="21780219-e063-06f7-1ed8-838372a04351" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdd: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="bdcc2091-4d65-2715-0001-350c78f8931c" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc1: UUID="f9e1a568-40a8-481f-bbb9-eb408345f58c" TYPE="ext4" PARTUUID="00049aa5-01"
    /dev/sdc5: UUID="0275bbc2-ca86-4c54-921f-9ed87f3d4155" TYPE="swap" PARTUUID="00049aa5-05"
    /dev/sdf1: UUID="E8E7-4989" TYPE="vfat"
    /dev/md127: LABEL="Main" UUID="dd3de295-9705-4573-b299-53e77a01fada" TYPE="xfs"



    Code
    cat /etc/mdadm/mdadm.conf
    cat: /etc/mdadm/mdadm.conf: No such file or directory


    Code
    mdadm --detail --scan --verbose
    ARRAY /dev/md/RAID5 level=raid5 num-devices=4 metadata=1.2 name=SAWHOME-Vault:RAID5 UUID=bea4cede:6989d555:c0037b76:5d3428cb
            devices=/dev/sda,/dev/sdb,/dev/sdd,/dev/sde
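The array UUID printed by mdadm looks different from the one blkid reports for the member disks, but it is the same 32 hex digits grouped differently (colons every 8 digits instead of the dashed form). A sketch for comparing the two notations follows; `normalize_uuid` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: strip separators and lowercase, so both notations compare equal.
normalize_uuid() {
    printf '%s\n' "$1" | sed 's/[:-]//g' | tr 'A-F' 'a-f'
}

# Values copied from the blkid and mdadm output in this thread.
blkid_uuid='bea4cede-6989-d555-c003-7b765d3428cb'
mdadm_uuid='bea4cede:6989d555:c0037b76:5d3428cb'

if [ "$(normalize_uuid "$blkid_uuid")" = "$(normalize_uuid "$mdadm_uuid")" ]; then
    echo "same array"
else
    echo "different arrays"
fi
```

For the values in this thread the two forms normalize to the same string, so the assembled array is the one the four member disks belong to.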
  • I need to apologise... I'm not sure why the first chunk of code I posted above is saying Brainf**k Source Code. I swear that's not me.


    Anyways...
    So I seem to be getting lots of drive activity from running the

    Code
    cat /proc/mdstat

    command. It looks like it's starting to reshape my RAID (I hope that's a good thing). I ran the command again and it's at 2.5% complete. I will leave it for now and await further instructions.

    • Official post

    The source code label is because it can't figure out what programming language you are using. Not your fault :)


    systemrescuecd automagically recognized your array and is fixing it :) Let it do its job. Check the status with cat /proc/mdstat, like you were doing. Reboot into omv when it is done.
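Rather than re-running the status command by hand, the check can be wrapped in a small polling loop. This is a sketch only: the helper names `reshape_in_progress` and `wait_for_reshape` are hypothetical, and the demo at the end runs against a temporary file holding the idle mdstat text seen earlier in the thread, not against the live /proc/mdstat.

```shell
#!/bin/sh
# Hypothetical helper: succeed while mdstat-style text still reports a reshape.
reshape_in_progress() {
    grep -q 'reshape' "$1"
}

# Hypothetical helper: print the progress line every INTERVAL seconds
# (default 60) until the reshape line disappears from the file.
wait_for_reshape() {
    file=$1
    interval=${2:-60}
    while reshape_in_progress "$file"; do
        grep 'reshape' "$file"
        sleep "$interval"
    done
    echo "no reshape in progress"
}

# Demo on a stand-in file containing idle mdstat text from earlier in the thread.
demo=$(mktemp)
printf '%s\n' 'unused devices: <none>' > "$demo"
wait_for_reshape "$demo" 1
rm -f "$demo"
```

On the server the call would be `wait_for_reshape /proc/mdstat 300`, and `watch cat /proc/mdstat` is a simpler alternative if all that is wanted is a refreshing view.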


    • Official post

    Good to hear :)

