RAID Error when rebooting/starting up?

  • Hey guys,


    I seem to have an issue with OMV. I had three 3TB drives in RAID 5 and added another 3TB drive. I went into RAID Management and grew the RAID. Now the system is unresponsive: I can't log in to the GUI and there's no command prompt to run anything. I left the server running all day and these messages keep showing up on the screen, looping every so often:

    Code
    INFO: task mount:1545 blocked for more than 120 seconds
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message


    Code
    INFO: task md127_raid5:320 blocked for more than 120 seconds
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message


    Is the RAID damaged? I don't want to reboot the system in case it's doing something, but I noticed very little drive activity from the light on the front of the computer case.


    Any ideas on how to get it to fully boot so I can at least get to a command line? The GUI also doesn't respond.

  • If it won't boot, I would boot systemrescuecd on the system using a usb stick and post the output of the commands in this thread

    omv 5.5.6 usul | 64 bit | 5.4 proxmox kernel | omvextrasorg 5.3.5
    omv-extras.org plugins source code and issue tracker - github


    Please read this before posting a question.
    Please don't PM for support... Too many PMs!

  • So the system boots up and it finds 4 drives in RAID 5. The messages seem to appear when it tries to load the file system.


    It doesn't get to a command line and just hangs.


    Since the system drive is not part of the raid, what would happen if I were to disconnect all the raid drives and load OMV? Would this help with getting the logs?

  • That would probably help it boot but the logs won't help. I need the output from mdadm and other utilities while the drives are plugged in.


    I tried booting from the systemrescuecd USB stick that I created, but when I attempt to boot from it I get an unauthorized device error from the UEFI BIOS screen. I still have clonezilla installed from before, so I can get to a command line by booting that, and this is what I get when I run the commands from the other thread:

    Code
    cat /proc/mdstat
    Personalities :
    md127 : inactive sde[2](S) sda[4](S) sdb[3](S) sdd[1](S)
    11720542048 blocks super 1.2


    Code
    blkid
    /dev/sdb: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="1bf14cbb-0e9d-ad8d-ea46-5ec63f0dff96" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc1: UUID="f9e1a568-40a8-481f-bbb9-eb408345f58c" TYPE="ext4" PARTUUID="00049aa5-01"
    /dev/loop0: TYPE="squashfs"


    Code
    fdisk -l
    fdisk: cannot open /dev/sde: Permission denied
    fdisk: cannot open /dev/sdb: Permission denied
    fdisk: cannot open /dev/sdc: Permission denied
    fdisk: cannot open /dev/sdd: Permission denied
    fdisk: cannot open /dev/sda: Permission denied
    fdisk: cannot open /dev/loop0: Permission denied
    fdisk: cannot open /dev/sdf: Permission denied


    I will continue to try and boot using the systemrescuecd software, but in the meantime, if there's something else you can think of that I can use, I'm all ears.


    As you can see nothing is easy when I have to fix something. There's always an issue... Story of my life! ;)

  • I guess clonezilla will work. You need to turn off secureboot (yuck!) to boot systemrescuecd.


    In clonezilla, you could try (use sudo or as root):
    mdadm --stop /dev/md127
    mdadm --assemble /dev/md127 /dev/[abde] --verbose --force


  • Oh I'm dumb...


    Code
    md: md127 stopped
    md: unbind<sda>
    md: export_rdev(sda)
    md: unbind<sdd>
    md: export_rdev(sdd)
    md: unbind<sde>
    md: export_rdev(sde)
    md: unbind<sdb>
    md: export_rdev(sdb)
    mdadm: stopped /dev/md127


    Code
    sudo mdadm --assemble /dev/md127 /dev/[abde] --verbose --force
    mdadm: looking for devices for /dev/md127
    mdadm: cannot open device /dev/[abde]: No such file or directory
    mdadm: /dev/[abde] has no superblock - assembly aborted



    Thanks for your help with this issue...
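A likely reason the assemble attempt failed: the shell only expands a bracket pattern when it matches existing names. The members shown in /proc/mdstat are sda, sdb, sdd and sde, so /dev/[abde] matches nothing and reaches mdadm as literal text. The pattern presumably intended was /dev/sd[abde]. Here is a minimal sketch of the difference, using a temporary directory with made-up file names in place of /dev (nothing real is touched):

```shell
# Throwaway directory standing in for /dev (hypothetical stand-in)
demo=$(mktemp -d)
touch "$demo/sda" "$demo/sdb" "$demo/sdc" "$demo/sdd" "$demo/sde"

# No entry is named just "a", "b", "d" or "e", so the pattern stays literal
no_prefix=$(cd "$demo" && echo [abde])

# With the "sd" prefix the pattern matches the four member names
with_prefix=$(cd "$demo" && echo sd[abde])

echo "$no_prefix"     # [abde]
echo "$with_prefix"   # sda sdb sdd sde
rm -rf "$demo"
```

This assumes the default behaviour of sh/bash, where an unmatched pattern is left unchanged rather than raising an error.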

  • Okay here we go:

    Code
    sudo cat /proc/mdstat
    unused devices: <none>


    Code
    sudo blkid
    /dev/sdb: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="1bf14cbb-0e9d-ad8d-ea46-5ec63f0dff96" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc1: UUID="f9e1a568-40a8-481f-bbb9-eb408345f58c" TYPE="ext4" PARTUUID="00049aa5-01"
    /dev/loop0: TYPE="squashfs"
    /dev/sdd: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="bdcc2091-4d65-2715-0001-350c78f8931c" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sde: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="21780219-e063-06f7-1ed8-838372a04351" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sda: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="8b1fd939-c527-78b6-7e40-2467f5ddab6b" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc5: UUID="0275bbc2-ca86-4c54-921f-9ed87f3d4155" TYPE="swap" PARTUUID="00049aa5-05"



    Code
    sudo mdadm --detail --scan -verbose


    Nothing happens after entering this last command.


    I have copied everything line by line since this is being done straight on the server. So please bear with me, as I have typed it all out by hand and am not able to copy and paste.


    I assume the code above was supposed to create a log. How do I access it? What's the next step?

  • None of them create a log. They should provide output if they find something. You only have one dash on the verbose flag. Is that how you typed it?
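Since these commands only print to the terminal, one way to avoid retyping the output by hand is to redirect it into a file and copy that file off the machine. This is a rough sketch, not a tested procedure for this server: `run_and_label` and the `/tmp/raid-report.txt` path are made-up names for illustration, and the commands are the ones already used in this thread.

```shell
# Run a command, printing a header line first so each section of the file is labelled.
# stderr is merged in so error messages end up in the file as well.
run_and_label() {
    printf '### %s\n' "$*"
    "$@" 2>&1 || printf '(exit status %s)\n' "$?"
}

# Hypothetical output location; any writable path (e.g. a mounted USB stick) would do.
out=${OUT:-/tmp/raid-report.txt}

{
    run_and_label cat /proc/mdstat
    run_and_label blkid
    run_and_label mdadm --detail --scan --verbose
} > "$out"
```

Run it as root (or with sudo) so that blkid and mdadm can read the drives. A command that fails still leaves its header and exit status in the file.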


  • Probably going to need systemrescuecd instead of clonezilla then. Can you disable secureboot (temporarily)?


  • Okay here we go, got systemrescuecd to work:


    Code
    cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md127 : active raid5 sdb[3] sda[4] sde[2] sdd[1]
          5860270080 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
          [>...........................] reshape = 0.0% (714244/2930135040) finish=1712.5min speed=28508K/sec


    Code
    blkid
    /dev/loop0: TYPE="squashfs"
    /dev/sdb: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="1bf14cbb-0e9d-ad8d-ea46-5ec63f0dff96" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sda: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="8b1fd939-c527-78b6-7e40-2467f5ddab6b" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sde: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="21780219-e063-06f7-1ed8-838372a04351" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdd: UUID="bea4cede-6989-d555-c003-7b765d3428cb" UUID_SUB="bdcc2091-4d65-2715-0001-350c78f8931c" LABEL="SAWHOME-Vault:RAID5" TYPE="linux_raid_member"
    /dev/sdc1: UUID="f9e1a568-40a8-481f-bbb9-eb408345f58c" TYPE="ext4" PARTUUID="00049aa5-01"
    /dev/sdc5: UUID="0275bbc2-ca86-4c54-921f-9ed87f3d4155" TYPE="swap" PARTUUID="00049aa5-05"
    /dev/sdf1: UUID="E8E7-4989" TYPE="vfat"
    /dev/md127: LABEL="Main" UUID="dd3de295-9705-4573-b299-53e77a01fada" TYPE="xfs"



    Code
    cat /etc/mdadm/mdadm.conf
    cat: /etc/mdadm/mdadm.conf: No such file or directory


    Code
    mdadm --detail --scan --verbose
    ARRAY /dev/md/RAID5 level=raid5 num-devices=4 metadata=1.2 name=SAWHOME-Vault:RAID5 UUID=bea4cede:6989d555:c0037b76:5d3428cb
    devices=/dev/sda,/dev/sdb,/dev/sdd,/dev/sde
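The block count in the mdstat output still reflects the old three-drive layout, which is expected while the reshape is running. Here is a small worked sketch of the RAID 5 arithmetic. It assumes the denominator of the reshape counter (2930135040) is the per-member size in 1 KiB blocks, which fits the reported array size being exactly twice that figure. `raid5_blocks` is a made-up helper name.

```shell
# Per-member size in 1 KiB blocks, taken from the reshape counter (x/2930135040)
per_disk=2930135040

# RAID 5 keeps one member's worth of parity, so usable space is (members - 1) * member size
raid5_blocks() {
    echo $(( ($1 - 1) * $2 ))
}

before=$(raid5_blocks 3 "$per_disk")   # three members, before the grow
after=$(raid5_blocks 4 "$per_disk")    # four members, once the reshape completes

echo "$before"   # 5860270080
echo "$after"    # 8790405120
```

The XFS file system on md127 would presumably still have to be resized separately afterwards before the extra space shows up.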

  • I need to apologise... I'm not sure why the first chunk of code I posted above is saying Brainf**k Source Code. I swear that's not me.


    Anyways...
    So I seem to be getting lots of drive activity from running the

    Code
    cat /proc/mdstat

    command. It looks like it's starting to reshape my RAID (I hope that's a good thing). I ran the command again and it's at 2.5% complete. I will leave it for now and await further instructions.

  • The source code label is because it can't figure out what programming language you are using. Not your fault :)


    systemrescuecd automagically recognized your array and is fixing it :) Let it do its job. Check the status like you were doing with cat /proc/mdstat. Reboot into omv when it is done.
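For checking the status repeatedly, a small helper can pull just the percentage out of the mdstat text. This is only a sketch: `reshape_percent` is a made-up name, and the sample line is trimmed from the output posted earlier in this thread.

```shell
# Print the percentage that follows "reshape =" in mdstat-style text read from stdin.
reshape_percent() {
    sed -n 's/.*reshape *= *\([0-9.]*\)%.*/\1/p'
}

# Sample line trimmed from the output earlier in this thread
sample='[>....] reshape = 0.0% (714244/2930135040) finish=1712.5min speed=28508K/sec'
printf '%s\n' "$sample" | reshape_percent   # 0.0

# On the live system the same function would read the real file:
#   reshape_percent < /proc/mdstat
```

It prints nothing once the reshape line is gone from /proc/mdstat, which would be the cue to reboot into omv. If watch is available on the rescue system, `watch -n 60 cat /proc/mdstat` shows the full status refreshed every minute.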

