OMV does not boot when all RAID drives are connected

jonni · 4. Februar 2021

I was using OMV4 until a couple days ago, but my RAID 5 with 4 hard drives was not showing up in my OMV configurations any longer.

I then upgraded (reinstalled) to OMV5, but my RAID is still not showing up.

I tried to fix it through mdadm assebmling and it actually went through successfully (at least it looked that way to my newbie eyes), but as I tried to restart the machine, it would not even finish booting up. It gets stuck after init ramdisks with 4 errors "ataX: softreset failed (device not ready)".

As soon as I unplug one of the drives it shows only 3 errors and finishes booting afterwards.

These three drives (plus my SSD) show up in OMV gui "Disks".

geaves · 5. Februar 2021

Need some information about hardware and the output from each of these cli commands

macom · 5. Februar 2021

I guess geaves is referring to the questions from here:

Degraded or missing raid array questions

geaves · 5. Februar 2021

Missed adding the link

crashtest · 5. Februar 2021

Zitat von geaves

Missed adding the link

jonni · 5. Februar 2021

Thanks macom for the clarification. I was a little confused.

I have booted my OMV with only 4 drives (3 HDD plus SSD with the OS) instead of all 4 RAID HDDs as it does not finish booting with 4 HDDs.

Used RAID HDDs:

TOSHIBA DT01ABA3 (3tb, 3 drives)

Seagate IronWolf NAS HDD (3tb, one drive)

Code

root@nas-Jonathan:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md127 : inactive sdb[2](S) sda[0](S) sdc[1](S)
      8790402407 blocks super 1.2
       
unused devices: <none>

Code

root@nas-Jonathan:~# blkid
/dev/sdb: UUID="1ae7b1fb-004c-10d2-5ee6-31b56d43d6a5" UUID_SUB="c1be727c-965a-bc95-de8f-4c699d4c72c4" LABEL="nas-jonathan:MeinRAID" TYPE="linux_raid_member"
/dev/sdd1: UUID="68bfc5a4-6add-4b8e-8e8a-f6965765bbc8" TYPE="ext4" PARTUUID="6ec3d6f8-01"
/dev/sdd5: UUID="26b08f75-df65-45ea-9e43-8d38352f5629" TYPE="swap" PARTUUID="6ec3d6f8-05"
/dev/sda: UUID="1ae7b1fb-004c-10d2-5ee6-31b56d43d6a5" UUID_SUB="f8ffd68f-5af4-7bb5-2f32-a68089552676" LABEL="nas-jonathan:MeinRAID" TYPE="linux_raid_member"
/dev/sdc: UUID="1ae7b1fb-004c-10d2-5ee6-31b56d43d6a5" UUID_SUB="4e51ff5c-b622-f65f-f29f-137b56169223" LABEL="nas-jonathan:MeinRAID" TYPE="linux_raid_member"

Code

root@nas-Jonathan:~# fdisk -l | grep "Disk "
Disk /dev/sdb: 2,7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk model: TOSHIBA DT01ABA3
Disk /dev/sdd: 111,8 GiB, 120033041920 bytes, 234439535 sectors
Disk model: Samsung SSD 840 
Disk identifier: 0x6ec3d6f8
Disk /dev/sda: 2,7 TiB, 3000588754432 bytes, 5860524911 sectors
Disk model: TOSHIBA DT01ABA3
Disk /dev/sdc: 2,7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk model: ST3000VN007-2E41

Code

root@nas-Jonathan:~# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# !NB! Run update-initramfs -u after updating this file.
# !NB! This will ensure that initramfs has an uptodate copy.
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays

# This configuration was auto-generated on Mon, 21 Sep 2020 15:28:38 +0000 by mkconf

Alles anzeigen

Code

root@nas-Jonathan:~# mdadm --detail --scan --verbose
INACTIVE-ARRAY /dev/md127 num-devices=3 metadata=1.2 name=nas-jonathan:MeinRAID UUID=1ae7b1fb:004c10d2:5ee631b5:6d43d6a5
   devices=/dev/sda,/dev/sdb,/dev/sdc

geaves · 5. Februar 2021

Zitat von jonni

I was a little confused.

That was my fault I thought I had copied the link in, with the above it clearly shows the raid as inactive, the concern might be why the fourth drive is causing an issue.

mdadm --stop /dev/md127 mdadm --assemble --force --verbose /dev/md127 /dev/sd[abc]

jonni · 5. Februar 2021

Result:

Code

root@nas-Jonathan:~# mdadm --stop /dev/md127
mdadm: stopped /dev/md127

Code

root@nas-Jonathan:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[abc]
mdadm: looking for devices for /dev/md127
mdadm: /dev/sda is identified as a member of /dev/md127, slot 0.
mdadm: /dev/sdb is identified as a member of /dev/md127, slot 2.
mdadm: /dev/sdc is identified as a member of /dev/md127, slot 1.
mdadm: added /dev/sdc to /dev/md127 as 1
mdadm: added /dev/sdb to /dev/md127 as 2
mdadm: no uptodate device for slot 3 of /dev/md127
mdadm: added /dev/sda to /dev/md127 as 0
mdadm: /dev/md127 has been started with 3 drives (out of 4).

The reboot afterwards took about 10-15 minutes, but it did succeed.

The RAID does still not show up in the OMV gui "RAID management" though.

geaves · 5. Februar 2021

Zitat von jonni

The reboot afterwards took about 10-15 minutes, but it did succeed.

WHY!!!! did I say reboot, at this present moment, you nor I have no idea what is going on, quite frankly 10-15 mins is a very, very serious issue. The assemble should have rebuilt the array, if it was doing that and you rebooted I have no idea what's happened.

Output of cat /proc/mdstat

jonni · 5. Februar 2021

Zitat von geaves

WHY!!!! did I say reboot,

OK, I will leave it running then from now on.

When it was rebooting the reboot was not going through because of a task that was blocked. I am not too sure whether that task was finished or terminated after 10-15 minutes.

Code

root@nas-Jonathan:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md127 : inactive sdb[0](S) sdc[2](S) sdd[1](S)
      8790402407 blocks super 1.2
       
unused devices: <none>

geaves · 5. Februar 2021

Back to square one!! re do what I posted in #7 do not reboot, do not pass go post the output of cat /proc/mdstat after the assemble

jonni · 5. Februar 2021

As sdX names changed, I altered the assemble command as well.

Code

root@nas-Jonathan:~# mdadm --stop /dev/md127
mdadm: stopped /dev/md127

Code

root@nas-Jonathan:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[bcd]
mdadm: looking for devices for /dev/md127
mdadm: /dev/sdb is identified as a member of /dev/md127, slot 0.
mdadm: /dev/sdc is identified as a member of /dev/md127, slot 2.
mdadm: /dev/sdd is identified as a member of /dev/md127, slot 1.
mdadm: added /dev/sdd to /dev/md127 as 1
mdadm: added /dev/sdc to /dev/md127 as 2
mdadm: no uptodate device for slot 3 of /dev/md127
mdadm: added /dev/sdb to /dev/md127 as 0
mdadm: /dev/md127 has been started with 3 drives (out of 4).

Code

root@nas-Jonathan:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md127 : active (auto-read-only) raid5 sdb[0] sdc[2] sdd[1]
      8790405120 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>

geaves · 5. Februar 2021

OK that's looking better, now mdadm --readwrite /dev/md127 cat /proc/mdstat

jonni · 5. Februar 2021

I have now left the readwrite command running for about 30-40 minutes, but for some reason it has not finished so far.

How much time should I give OMV to complete that command?

geaves · 5. Februar 2021

Zitat von jonni

How much time should I give OMV to complete that command

Ok this is english sarcasm -> until it's finished there's nothing more to be done, and

jonni · 6. Februar 2021

Sorry, my English got a little rusty over the years.

I have left the system running over night (14 hours +) and the state has not changed yet. The command is still pending.

Any advice on how long I should leave it like that before giving up?

geaves · 6. Februar 2021

Zitat von jonni

The command is still pending

When you say pending what is it displaying, TBH that --readwrite option should be instantaneous, but having no idea what else is happening that is why I said leave it.

geaves · 6. Februar 2021

It should be running a resync

jonni · 6. Februar 2021

Zitat von geaves

When you say pending what is it displaying, TBH that --readwrite option should be instantaneous, but having no idea what else is happening that is why I said leave it.

It has not shown anything after I executed the command nor can I input any new commands as it is (apparently) still processing the readwrite command.

It looks like this at the moment:

Code

root@nas-Jonathan:~# mdadm --readwrite /dev/md127
"empty line with flashing symbol"

geaves · 6. Februar 2021

have you inserted line 2 for reference to what it's doing?

OMV does not boot when all RAID drives are connected

Jetzt mitmachen!

Tags