mdadm has removed the drive, which could be due to errors/bad sectors; you should check Storage -> SMART on the drive that's been removed
you should also check out this section in the user guide
post the output of mdadm --detail /dev/md0
You can double check the drive removal by running mdadm --detail /dev/md0 from the cli
If you see 'removed' next to /dev/sdb then mdadm has removed the disk from the array; all you need to do is replace the drive:
Shut down and remove the failed drive
Install a new drive -> start up
Storage -> Disks -> select the drive and do a quick/short wipe; this will prepare the drive for OMV to use
In Raid Management (the Md plugin) there should be an option 'recover'; click on that and your new drive should be displayed, select it and click OK, and the drive will then be added to the array
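If you'd rather do the recover step from the cli, it amounts to something like this (a sketch; /dev/sdb here is an assumption, use whatever the new drive comes up as):
mdadm --add /dev/md0 /dev/sdb    # add the new drive, mdadm starts the rebuild itself
cat /proc/mdstat                 # watch the rebuild progress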
This sounds like you're using repurposed hard drives; if you used a short/quick wipe on each drive before creating the array there can be residual data/signatures from previous use.
The only way to resolve this is to delete the current array, secure wipe each drive, and try again; in its current state there is nothing to be done.
Footnote: it has been suggested on here that you can run a secure wipe to approx. 25%, then stop/cancel it, and sometimes this works
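If you want to see those leftover signatures from the cli first, wipefs will show them (a sketch, /dev/sdb as a stand-in for each drive in turn; the second command is destructive):
wipefs /dev/sdb          # list any residual filesystem/raid signatures
wipefs --all /dev/sdb    # erase them, double check you have the right device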
Is it better to proceed immediately with the disk replacement?
Yes, the standout there is 198 -> Offline uncorrectable
The file system is still unmounted.
?? I would have expected that to be mounted once the array was reassembled; you could try mount -a from the cli, or reboot now the array has rebuilt (reboot is what I would do)
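To confirm it mounted, something along these lines (a sketch):
mount -a             # mount everything listed in /etc/fstab
df -h | grep md0     # the array's filesystem should now show a mount point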
What would be the correct procedure for the replacement?
I knew you would ask me that, and TBH I don't know without running up a VM, as the layout of the raid page has changed slightly since I last used it. AFAIK you should be able to do all this from the GUI: from Raid Management there should be an option/button/icon to remove a drive, which will run a script that fails the drive and removes it from the array. The array will then show as clean/degraded.
You then shut down and remove that failed drive (this is the squeaky bum moment, double/triple check that it's the right drive; yes, users have removed the wrong one), insert the new drive, then from Storage -> Disks select it and do a short/quick wipe. Then in Raid Management there should be a recover option; click on that, select the new drive and click OK, and the array should rebuild.
There's a lot of 'should's in there, but that's just me being pessimistic
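For reference, the script behind that remove button does something along these lines (a sketch, assuming the failing drive is /dev/sdb):
mdadm --manage /dev/md0 --fail /dev/sdb      # mark the drive as failed
mdadm --manage /dev/md0 --remove /dev/sdb    # remove it from the array
mdadm --detail /dev/md0                      # state should now read clean, degraded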
What do you recommend doing now?
I would suggest doing a short smart test on each of the drives and checking attributes 5, 187, 188, 197 and 198; you're looking for any raw values against those attributes. If there are any I would replace the drive; likewise, if SMART in OMV's GUI shows anything about bad sectors, replace the drive
At the moment you're back up and running; as for backup, start with what you don't want to lose
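From the cli the same check looks something like this (/dev/sda as an example, repeat per drive):
smartctl -t short /dev/sda    # run a short self-test, takes a few minutes
smartctl -A /dev/sda          # print the attribute table; check the RAW_VALUE column for 5, 187, 188, 197 and 198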
Now I think I understand that we can only wait and cross our fingers
Yep, hopefully it should be OK; that's why one should have a backup
this is the output:
Excellent, solved that problem, now for sdd
mdadm --zero-superblock /dev/sdd (not sure if there will be any output from this)
mdadm --add /dev/md0 /dev/sdd then check the output with cat /proc/mdstat; the array should be rebuilding, adding sdd back to the array
BTW DO NOT REBOOT, SHUTDOWN OR PASS GO
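If you want to keep an eye on the rebuild without retyping, something like:
watch -n 5 cat /proc/mdstat    # refresh the rebuild progress every 5 seconds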
Why was the /dev/sdd disk not added?
TBH I've never really understood that error (possibly out of date); I've always assumed that some sort of error occurred but mdadm doesn't actually remove the drive from the array.
Do one thing at a time. The output shows the array as active (auto-read-only) but with only 3 drives; from the cli run:
mdadm --readwrite /dev/md0 which hopefully will correct the (auto-read-only); cat /proc/mdstat should confirm that
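You can also check the array state directly (a sketch):
mdadm --detail /dev/md0 | grep -i state    # should read clean or active, with no read-only flag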
Any help from you is appreciated.
Your array is inactive; you will need to ssh into OMV as root and from the cli run:
mdadm --stop /dev/md0 and wait for confirmation, then
mdadm --assemble --force --verbose /dev/md0 /dev/sd[abcd] you can check the output by running
cat /proc/mdstat
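Once it's assembled, something like this confirms the members and the filesystem (a sketch):
mdadm --detail /dev/md0    # all four drives should show as active sync
blkid /dev/md0             # the filesystem signature on the array should still be there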
Any guidance for connecting this existing, populated array to OMV will be greatly appreciated
OMV does not support RAID on USB devices
The fact that you can 'see' the array after installing the md plugin means that the plugin is reading the mdadm signatures on those drives
Why do you think you can mount the array again after you have mounted it from the cli? OMV does not mount drives/arrays under /mnt
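If you want to see where OMV itself puts its mounts, something like this (a sketch; the /srv/dev-disk-by-... pattern is from recent OMV versions and is my assumption for yours):
findmnt | grep /srv    # OMV-created mounts show up here, e.g. /srv/dev-disk-by-uuid-...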
When both cards are plugged in together, only the network card works, no matter which lane it's connected to
Then that would suggest the network card is 'hogging' (sorry, best way I can explain it) the single lane, and the m'board/chipset/cpu has no way of splitting/sharing that single lane between 2 PCIe devices.
My first thought with this was that the labelling on the board 'might' reference the boot/bios preference for each slot
Any idea how to troubleshoot this?
TBH I would start with the configuration of the m'board; you don't state in your post what is plugged into where. Your board has 1x PCIe x16 slot and 2x PCIe x1. The 2.5Gbps network card is PCIe x1; if this is plugged into one of the PCIe x1 slots and the sata card into the other PCIe x1 there could be a conflict, and likewise if the PCIe x16 slot is being used
If there is nothing in the manual then the ASRock support forum or a process of elimination of the hardware would be the best way forward
EDIT: Looking at the images for that board, the PCIe x1 slots are labelled PCIE1 and PCIE3; that might be relevant
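A quick way to see what the kernel actually enumerates with both cards installed (a sketch):
lspci -tv               # tree view of pci devices; both cards should appear here
dmesg | grep -i pcie    # link/lane negotiation errors show up here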
PS. Disks quick wiped before operation.
This is the issue: the 2 x 4TB drives were obviously repurposed, and a quick wipe sometimes does not remove all signatures/references on a drive, hence the p1 and p3 raid references.
The simplest and quickest option is to remove all the raid references from the 2 x 4TB drives, secure wipe each drive, and try again
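From the cli that clean-up would look roughly like this (/dev/sdb as a stand-in, repeat for the second 4TB drive; all three commands are destructive, and shred here is just my approximation of the GUI's secure wipe):
mdadm --zero-superblock /dev/sdb    # remove the old raid superblock
wipefs --all /dev/sdb               # clear remaining partition/filesystem signatures
shred -v -n 1 /dev/sdb              # one-pass overwrite of the whole drive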
That works, copy and paste the output from echo -e [Unit]
It is just making a directory
Yes, there's no override.conf file
You must have run it before I added the mkdir line.
Yes, but I have tried again; same error even with the mkdir
/etc/systemd/system/folder2ram_startup.service.d/override.conf
-bash: /etc/systemd/system/folder2ram_startup.service.d/override.conf: No such file or directory
That is the output of line 2
Yes, that works!! But the nginx.service still fails to start; this only occurs after a reboot, or a shutdown then restart
When you finally get into the GUI both the docker and file browser services are loaded; if these are delayed until after the GUI is loaded, there are no nginx.service failed errors on boot
V5/6 never had this nginx.service issue
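For anyone following along, the override being pieced together in this exchange amounts to something like the following (a sketch; the Before=nginx.service ordering is my assumption about the intended fix, not the exact content posted in the thread):
mkdir -p /etc/systemd/system/folder2ram_startup.service.d
cat > /etc/systemd/system/folder2ram_startup.service.d/override.conf <<'EOF'
[Unit]
Before=nginx.service
EOF
systemctl daemon-reload    # make systemd pick up the override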
ryecoaaron, I found this issue from the original upgrade failure. crashtest had sent me his upgrade procedure, which I followed; it all appeared to go well, including no grub-pc errors, until the last line:
E: Failed to fetch http://download.proxmox.com/debian/pve/dists/bookworm/pve-no-subscription/binary-amd64/proxmox-kernel-6.2.16-20-pve_6.2.16-20_amd64.deb Error reading from server. Remote end closed connection [IP: 51.91.38.34 80]
This is right after Restarting engine daemon, so there were no kernels in /boot
I then remembered your post here and ran omv-installproxmox 6.8; this appears to have worked, but I am yet to reboot