RAID test failed - please help me understand why

  • Before using OMV in production I wanted to test a rather simple RAID setup, but when simulating a hard disk failure, OMV didn't show the array as "not clean" (or whatever is shown in that case) so that I could click "Recover"; instead, it didn't show the array at all.

    Maybe I did something wrong in my test, although I am not sure what.


    Here's what I did:

    • Set up OMV 7 in Proxmox
    • Added 6 additional disks, 1 GB each
    • Created 3 RAID-1 arrays (md0, md1, md2)
    • Created 1 VG with those 3 PVs
    • Created 1 LV in the VG
    • Created 1 encrypted ext4 FS on the LV

    So, pretty simple setup.
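
    For anyone wanting to reproduce this, here is a minimal sketch of the equivalent commands (the device names vdb..vdg, the VG/LV names, and the mapper name are just placeholders; I set everything up through the OMV UI):

        # Three RAID-1 mirrors from six 1 GB disks:
        mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/vdb /dev/vdc
        mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/vdd /dev/vde
        mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/vdf /dev/vdg

        # One VG over the three PVs, one LV spanning it:
        pvcreate /dev/md0 /dev/md1 /dev/md2
        vgcreate vg0 /dev/md0 /dev/md1 /dev/md2
        lvcreate -l 100%FREE -n data vg0

        # Encrypted ext4 on top of the LV:
        cryptsetup luksFormat /dev/vg0/data
        cryptsetup open /dev/vg0/data data_crypt
        mkfs.ext4 /dev/mapper/data_crypt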

    To test the RAID, I booted the VM with a SystemRescue image and wrote 500 MB of random data onto one of the 6 disks used in the arrays.
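
    The corruption step was along these lines (sdX stands for whichever member disk I picked):

        # From the SystemRescue shell: overwrite 500 MB of one member
        # disk with random data, skipping the first 10 MB:
        dd if=/dev/urandom of=/dev/sdX bs=1M seek=10 count=500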


    Then I booted OMV again. I expected to see my 3 devices (arrays) under Storage -> Multiple Device, with one marked as bad (so that I could click "Recover"), but only 2 are shown.


    Maybe my expectations are wrong, but why? Isn't the RAID supposed to detect that one disk in the array has an issue (in this case its data was destroyed) and allow me to recover the array? The good data is still on the other disk in that array.


    I'm sure I could SSH into the box and use mdadm to rebuild the array (although I haven't tried that yet), but isn't getting this done from the UI one of the main functions of a NAS? Well, maybe I'm missing something, so please tell me what was wrong with my test or my expectations.


    Currently I'm just a bit puzzled why the UI doesn't give me the option to recover the broken array.

    • Official Post

    It seems the array was so badly damaged that it was no longer listed in /proc/mdstat. In that case OMV can't do anything, because OMV does not store anything about the array in a database; instead, it reads everything in real time.
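
    For reference, a healthy mirror shows up in /proc/mdstat roughly like this (illustrative output, hypothetical device names):

        $ cat /proc/mdstat
        Personalities : [raid1]
        md0 : active raid1 vdc[1] vdb[0]
              1046528 blocks super 1.2 [2/2] [UU]

        # A degraded-but-running mirror would show [2/1] [U_] instead,
        # and OMV can offer "Recover" for it. An array that failed to
        # assemble at all simply isn't listed, so nothing is shown.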

  •

    To test the RAID, I booted the VM with a SystemRescue image and wrote 500 MB of random data onto one of the 6 disks used in the arrays.


    Then I booted OMV again. I expected to see my 3 devices (arrays) under Storage -> Multiple Device, with one marked as bad (so that I could click "Recover"), but only 2 are shown.


    Maybe my expectations are wrong, but why? Isn't the RAID supposed to detect that one disk in the array has an issue (in this case its data was destroyed) and allow me to recover the array? The good data is still on the other disk in that array.



    If you simply remove one of the two drives in an MD RAID1, then of course the RAID should stay online in a "clean, degraded" state. The OMV "recover" option can then be used to add a replacement and return the RAID to a "clean" state.


    But exactly what failure mode were you trying to simulate? I'd guess you used dd and wrote random data from the beginning of one disk, and so have likely corrupted that disk's RAID superblock. So how does MD RAID behave in such circumstances, and what can any NAS software do to recover from this?
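
    One way to answer that from the rescue image is to examine the members directly; a sketch (sdX/sdY are hypothetical member disks):

        # If dd overwrote the superblock, this reports
        # "No md superblock detected":
        mdadm --examine /dev/sdX

        # Compare both members of the mirror:
        mdadm --examine /dev/sdX /dev/sdY | grep -E 'State|Events|Device Role'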


    If you meant to overwrite data somewhere other than the superblock and so corrupted file data, then what did you expect to happen? You may have corrupted the filesystem, but not the RAID layer, which will report no errors. MD RAID is about data availability; it is not a guarantee of data integrity.


    This is precisely why votdev promoted the use of BTRFS RAID profiles with the functionality added in the later stages of OMV6 and now OMV7.

    BTRFS replaces the need for MD RAID + LVM as the BTRFS filesystem incorporates volume management with all its in-built integrity checks and much more.


    If you want/need encryption, then BTRFS can be created on one or more dm-crypt devices.
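
    A minimal sketch of that layout, assuming four data disks and interactive passphrase entry (all names hypothetical):

        # One LUKS container per disk, opened under /dev/mapper:
        for d in sdb sdc sdd sde; do
            cryptsetup luksFormat /dev/$d
            cryptsetup open /dev/$d crypt_$d
        done

        # One BTRFS filesystem across the mappers, RAID1 profile for
        # both data and metadata:
        mkfs.btrfs -d raid1 -m raid1 /dev/mapper/crypt_sd{b,c,d,e}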

  • Thanks for the reply. To both of you.

    It seems the array was so badly damaged that it was no longer listed in /proc/mdstat. In that case OMV can't do anything, because OMV does not store anything about the array in a database; instead, it reads everything in real time.

    What does that mean exactly? That the array cannot be recovered, even though one disk has all the necessary and correct data? That would mean my entire VG is dead, and thus all data that was on the encrypted FS is gone. This sounds rather strange to me. Where is the availability that RAID provides?

    I'd guess you used dd and wrote random data from the beginning of one disk, and so have likely corrupted that disk's RAID superblock. So how does MD RAID behave in such circumstances, and what can any NAS software do to recover from this?

    Yes, I did; however, I skipped the first few MB so as not to touch the superblock. But even if the superblock had been damaged, it would also be gone if the entire disk had died. Shouldn't MD RAID just copy the superblock from the good disk and resync the data? It is a mirror, after all.
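
    For completeness, the SSH route I'd expect to work goes roughly like this (untested on my setup; sdX is the good disk, sdY the overwritten one):

        # Force-assemble the mirror from the surviving member:
        mdadm --assemble --run /dev/md0 /dev/sdX

        # Wipe the stale superblock on the damaged disk and re-add it,
        # which triggers a full resync from the good member:
        mdadm --zero-superblock /dev/sdY
        mdadm --manage /dev/md0 --add /dev/sdY

        # Watch the resync progress:
        cat /proc/mdstat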

    If you simply remove one of the two drives in an MD RAID1, then of course the RAID should stay online in a "clean, degraded" state. The OMV "recover" option can then be used to add a replacement and return the RAID to a "clean" state.

    Nope, I tried that too. Removing a disk by detaching it from the VM has the same effect: the array is missing, not shown as degraded, and thus there is no recover button to click either.

    This is precisely why votdev promoted the use of BTRFS RAID profiles with the functionality added in the later stages of OMV6 and now OMV7.

    BTRFS replaces the need for MD RAID + LVM as the BTRFS filesystem incorporates volume management with all its in-built integrity checks and much more.


    If you want/need encryption, then BTRFS can be created on one or more dm-crypt devices.

    This sounds interesting. I do have experience with ZFS, but I've never used BTRFS.


    My use case is rather simple. I currently have 4x 20TB disks. All of these disks should be used in some sort of array that can be extended. I have moved away from RAID5 because only 1 disk can die at a time, and when adding more disks in the future, the chance that more than one disk dies at the same time keeps growing.

    I only really need 1 filesystem but it certainly should be encrypted.


    So I have to read up on BTRFS and how to accomplish this.


    With ZFS it would be 2 mirrors in a pool, and in the future I could just add more mirrors to that pool. I can't go with ZFS, because using ECC RAM is a pain in the neck. Only AMD Pro CPUs support ECC RAM, and then you are also quite limited in the motherboards that support both the CPU and the RAM. If you have found a combination that works, it's probably 2 to 3 times more expensive than the non-ECC option. However, I don't need a super (Pro) CPU. It's a NAS. I need disk and network throughput. I'm not using anything but a share on my NAS. I'm not doing transcoding or other CPU-intensive stuff on my NAS. I have my virtualization cluster on separate HW. IMO there should only exist ECC RAM, and it should be supported by everything. Anyway, sorry for my rant. It's just a topic that annoys me greatly whenever I think about this ignorant market.


    I will have a look at BTRFS. If you have any suggestions how I can use it for my use-case (the setup I envision) I am more than happy to get some pointers.

  • Only AMD Pro CPUs support ECC RAM, and then you are also quite limited in the motherboards that support both the CPU and the RAM.

    If it's Intel, it's often the CPU that supports ECC RAM while the motherboard does not. The G4560 I am using can support ECC RAM when installed in a board with an Intel® C200 series chipset. As for AMD, I'm not familiar with it.

    OMV 7.x | 6.8 Proxmox Kernel

    GIGABYTE Z370M DS3H Motherboard

    Intel G4560 CPU | 16G×1 Non-ECC RAM

    128G SSD + 1T SSD + 4T×2 HDD | No RAID

    500W ATX PSU | APC BK650-CH UPS

  • tessus If you want some form of real-time RAID, then a BTRFS RAID1 profile gives you 50% space efficiency but only single-disk redundancy. BTRFS is easy to expand (or reduce, if space allows) one disk at a time while remaining online. It is very flexible. For encryption, create dm-crypt devices and then create the BTRFS filesystem on these.
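
    To illustrate that flexibility, growing and shrinking while mounted might look like this (mount point and mapper names hypothetical):

        # Add a disk to a mounted BTRFS RAID1 filesystem, then
        # rebalance so chunks are spread over the new device:
        btrfs device add /dev/mapper/crypt_sdf /mnt/data
        btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/data

        # Removing a device (if the remaining space allows) is the
        # reverse; BTRFS migrates its chunks away first:
        btrfs device remove /dev/mapper/crypt_sdb /mnt/data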


    It is not essential to use ZFS with ECC RAM. With 4x 20TB you could either create a pool of two mirror vdevs, or a single raidz2 vdev. Expand the pool either by adding another mirror or, in the case of raidz2, via the upcoming OpenZFS 2.3, which has implemented raidz expansion! ZFS has native encryption.
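
    As a sketch of the ZFS option (pool/dataset names and disks are placeholders):

        # Pool of two mirror vdevs, encrypted dataset on top:
        zpool create tank mirror sda sdb mirror sdc sdd
        zfs create -o encryption=on -o keyformat=passphrase tank/data

        # Later expansion: add a third mirror vdev:
        zpool add tank mirror sde sdf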

    If you want some form of real-time RAID, then a BTRFS RAID1 profile gives you 50% space efficiency but only single-disk redundancy

    Well, a mirror is always 50% space efficiency. But a mirror is still safer than RAID5. However, I have to use multiple mirrors, since a mirror uses only 2 disks. Now, I could use something like RAID 10, but I am not sure how expandable that is. Afaik it is not, because the striping info would have to be rewritten, but I could be wrong. RAID 10 is certainly much better for performance than just adding several mirrors to a pool or LVM setup, because in that case the disks are filled sequentially. But the latter is definitely easier to expand.

    It's always a trade-off. If I had the time I would run perf tests for all these different setups.


    I still have to read up on BTRFS but I suspect it is similar to ZFS.


    It is not essential to use ZFS with ECC RAM

    I agree to disagree. Aaron Toponce once wrote a really nice ZFS guide many, many years ago. I recall reading about a scenario that made total sense. After reading that I never even thought about using non-ECC RAM with ZFS.

    Unfortunately Aaron's web site has been offline for over a year, but I found a mirror (pardon the pun). Here's the link to that scenario: https://tadeubento.com/2024/aa…d-use-ecc-ram/#a-scenario

  • tessus I'll leave you to read about BTRFS, but a BTRFS RAID1 profile is nothing like a traditional two-device RAID1. It's the "chunk"-based space allocation profile and mechanism of BTRFS that gives it RAID-like properties across multiple devices. You can have two or more drives in a BTRFS RAID1 profile (mirrored "chunks") and four or more drives in a BTRFS RAID10 profile (striped, mirrored "chunks").
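
    A quick sketch of the difference (device names and mount point hypothetical):

        # RAID1 profile works with any number of devices >= 2, even an
        # odd count, because BTRFS mirrors chunks, not whole disks:
        mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc /dev/sdd

        # RAID10 profile needs four or more devices:
        mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

        # Once mounted, this shows how chunks are allocated:
        btrfs filesystem usage /mnt/data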



    BTRFS and ZFS are alike in that they are both COW filesystems with built-in data integrity checks and support for snapshots and send/receive between filesystems, but the similarities end there. You can think of BTRFS as a filesystem first and a volume manager second, while ZFS is a volume manager first and a filesystem second.
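
    For example, snapshots and replication look broadly similar on both; a BTRFS sketch (paths hypothetical):

        # Read-only snapshot of a subvolume:
        btrfs subvolume snapshot -r /mnt/data /mnt/data/.snap/today

        # Full send to another BTRFS filesystem, then incremental
        # sends against the previous snapshot:
        btrfs send /mnt/data/.snap/today | btrfs receive /mnt/backup
        btrfs send -p /mnt/data/.snap/yesterday /mnt/data/.snap/today | btrfs receive /mnt/backup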


    As to ECC, here's a discussion from 2 days ago: ECC vs Non-ECC RAM for TrueNAS | TrueNAS Tech Talk (T3) E007
