What I learnt regarding SSD+TRIM and (mdadm)RAID5+LVM+ext4

  • Have re-purposed 4x WDC WDS100T1B0A (1TB) SSD Drives running (mdadm)RAID5+LVM+ext4.


    This is just a collection of what I learned - so I can find it next time I go looking... and it may also be useful for someone else.


    Using re-purposed consumer gear is normally OK for home use - as long as you test, test, test and test some more.
    Left the disks running in a (mdadm)RAID5-only array for a few months on light duties - in that time one drive controller died; it was replaced under warranty.
    Ran the array for another few months to get past the infant mortality hump.


    Set up the latest OMV - configuring a full-disk (mdadm)RAID5+LVM+ext4 stack with an 800GB LV for VM images.


    However, when checking TRIM support, fstrim reports "the discard operation is not supported"... which is not surprising given what lsblk -D shows for the stack (DISC-GRAN/DISC-MAX=0 means unsupported):
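    A hedged illustration of the symptom (the mount point and device names are examples, not my actual output):

    Code
    root@mama:~# fstrim -v /srv/vms
    fstrim: /srv/vms: the discard operation is not supported
    root@mama:~# lsblk -D /dev/md0
    NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
    md0         0        0B       0B         0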


    The WD drives support Discard:
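    A hedged sketch of how per-drive support can be confirmed (adjust device names; /dev/sd[abcd] here stand for the four member disks):

    Code
    # DISC-GRAN/DISC-MAX should be non-zero on the raw disks:
    lsblk -D /dev/sd[abcd]
    # and hdparm should list "Data Set Management TRIM supported":
    hdparm -I /dev/sda | grep -i trim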

    From google-foo research:
    LVM has supported Discards since: 2011(?)
    (mdadm)RAID456 has supported Discards since: 2016(?)
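    For reference, the versions in play can be checked quickly (a sketch; the package names assume Debian/OMV):

    Code
    uname -r
    dpkg -l lvm2 mdadm | grep ^ii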


    Tried different kernels - lsblk -D displayed differing, confusing results, including sometimes showing support... but fstrim still gave errors.


    A lot of the guides unearthed by Google-foo say that lvm.conf needs to be modified... in reality it does not. issue_discards is only needed if you want Discards to be issued during lvremove/vgremove; it has no bearing on LVM transparently passing Discards down the stack.

    Under testing (with my setup) it takes ~1.5hrs to fstrim an 800GB LV and ~3.5hrs to lvremove a 1.9TB LV, so it is best to only set issue_discards=1 when you _really_ need it, then disable it again (i.e. immediately after creating the RAID5+LVM stack, for a clean/forced-empty array).

    Code
    /etc/lvm/lvm.conf
    issue_discards = 1
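    (issue_discards lives in the devices { } section of lvm.conf.) The value LVM actually sees can be double-checked with lvmconfig - a quick sketch (older LVM2 releases use "lvm dumpconfig" instead):

    Code
    lvmconfig devices/issue_discards
    # expected after the change: issue_discards=1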

    After more google-foo - came across https://current.workingdirectory.net/posts/2016/ssd-discard/ which points out devices_handle_discard_safely=Y
    I also learned that RAID5 disables this by default, as there is no way (mdadm)RAID5 can test the SSDs to confirm they handle Discard correctly, so enabling it is a manual admin action. Verification/testing is the responsibility of the admin.

    Code
    /etc/modprobe.d/raid456.conf
    options raid456 devices_handle_discard_safely=Y
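    If preferred, the same parameter can instead go on the kernel command line, so it applies however the module gets loaded - a hedged sketch for GRUB-based systems (append to the existing GRUB_CMDLINE_LINUX_DEFAULT value):

    Code
    /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="... raid456.devices_handle_discard_safely=Y"
    # then regenerate the grub config:
    update-grub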

    Any changes to /etc/lvm/lvm.conf and/or /etc/modprobe.d/raid456.conf will need the following before any new/different settings take effect:

    Code
    update-initramfs -u
    reboot

    That devices_handle_discard_safely is enabled can be verified with:

    Code
    root@mama:~# cat /sys/module/raid456/parameters/devices_handle_discard_safely
    Y
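    Once the array has been (re)assembled with the parameter in effect, the md device itself should advertise discard - a quick check (md0 assumed, as above):

    Code
    lsblk -D /dev/md0
    cat /sys/block/md0/queue/discard_max_bytes   # non-zero once discard is enabled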

    After setting up the RAID5 array, doing something like the below (with issue_discards = 1 set) will make sure the array is clean/forced empty (it may take a _long_ time to complete, so run it via screen).


    Code
    vgcreate vg_realname /dev/md0
    lvcreate -l 100%FREE -n lv_trimtest vg_realname
    # with issue_discards=1 in lvm.conf, removing the LV discards its entire extent:
    lvremove vg_realname/lv_trimtest
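    A quick functional check that fstrim now works through the whole stack could look something like this (LV name and mount point are just examples):

    Code
    lvcreate -L 10G -n lv_trimcheck vg_realname
    mkfs.ext4 /dev/vg_realname/lv_trimcheck
    mount /dev/vg_realname/lv_trimcheck /mnt
    fstrim -v /mnt        # should report bytes trimmed rather than "not supported"
    umount /mnt && lvremove -y vg_realname/lv_trimcheck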

    Then test, test, test and test some more before committing real data.

  • 4x WDC WDS100T1B0A (1TB) SSD Drives running (mdadm)RAID5+LVM+ext4

    This is really dangerous. You can't apply concepts made for spinning rust to modern flash storage.


    HDDs die for completely different reasons than modern flash storage. An HDD will usually die without warning due to physical damage. Flash storage products will suffer either from firmware bugs, from wear-out, or from physical damage as well (a power spike or something like that).


    The risk of a firmware bug, and the fact that identical SSDs wear out identically under identical access patterns, will result in a bunch of identical SSDs dying at (almost) the same time (at least when running identical firmwares). And that's something traditional/anachronistic RAID won't protect against. The same goes for RAID-1 or mirrors made out of identical SSDs --> bad idea. RAID-5 makes no sense at all, in my opinion...

  • This is really dangerous. You can't adopt concepts made for spinning rust to modern flash storage.
    HDD die for completely different reasons than modern flash storage.

    Currently you'd find that EMC, NetApp, HDS, HP, Huawei & IBM seem to disagree...


    I'd be interested in any real modeling or actual large scale statistics you may have to substantiate your opinion.


    HDDs have firmware bugs too... they always have.


    I agree that HDDs die for different reasons than SSDs - both, though, follow the same bathtub life-cycle curve, which _can_ be modeled the same way for either.


    I have plugged some numbers into a reliability network and I have a result I'm happy with.
    I'm happy to redo if you have any real numbers to challenge.
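    As a minimal sketch of the sort of arithmetic involved (hypothetical numbers, and assuming independent failures - which is exactly the assumption the firmware-bug argument above disputes): if each drive has an annual survival probability R, a 4-drive RAID5 survives the year as long as no more than one drive fails.

    Code
    # P(array survives) = R^4 + 4 * R^3 * (1 - R); e.g. with R = 0.98:
    echo "0.98^4 + 4 * 0.98^3 * 0.02" | bc -l
    # ~0.9977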


    I'll keep doing what I'm doing thanks.

  • I'll keep doing what I'm doing

    • https://www.tomshardware.com/n…g-8mb-firmware,13250.html -- this firmware bug affects SSDs losing power (power loss, UPS failure). Try to imagine what happens if you make up an array out of identical SSDs that are all affected by the same bug
    • https://forums.crucial.com/t5/…-once-an-hour/ta-p/130218 -- this firmware bug affects SSDs that have run for more than 5184 hours, which then become unresponsive until the next power cycle. Try to imagine what happens if you make up an array out of identical SSDs that are all affected by the same bug
    • Insert random firmware bug here. Try to imagine what happens if you make up an array out of identical SSDs that are all affected by the same bug

    Up to you to think about. The contractors we work with always ensure that the HDDs we put into our arrays are from different batches (they learned from two nasty occasions where negotiation problems between Infortrend/EMC RAID controllers and disks led to a bunch of HDDs being kicked out of arrays at the same time). For whatever reason they don't do the same with the caching SSDs (might be one of the areas where people only learn by making their own experiences).
