A way to know which disk to replace in a bay?

  • Hello everyone :)


    I am building an OMV NAS with a RAID 1 on 2 SATA disks mounted in internal racks so they can be inserted / ejected via hotplug / hotswap (like this: https://i2.cdscdn.com/pdt2/6/5…e-interne-pour-disque.jpg).


    I would like to find a solution so that, in case of a failure, we know which disk to replace. It's not so much for me as for the non-technical people here.


    The trouble is that the web interface's failure notification indicates that /dev/sdA is still operational (so by deduction /dev/sdB is down) but does not say whether that concerns rack 1 or rack 2.


    I have browsed a lot of posts on both OMV and mdadm without success (udev symlinks, configuring /etc/mdadm/mdadm.conf, etc.). Every time, the only device name reported is of the form /dev/sd?.


    Is there a solution that works?


    Regards,
    lnj

  • Is there a solution that works?


    Using /dev/disk/by-id for example: https://www.hellion.org.uk/blo…hotswap-failed-raid-disk/


    Those device nodes should contain the drive's serial (e.g. in /dev/disk/by-id/ata-ST3000DM001-1CH166_W1F2QSV6 it's W1F2QSV6) so you can label your drive bays accordingly.
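
    A quick way to see that mapping on any Linux box (plain coreutils, nothing OMV-specific; the serial below is just the example from above):

    Bash
    ls -l /dev/disk/by-id/ata-*
    # e.g.: ata-ST3000DM001-1CH166_W1F2QSV6 -> ../../sdb
    # the serial is the last part of the link name, the target is the current sdX name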


    BTW: mdraid-1 is IMO a really horrible idea :)

  • Hi tkaiser, and thank you for your quick response :)


    Nope :( ! Even if I create the RAID with explicit /dev/disk/by-id paths, these are NOT preserved by the OMV web interface and its notifications. I'm starting to think there is no solution for my needs, and I'm considering changing the display code of the storage page to show something other than the arbitrary device name /dev/sd?, which may change between reboots. Another suggestion, or did I miss something in your solution?

    Mmmm... so I tested it and discovered that my hot-unplug use case (in case of disaster) works too: one disk contains all the data by itself.
    I tried:

    Bash
    mdadm --create --verbose /dev/md0 --level=10 --raid-devices=2 /dev/disk/by-id/ata-TOSHIBA_MQ04ABF100_Y7R8PG66T /dev/disk/by-id/ata-ST1000DM003-9YN162_S1D43B7E --layout=f2 --size=2G

    => this works like a charm (even if I hot-unplug a disk and read it from another machine), so thank you
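
    As a quick sanity check, the layout and state of the new array can also be confirmed with the standard mdadm status commands:

    Bash
    cat /proc/mdstat              # overall state of all md arrays
    mdadm --detail /dev/md0       # layout (here far=2), member devices and their state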
    BUT I do not understand the subtleties of why RAID 1 is a bad idea compared to RAID 10 f2. If I understand correctly, the same data is written: no parity data, no extra data, just the same data duplicated but organized differently. Can you explain a bit more?

  • Even if I create the RAID with explicit /dev/disk/by-id paths, these are NOT preserved by the OMV web interface and its notifications.

    OMV is not involved here, it's the mdraid subsystem. And even if mdraid reports those stupid device identifiers, by looking into /dev/disk/by-id you can work out yourself which device is which (for example by reading through hellion.org.uk/blog/posts/hotswap-failed-raid-disk/ again and using the readlink method).
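
    A minimal sketch of that readlink approach (assuming the failure mail reported /dev/sdb, which is only an example here):

    Bash
    # print the by-id name(s) currently pointing at /dev/sdb
    for l in /dev/disk/by-id/ata-*; do
        [ "$(readlink -f "$l")" = /dev/sdb ] && echo "$l"
    done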


    Of course you could use less stupid ways than mdraid's RAID1 mode to waste one disk for (almost) nothing. With ZFS and a zmirror, for example, in case of a failure you get an email like this:


    So you know the drive with serial YBE1NR2M is in trouble.
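
    (For context: if the pool is created with the /dev/disk/by-id names, zpool status lists the vdevs under those names, so the serial stays visible in every report. A minimal sketch, reusing the example device names from above and the placeholder pool name "tank":)

    Bash
    zpool create -o ashift=12 tank mirror \
        /dev/disk/by-id/ata-TOSHIBA_MQ04ABF100_Y7R8PG66T \
        /dev/disk/by-id/ata-ST1000DM003-9YN162_S1D43B7E
    zpool status tank    # vdev names carry model and serial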



    I do not understand the subtleties of why RAID 1 is a bad idea compared to RAID 10 f2

    That's not the point I made with "mdraid-1 is IMO a really horrible idea". It's about how useless mdraid's RAID1 is compared to modern approaches like a zmirror or btrfs' RAID1.

  • BTW: I use the following or similar scripts at customer sites (called from /etc/rc.local) to get a clean list of slot positions.
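
    (The script itself isn't shown here; a much simplified sketch of the general idea, mapping physical SATA ports to current device names and serials via /dev/disk/by-path and lsblk, would be:)

    Bash
    #!/bin/bash
    # Simplified sketch only -- not the customer script mentioned above.
    # Print physical port (by-path), kernel name and serial for each SATA disk.
    for p in /dev/disk/by-path/*-ata-*; do
        case "$p" in *-part[0-9]*) continue ;; esac    # skip partition links
        d=$(readlink -f "$p")
        echo "$p -> $d ($(lsblk -dno SERIAL "$d"))"
    done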


    But I have no idea whether it works with OMV4 out of the box, since the stuff is kernel dependent and made for Ubuntu. So in case you want to give it a try with OMV, be sure to enable/use the Proxmox kernel (and if you're using the Proxmox kernel anyway, think about whether you'd be better off with a great zmirror instead of a lame mdraid1).


  • @lenainjaune: as @ness1602 suggested, trying to blink slots (not HDDs) is a nice idea. To check whether it works at all you would install the ledmon package and then look for supported controllers (HBAs, backplane/enclosure combos) with ledctl -L. If nothing shows up it can't be used; if something shows up, good luck.


    I know colleagues who have lost arrays due to bad slot management (simple stuff like the slots being labelled 1-8 on the outside while the software internally used 0-7, or wrong jumper settings on the backplane inside the enclosure -- see here for a (German) overview with SAS backplanes). That's why I prefer referencing drives by serial number and not by slot number.

  • I put a DYMO label with the serial number on a visible area of the dock, so I know the SN of the disk and can compare it with the SN in the webGUI.

  • Hi all :)


    Following your proposals, we have tested ZFS on the OMV test PC and configured a mirrored pool with a daily scheduled task that checks whether the pool is healthy and, if it is not, notifies us by email which slot holds the disk in error (the ZFS plugin lets us configure disks by path). Furthermore, we tested reading the ZFS mirrored data from a Debian machine, with success!
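
    For reference, a minimal sketch of such a daily check (the pool name "tank" and the mail address are placeholders; mail delivery is assumed to be configured through OMV's notification settings):

    Bash
    #!/bin/bash
    # Daily pool health check (sketch). "zpool status -x" prints
    # "all pools are healthy" when nothing is wrong.
    if [ "$(zpool status -x)" != "all pools are healthy" ]; then
        zpool status -v tank | mail -s "NAS: ZFS pool degraded" admin@example.org
    fi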



    Sample of the zpool status notification:


    What was not mentioned is that we are a small French non-profit association ("loi 1901") with limited resources, that the IT department (if we can call it that :) ) consists of 1.5 people with some computer knowledge, and that the original request was to replace, at low cost, a capricious and unstable Synology DS209J NAS over which we have little control.


    We do not have a real DRP (disaster recovery plan) because our needs are minimal. In addition, in our setup the NAS data is backed up daily to another server.


    In fact, in our case what matters is:
    - that we can eject a disk in the event of a disaster, and that it contains all the important data and allows us to rebuild the NAS from it alone
    - that we are alerted if there is a disk error (RAID and/or SMART)


    For us, in the notification (mail or OMV GUI) it is not the serial number of the HDD that matters. What matters is rather knowing that the disk inserted in slot 2 (identified on the bay, or with a DYMO label :) ) is in error and therefore must be replaced ASAP to preserve the redundancy of the mirror.


    The idea of driving the rack LEDs is interesting, but the problem is that we cannot really plan and choose the hardware in such detail. It's a safe bet that the storage controller of the final PC will not be manageable by ledmon (on the OMV test PC the command ledctl -L returns "ledctl: invalid option - 'L'" for ledmon v0.79, and this -L option does not seem to be documented anywhere on the web).


    If we understood correctly...


    Using a ZFS mirror ensures data integrity because a checksum is kept for each set of data; if the current checksum of that set no longer matches the stored one, then either the data or the checksum is wrong and the whole is considered corrupted. In that case, the mirror makes it possible to recover the data/checksum pair from the other disk, provided they match. We can then replace corrupted data with healthy ones. This means that part of the disks is reserved for checksums, so we lose a little usable space compared to an mdadm RAID 1.


    Is that right? Could you please confirm whether we understood correctly?


    PS: we could not read the content of the link "(German) overview with SAS backplanes" since we do not speak German (we are French).

  • We can then replace corrupted data with healthy ones

    It's not exactly you who is replacing corrupted data but ZFS itself when running a scrub. Everything else is (almost) correct. You don't lose storage space with a zmirror since the checksums are stored rather efficiently, and you can (and should) use compression=lz4 for the pool (lz4 is fast and only applies compression if the data blocks seem to be compressible).
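
    For reference, the corresponding commands (the pool name "tank" is just an example):

    Bash
    zfs set compression=lz4 tank    # cheap; blocks are only stored compressed when that helps
    zpool scrub tank                # read everything, verify checksums, repair from the mirror copy
    zpool status tank               # shows scrub progress plus any repaired or unrecoverable errors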


    Sorry about the German link; please run it through http://translate.google.fr (though it probably doesn't matter anyway, since I doubt you are running SAS backplanes/enclosures).


    BTW: I would not check the zmirror daily but weekly or even monthly. Concerns exist that frequently reading all data from mechanical HDDs can lead to faster data degradation, so in the end the events you try to spot and correct (silent bit rot) happen sooner than necessary (never verified this myself -- we simply scrub zpools every 1st Saturday of each month).
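
    (Since cron has no direct syntax for "first Saturday of the month", a common sketch for such a schedule looks like this; "tank" is again just an example pool name:)

    Bash
    # /etc/cron.d/zfs-scrub -- 03:00 on the first Saturday of each month.
    # cron ORs a restricted day-of-month with day-of-week, hence the date test
    # (the '%' has to be escaped inside crontab lines).
    0 3 1-7 * * root [ "$(date +\%u)" = 6 ] && /sbin/zpool scrub tank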

  • It's not exactly you who is replacing corrupted data but ZFS itself when running a scrub

    Oops! A translation error! In fact I had already understood that :)


    If possible, two last questions:
    - We have read that deduplication is not a good idea, both for performance and because of excessive memory requirements. At present the NAS hosts about 2 million files of varying sizes. Your opinion?
    - In my todo list I noted that the ashift parameter can improve performance. I do not know how to determine the ideal value. What should I do?

  • We have read that deduplication is not a good idea, both for performance and because of excessive memory requirements. At present the NAS hosts about 2 million files of varying sizes. Your opinion?

    You need ~320 bytes of RAM per block in a pool with dedup on. The number of blocks depends on the recordsize you've set (a smaller recordsize results in a better dedup ratio but requires more RAM): https://superuser.com/question…eduped-compressed/1169159 -- if you want to try deduplication you need to figure out whether it's worth the effort and resources. After some tests we decided to drop dedup on almost all filers because the RAM requirements got too high.
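
    To make that concrete, a rough back-of-the-envelope estimate (assuming the default 128 KiB recordsize and, purely as an example, 2 TiB of data in the pool):

    Bash
    # blocks = 2 TiB / 128 KiB = 16,777,216 ; DDT RAM ~= blocks * 320 B ~= 5 GiB
    echo $(( 2 * 1024 * 1024 * 1024 / 128 * 320 / 1024 / 1024 )) MiB    # -> 5120 MiB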


    In my todo list I noted that the ashift parameter can improve performance. I do not know how to determine the ideal value. What should I do?

    Staying with the default: 12.
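
    (ashift is fixed when a vdev is created, e.g. zpool create -o ashift=12 ..., so it's worth getting right from the start; to check what an existing pool uses:)

    Bash
    zdb | grep ashift    # dumps the cached pool config(s); ashift=12 means 2^12 = 4096 byte sectors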
