Hi all
Following your proposals, we have tested ZFS on the OMV test PC and configured a mirrored RAID with a daily scheduled task that checks whether the RAID is working normally and, if not, notifies us by email which slot holds the failing disk (the ZFS plugin lets us configure disks by path). Furthermore, we successfully tested access to the mirrored ZFS RAID data from a Debian distro!
Sample zpool status notification:
# pool: data
# state: DEGRADED
# status: One or more devices could not be used because the label is missing or
# invalid. Sufficient replicas exist for the pool to continue
# functioning in a degraded state.
# action: Replace the device using 'zpool replace'.
# see: http://zfsonlinux.org/msg/ZFS-8000-4J
# scan: none requested
# config:
#
#   NAME                        STATE     READ WRITE CKSUM
#   data                        DEGRADED     0     0     0
#     mirror-0                  DEGRADED     0     0     0
#       pci-0000:00:1f.2-ata-2  ONLINE       0     0     0
#       pci-0000:00:1f.2-ata-3  UNAVAIL      0   104     0  corrupted data
#
# errors: No known data errors
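For reference, our daily check boils down to something like the sketch below (the pool name "data" and the mail address are placeholders; `zpool list -H -o health` prints a single health word such as ONLINE or DEGRADED):

```shell
#!/bin/sh
# Hypothetical daily cron job; pool name "data" and the address are placeholders.
command -v zpool >/dev/null 2>&1 || exit 0      # nothing to do without ZFS

health=$(zpool list -H -o health data 2>/dev/null) || exit 0

if [ "$health" != "ONLINE" ]; then
    # Full status goes in the mail body; with by-path vdev names the
    # faulted line identifies the physical SATA port (hence the slot).
    zpool status data | mail -s "ZFS pool data: $health" admin@example.org
fi
```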
What we did not mention is that we are a small French "loi 1901" association with limited resources, that our IT department (if we can call it that) consists of 1.5 people with some computer-science knowledge, and that the original request was to replace, at lower cost, a capricious and unstable Synology DS209J NAS over which we have little control.
We do not have a real disaster recovery plan because our needs are minimal. In addition, in our installation, the NAS data is backed up daily to another server.
In fact, in our case what matters is:
- that in the event of a disaster we can eject a disk that contains all the important data and lets us rebuild the NAS from it alone
- that we are alerted if there is a disk error (RAID and/or SMART)
For us, what matters in the notification (mail or the OMV GUI) is not the serial number of the HDD. What matters is knowing that the disk inserted in slot 2 (identified on the chassis, or with a Dymo label) is in error and must therefore be replaced ASAP to preserve the redundancy of the mirror.
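Since the pool members are by-path names, the faulted name in the zpool status output already encodes the controller and ATA port; matching it to the current kernel device is one listing away (a sketch; the exact names shown depend on the machine):

```shell
# On the live system: map stable by-path names (controller + ATA port) to the
# kernel device they currently point at, e.g. "...-ata-2 -> ../../sdb".
# Partition links ("-partN") are filtered out to keep one line per disk.
ls -l /dev/disk/by-path/ 2>/dev/null | grep -v -- -part || true
```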
The idea of driving the rack LEDs is interesting, but the problem is that we cannot really plan and choose the hardware that precisely. It is a safe bet that the storage controller of the final PC will not be manageable by ledmon (on the OMV test PC the command ledctl -L returns "ledctl: invalid option -- 'L'" with ledmon v0.79, and this -L option does not seem to be documented anywhere on the web).
If we have understood correctly...
Using ZFS mirror RAID technology ensures data integrity because a checksum (fingerprint) is retained for each set of data; if the current checksum of that set no longer matches the retained one, then either the data or the checksum is wrong and the whole is considered corrupted. In that case the mirror makes it possible to read the data/checksum pair from the other disk and, if they match, replace the corrupted data with the healthy copy. This also means that some disk space is reserved for the checksums, so we lose a little usable space compared to an mdadm RAID 1.
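The "fingerprint" comparison described above can be illustrated with an ordinary hash (this is only an illustration of the principle, not ZFS's actual on-disk mechanism; sha256sum stands in for ZFS's block checksums):

```shell
#!/bin/sh
# Illustration only: retain a fingerprint, then detect corruption by re-hashing.
dir=$(mktemp -d); cd "$dir"

printf 'important data' > block.bin
sha256sum block.bin > block.sum               # the retained fingerprint

sha256sum -c block.sum                        # prints "block.bin: OK"

printf 'corrupted data' > block.bin           # simulate silent corruption
sha256sum -c block.sum || echo "mismatch -> restore from the other mirror copy"
```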
Is that right? Could you please confirm that we have understood correctly?
PS: we could not read the content of the link '(German) overview with SAS backplanes' since we do not speak German (we are French).