This is not RAID specific, but the two usually go together.
I came across the articles below about RAID, Linux timeouts, and setting drive timeouts with SMART:
Udev rules and helper scripts for setting safe disk scterc and drive controller timeout values for mdraid arrays
https://github.com/jonathanunderwood/mdraid-safe-timeouts
Linux Software RAID and drive timeouts
http://strugglers.net/~andy/bl…-raid-and-drive-timeouts/
And this one, which looks older:
Many (long) HDD default timeouts cause data loss or corruption (silent controller resets)
https://www.smartmontools.org/ticket/658
I am especially concerned that this will fail just when you are trying to recover the system from an error, for example when resynchronizing after replacing a disk in a RAID array.
Is there anything similar in OMV?
Is there a way to do this or something similar?
I have to admit that I don't fully understand how to do what they propose. It also seems to me that the timeouts have to be adapted for drives outside RAID1-6 arrays as well, so that the kernel timeout is always higher than the SMART timeout.
Any experience with this, or any suggestion on how to do it?
The general idea, as I understand it:
- The kernel has a timeout of 30 seconds for each disk individually.
- The SMART timeouts (SCTERC) must be set again on every restart.
- On disks where all the data is redundant, set a low SMART timeout, since the data can be rebuilt from other sources. I'm not sure what value to use.
- If part of the data is not redundant, set a high SMART timeout, since the data cannot be rebuilt from other sources. I'm not sure what value to use.
- If I don't know what value to set in SMART, or setting it gives an error, change the kernel timeout to 180 seconds for that disk.
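To make the list above concrete, here is a minimal Python sketch of the decision logic. It only plans the settings and prints them; it does not change anything on the system. The names `plan_timeouts` and `TimeoutPlan` are hypothetical, and the numeric values are my assumptions: 70 (7.0 seconds, since `smartctl -l scterc` takes tenths of a second) for redundant disks, 250 as a placeholder for non-redundant disks, and a 5 second margin so that the kernel timeout stays above the SMART timeout. Only the 30 and 180 second kernel values come from the list.

```python
from dataclasses import dataclass
from typing import List, Optional

KERNEL_DEFAULT_S = 30    # per-disk kernel timeout from the list above
KERNEL_FALLBACK_S = 180  # kernel timeout from the list when SCT ERC cannot be set
MARGIN_S = 5             # ASSUMPTION: gap kept between SMART and kernel timeout


@dataclass(frozen=True)
class TimeoutPlan:
    device: str
    smartctl_cmd: Optional[List[str]]  # None when SCT ERC cannot be set
    kernel_timeout_s: int
    sysfs_path: str  # file that holds the kernel timeout for this disk


def plan_timeouts(device: str, redundant: bool, erc_supported: bool,
                  erc_redundant_ds: int = 70,       # ASSUMPTION: 7.0 s
                  erc_non_redundant_ds: int = 250   # ASSUMPTION: placeholder
                  ) -> TimeoutPlan:
    """Decide the SMART and kernel timeouts for one disk."""
    name = device.rsplit("/", 1)[-1]
    sysfs_path = f"/sys/block/{name}/device/timeout"

    if not erc_supported:
        # The drive may retry for a long time, so the kernel has to wait longer.
        return TimeoutPlan(device, None, KERNEL_FALLBACK_S, sysfs_path)

    erc_ds = erc_redundant_ds if redundant else erc_non_redundant_ds
    erc_s = -(-erc_ds // 10)  # tenths of a second, rounded up to seconds
    # The kernel timeout must stay higher than the SMART timeout.
    kernel_s = max(KERNEL_DEFAULT_S, erc_s + MARGIN_S)
    cmd = ["smartctl", "-l", f"scterc,{erc_ds},{erc_ds}", device]
    return TimeoutPlan(device, cmd, kernel_s, sysfs_path)


if __name__ == "__main__":
    # Hypothetical disks; replace with the real device list.
    for dev, red, erc in [("/dev/sda", True, True), ("/dev/sdb", False, False)]:
        plan = plan_timeouts(dev, redundant=red, erc_supported=erc)
        if plan.smartctl_cmd:
            print(" ".join(plan.smartctl_cmd))
        print(f"echo {plan.kernel_timeout_s} > {plan.sysfs_path}")
```

Because the SCTERC setting has to be set again on every restart, the resulting commands would need to run at boot, for example from a udev rule as the mdraid-safe-timeouts project linked above does.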
It would also be very interesting to use a higher value when an array is in degraded mode, because its data is no longer redundant.
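One way to notice the degraded state would be to read `/proc/mdstat`, where each redundant array has a status such as `[2/2] [UU]` and a missing member shows up as `_`. The following is only a sketch of that check: the function name `degraded_arrays` is hypothetical, and the sample text is written by hand for illustration, not captured from a real system.

```python
import re
from typing import Dict

_ARRAY_RE = re.compile(r"^(md\S+)\s*:")                     # e.g. "md0 : active raid1 ..."
_STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s*\[([U_]+)\]")   # e.g. "[3/2] [UU_]"


def degraded_arrays(mdstat_text: str) -> Dict[str, bool]:
    """Map each md array name to True if it is running degraded."""
    result: Dict[str, bool] = {}
    current = None
    for line in mdstat_text.splitlines():
        array = _ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            # Arrays without a status line (e.g. raid0) stay False here,
            # although they have no redundancy to lose in the first place.
            result[current] = False
            continue
        status = _STATUS_RE.search(line)
        if status and current is not None:
            expected, active = int(status.group(1)), int(status.group(2))
            result[current] = active < expected or "_" in status.group(3)
    return result


# Hand-written illustrative sample, not real output.
SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdc1[0] sdd1[1]
      2095104 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a real system the text would come from reading `/proc/mdstat`, and the member disks of any array reported as degraded could then be given the higher timeout.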
Thanks in advance.