How to remove HD from RAID5 for check sectors

    • OMV 4.x

      Hi all, I have a configuration with 4x4TB WD disks and one of them presents some SMART errors. I opened an RMA and, before sending a replacement, WD asked me to do a low-level format and verify whether the error persists (you can see the SMART data attached).

      As a first step I powered off my OMV box, removed the SATA cable and started OMV again with 3 disks. I expected to see my RAID configuration working in a degraded state, but in RAID Management I cannot see my RAID configuration at all, and obviously my SMB share does not work. After reverting the process, everything is good and clean again.

      How can I do this check? I would like to do it while my RAID configuration continues to work in a degraded state, and when I replace the disk (with a new one, or mine fixed by the low-level format) restore the 4-disk RAID configuration.
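      For reference, the kind of check WD asks for can be run from any Linux box once the disk is out of the array. A minimal sketch, assuming the pulled disk shows up as /dev/sdX (a placeholder, double-check the device name, because the badblocks pass is destructive); the commands are built and printed here rather than executed:

```shell
# Placeholder device name -- replace with the disk under test, and
# triple-check it: the badblocks write pass wipes the whole disk.
DISK=/dev/sdX

# Non-destructive SMART extended self-test, run by the drive firmware:
SMART_TEST="smartctl -t long $DISK"

# Destructive write+verify of every sector -- effectively the
# 'low-level format and verify' that WD asked for:
SURFACE_SCAN="badblocks -wsv $DISK"

# Re-read the SMART attributes afterwards to see whether the
# reallocated/pending sector counts changed:
SMART_REPORT="smartctl -A $DISK"

echo "$SMART_TEST"
echo "$SURFACE_SCAN"
echo "$SMART_REPORT"
```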

      Thank you
      Images
      • Cattura.JPG

    • vatastala wrote:

      As a first step I powered off my OMV box, removed the SATA cable and started OMV again with 3 disks. I expected to see my RAID configuration working in a degraded state, but in RAID Management I cannot see my RAID configuration at all, and obviously my SMB share does not work. After reverting the process, everything is good and clean again.
      This is standard behaviour for mdadm (software RAID). Put simply, you have to 'tell' mdadm what to do, unlike a hardware RAID, which has a controller to look after it.

      To remove the drive from the GUI: Raid Management -> select the array -> on the menu select Remove -> in the dialogue box select the drive to remove, then click OK. The drive can now be removed from the system and the RAID will appear as clean/degraded.

      Effectively what you did makes sense, but it leaves the array inactive after a restart.
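      For anyone who prefers the command line, the GUI steps above map onto two mdadm calls. A hedged sketch -- /dev/md0 and /dev/sdd are assumptions, check yours with `cat /proc/mdstat`; the commands are printed here rather than executed (run them with root privileges):

```shell
# Assumed names -- verify with: cat /proc/mdstat
ARRAY=/dev/md0
DISK=/dev/sdd

# mdadm will not release a healthy member, so mark it failed first,
# then remove it; the array then shows clean/degraded.
FAIL_CMD="mdadm $ARRAY --fail $DISK"
REMOVE_CMD="mdadm $ARRAY --remove $DISK"

echo "$FAIL_CMD"
echo "$REMOVE_CMD"
```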
      Raid is not a backup! Would you go skydiving without a parachute?
    • geaves wrote:

      To remove the drive from the GUI: Raid Management -> select the array -> on the menu select Remove -> in the dialogue box select the drive to remove, then click OK. The drive can now be removed from the system and the RAID will appear as clean/degraded.

      Effectively what you did makes sense, but it leaves the array inactive after a restart.
      In fact, what happened scares me... if a drive suddenly breaks one day, will I never see my RAID again? How would I recover it? I'm also scared to do the operation you describe; I have a lot of data on the disks and I don't have any more space for a backup...
    • vatastala wrote:

      In fact, what happened scares me... if a drive suddenly breaks one day, will I never see my RAID again? How would I recover it?
      OK. RAID5 only allows for one drive failure, and you are about to return one; to do that you need to remove it from the array and from your server. This leaves the current array in a clean/degraded state: one more drive failure and your data is gone!! Believe me, this can happen; it has happened to me, though only in a work environment.

      You have five choices:
      1. Switch to RAID6 if you insist on using a RAID option; this allows two drives to fail, but reduces the space available.
      2. Initiate a backup, either to USB or to another internal drive, of the data you don't want to lose.
      3. Use something like MergerFS and SnapRAID; SnapRAID uses a dedicated drive for parity. Search the forum, there is plenty of information.
      4. Use ZFS; it is more fault tolerant, but having used it previously it's something I don't want to revisit.
      5. You have four drives: you could use two for data and two running rsync. rsync can run after hours as a scheduled job. This is a sort of RAID1, and I use 'sort of' loosely. What rsync will do is sync the data drive to the second, giving you your data in two places, the working drive and the backup drive, so if one fails you get another drive and sync the data back.
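      Option 5 can be sketched as a nightly scheduled job. The mount points below are assumptions (OMV mounts drives under /srv/dev-disk-by-label-*); the crontab line is built and printed, not installed:

```shell
# Hypothetical mount points -- adjust to your data and backup drives.
SRC=/srv/dev-disk-by-label-data/
DST=/srv/dev-disk-by-label-backup/

# -a preserves permissions and timestamps, --delete keeps DST an
# exact mirror of SRC.
RSYNC_CMD="rsync -a --delete $SRC $DST"

# A crontab line that runs the sync at 03:00, after hours:
CRON_LINE="0 3 * * * $RSYNC_CMD"

echo "$CRON_LINE"
```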

      Most home users implement a RAID because they believe it's 'the thing to do', as most hardware NAS solutions do this. But there is a big difference: those implement a RAID controller in software, and it's specific to that NAS.

      Software RAID (mdadm) does not do that; it is purely software, and a user needs to understand not just how to set it up but how to recover from any given problem, most of which has to be done from the command line. This, again, is something some users are not comfortable with.

      My sig tells a story in itself. Most users assume their data is safe on a RAID. There are users on here running large RAID arrays, but they back them up, because their experience tells them the RAID could fail. I for one moved from RAID to MergerFS and SnapRAID, but I also have a drive that runs rsync to back up the stuff I don't want to lose, something that some users forget to do.

      If you read some users' experiences on here, someone like @Adoby uses HC2s, each with a large drive, but one backs up another. It's a different approach: just because you want a NAS does not mean you need a box full of hard drives running a RAID setup.

      Sorry, rant over :D
      Raid is not a backup! Would you go skydiving without a parachute?
    • geaves wrote:

      OK. RAID5 only allows for one drive failure ... just because you want a NAS does not mean you need a box full of hard drives running a RAID setup.
      I understand everything and totally agree, but for the moment I can take the "risk" because my disks are really new. For sure another one could die, but I will not use the remaining 3 disks much until my new one comes back; I'll keep them powered off, so for the moment it's OK...

      So, regarding my question, do you think I can go ahead with the operation you suggested?
      By using RAID you combine the capacity of several drives and add a number of redundant drives. This is nice and very easy to do. The redundancy may allow you to continue to access the files on the RAID if no more than the number of redundant drives fail. But if more than that number of drives fail, you lose EVERYTHING! Unless you have good backups.

      And typically, when a drive in a RAID fails, you replace it and rebuild the array. Or you desperately start to back up the files, to save them. This means a lot of extra work for the remaining drives. And if one or more of the remaining drives are old and also close to failing, this is when they are most likely to fail. And you lose EVERYTHING. This problem is why RAID5 with modern big drives is a bad idea: it is likely that more drives fail while you try to rebuild or back up the broken RAID. RAID6 might help a bit.

      So if you use RAID it is MORE important to have good backups, not less. The direct opposite of what some seem to believe.

      This is why I don't use RAID. Instead I make sure I have good backups, in several generations and some even at different locations.

      Typically you don't need backups for everything. Just for the files you really don't want to lose.
      OMV 4: 7 x Odroid HC2 + 1 x HC1 + 2 x RPi4
    • geaves wrote:

      vatastala wrote:

      So, regarding my question, do you think I can go ahead with the operation you suggested?
      To remove the drive using the GUI, yes; the GUI allows you to remove and add a drive, even grow the array.
      OK, so I'm going to remove it, verify that everything is OK and my data is accessible over Samba even with the pool in a degraded state, and switch off the box until I have a new disk. At that point, I'll add it to the pool and rebuild the array.

      Thank you very much
    • vatastala wrote:

      OK, so I'm going to remove it ... At that point, I'll add it to the pool and rebuild the array.
      Before doing that, I tried removing another drive and the behaviour is the same: everything disappears in the GUI... I really don't understand why; I expect to see my pool degraded. If one drive fails in the future, will I never see my degraded pool? Will I lose my data? Very strange...
    • vatastala wrote:

      Before doing that, I tried removing another drive and the behaviour is the same: everything disappears in the GUI... I really don't understand why;
      Re-read post 2: mdadm (software RAID) does not behave like this. You cannot simply 'pull' a drive; basically it has no intelligence, it requires input, unlike a hardware RAID that has an 'intelligent' controller.
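      That 'input' typically means reassembling the array by hand. A sketch of the usual recovery when an array comes back inactive after a pulled drive -- /dev/md0 is an assumption; the commands are printed, not executed:

```shell
# Assumed array name -- confirm with: cat /proc/mdstat
ARRAY=/dev/md0

# Stop the inactive array, then reassemble it and force it to start
# even though a member is missing (--run allows a degraded start):
STOP_CMD="mdadm --stop $ARRAY"
ASSEMBLE_CMD="mdadm --assemble --run $ARRAY"

echo "$STOP_CMD"
echo "$ASSEMBLE_CMD"
```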
      Raid is not a backup! Would you go skydiving without a parachute?
    • geaves wrote:

      vatastala wrote:

      Before doing that, I tried removing another drive and the behaviour is the same: everything disappears in the GUI... I really don't understand why;
      Re-read post 2: mdadm (software RAID) does not behave like this. You cannot simply 'pull' a drive; basically it has no intelligence, it requires input, unlike a hardware RAID that has an 'intelligent' controller.
      OK, so you mean that one day, when a drive fails, I will input something and I'll see my pool degraded?
    • geaves wrote:

      vatastala wrote:

      OK, so you mean that one day, when a drive fails, I will input something and I'll see my pool degraded?
      If a drive fails on its own, the array will show clean/degraded. If you 'pull' a drive and reboot, the array will not show in the GUI, because it comes back up as inactive.
      OK, perfect, it worked. On boot I had to press Ctrl+D for the default configuration, and I can see the array degraded.

      Thank you very much for the help and clarifications :)
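      When the replacement disk arrives, the rebuild is the reverse operation: once the new member is added, mdadm resyncs parity onto it automatically. A sketch assuming the array is /dev/md0 and the new disk is /dev/sde (commands printed, not executed):

```shell
# Assumed names -- check the array with: cat /proc/mdstat
ARRAY=/dev/md0
NEWDISK=/dev/sde

# Adding the member starts the rebuild automatically; progress is
# visible in /proc/mdstat.
ADD_CMD="mdadm $ARRAY --add $NEWDISK"

echo "$ADD_CMD"
echo "cat /proc/mdstat   # watch rebuild progress"
```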