Raid 10 fails every time (2 devices fail)

    • OMV 4.x


    • Raid 10 fails every time (2 devices fail)

      Hello,
      I've been trying to set up this OMV NAS for about a week now. I gave up on RAID 10 the first time this happened, but then got paranoid about redundancy again, so now I'm trying once more and need help.

      The issue:
      When setting up a RAID for my 4x 2 TB hard drives (well, one is actually a 3 TB) I wanted to go with RAID 10. After creating the array, it never seems to finish resyncing, and when I create a file system the first two drives fail and the array becomes (clean, degraded). This only seems to happen with RAID 10; with a linear array over the last few days I've been able to read from and write to the server with no problems. After creating the RAID 10 it displays (PENDING). When I switch monitors to check the server terminal, I see that two drives have supposedly failed, sda and sdc, and it attempts to continue on two devices. Restarting the server gives the same error message, and nothing I do seems to fix it. So this is where I stand.

      I'm new here so if you want any files or settings logs I will attempt to provide them.
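
      In the meantime, here's a rough sketch of commands I can run to pull more detail on why the drives were kicked (sda/sdc are just what the console showed, so the device names may need adjusting):

      Source Code

      # Kernel log entries around the drive failures
      dmesg | grep -iE 'md0|sda|sdc'
      # Current state of the array and which members are marked faulty
      mdadm --detail /dev/md0
      # SMART health summary for one of the kicked drives
      smartctl -H /dev/sda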



      EDIT:

      Source Code

      cat /proc/mdstat
      Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
      md0 : active raid1 dm-1[1] dm-0[0]
            3906895872 blocks super 1.2 [2/2] [UU]
            [>....................]  resync =  2.5% (97925120/3906895872) finish=701.0min speed=90553K/sec
            bitmap: 30/30 pages [120KB], 65536KB chunk
      unused devices: <none>
      root@WymanStorage:~# blkid
      /dev/mmcblk1p1: UUID="5bab0a55-56f1-4443-8cac-297e1181425c" TYPE="ext4" PARTUUID="01bb5fb4-01"
      /dev/mmcblk1p2: UUID="d0da7bbe-e3af-4588-8715-aa5c4478eb88" UUID_SUB="8d229fcf-d05b-48da-8f0b-b3211c04308d" TYPE="btrfs" PARTUUID="01bb5fb4-02"
      /dev/zram0: UUID="0451ae98-459a-4591-97c7-478ce5d88aab" TYPE="swap"
      /dev/zram1: UUID="a8896e2a-15bf-4e3c-bef1-8c7470314f55" TYPE="swap"
      /dev/zram2: UUID="dc9b24b9-237d-41b6-a91a-8ddc221c3faa" TYPE="swap"
      /dev/zram3: UUID="35f0b903-f653-48a8-81b1-9e101182a0b0" TYPE="swap"
      /dev/zram4: UUID="27ea442a-e1b8-4c0f-837d-28af81389ceb" TYPE="swap"
      /dev/zram5: UUID="97be9180-b8fc-465c-984d-138eacb96da5" TYPE="swap"
      /dev/zram6: UUID="ff0e0042-7140-4ea0-8881-42a82b7d28cc" TYPE="swap"
      /dev/zram7: UUID="66e109ed-5f8e-4ff0-a868-e650699a7543" TYPE="swap"
      /dev/sde: UUID="3KhV0T-z2eP-tdd0-q4UJ-enGb-ANCV-B1Qt7O" TYPE="LVM2_member"
      /dev/sdf: UUID="H7GAdO-1tmO-DU42-nYYa-010n-T0tV-ufhFAL" TYPE="LVM2_member"
      /dev/sdg: UUID="UZRsJT-mnfz-By9J-cYb3-v1Hd-R0Hq-y3C91H" TYPE="LVM2_member"
      /dev/sdh: UUID="z7UXnh-dLoT-7RX1-Hiu7-3oes-NrHE-waVgJ5" TYPE="LVM2_member"
      /dev/mapper/Strip2-Shrug2: UUID="f03104d1-8e61-e93f-8ee0-3854ef1a9d0b" UUID_SUB="990744ce-f6af-3eb0-0301-11ee9dfc893b" LABEL="WymanStorage:ttttt" TYPE="linux_raid_member"
      /dev/mapper/Strip1-shrug1: UUID="f03104d1-8e61-e93f-8ee0-3854ef1a9d0b" UUID_SUB="707db9c1-a523-bcb5-4c04-471f103bae75" LABEL="WymanStorage:ttttt" TYPE="linux_raid_member"
      /dev/mmcblk1: PTUUID="01bb5fb4" PTTYPE="dos"
      /dev/mmcblk1p3: PARTUUID="01bb5fb4-03"
      root@WymanStorage:~# fdisk -l | grep "disk "
      root@WymanStorage:~# cat /etc/mdadm/mdadm.conf
      # mdadm.conf
      #
      # Please refer to mdadm.conf(5) for information about this file.
      #
      # by default, scan all partitions (/proc/partitions) for MD superblocks.
      # alternatively, specify devices to scan, using wildcards if desired.
      # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.
      # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is
      # used if no RAID devices are configured.
      DEVICE partitions
      # auto-create devices with Debian standard permissions
      CREATE owner=root group=disk mode=0660 auto=yes
      # automatically tag new arrays as belonging to the local system
      HOMEHOST <system>
      # definitions of existing MD arrays
      ARRAY /dev/md0 metadata=1.2 name=WymanStorage:ttttt UUID=f03104d1:8e61e93f:8ee03854:ef1a9d0b
      root@WymanStorage:~# cat /etc/mdadm/mdadm.conf5
      cat: /etc/mdadm/mdadm.conf5: No such file or directory
      root@WymanStorage:~# cat /etc/mdadm/mdadm.conf(5)
      -bash: syntax error near unexpected token `('
      root@WymanStorage:~# cat /etc/mdadm/mdadm.conf
      # mdadm.conf
      #
      # Please refer to mdadm.conf(5) for information about this file.
      #
      # by default, scan all partitions (/proc/partitions) for MD superblocks.
      # alternatively, specify devices to scan, using wildcards if desired.
      # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.
      # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is
      # used if no RAID devices are configured.
      DEVICE partitions
      # auto-create devices with Debian standard permissions
      CREATE owner=root group=disk mode=0660 auto=yes
      # automatically tag new arrays as belonging to the local system
      HOMEHOST <system>
      # definitions of existing MD arrays
      ARRAY /dev/md0 metadata=1.2 name=WymanStorage:ttttt UUID=f03104d1:8e61e93f:8ee03854:ef1a9d0b
      root@WymanStorage:~# mdadm --detail --scan --verbose
      ARRAY /dev/md0 level=raid1 num-devices=2 metadata=1.2 name=WymanStorage:ttttt UUID=f03104d1:8e61e93f:8ee03854:ef1a9d0b
         devices=/dev/dm-0,/dev/dm-1


    • mawyman2316 wrote:

      After creating the array, it never seems to finish resyncing, and when I create a file system the first two drives fail and the array becomes (clean, degraded).
      First, the array must finish its initial sync before you put a file system on it. I ran a RAID10 sync on tiny 5 GB virtual drives; that took about 5 minutes. Scaling up to 2 TB drives, your sync might take 6 hours or so. (With all the variables, that's a very rough estimate.) Only when the sync is finished would you put a file system on the array.
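
      Here's a minimal sketch (assuming the array is /dev/md0, as in your output) for watching the initial sync and confirming it has finished before you create a file system:

      Source Code

      # Watch the resync progress refresh every 10 seconds
      watch -n 10 cat /proc/mdstat
      # When the sync is done, State should read "clean" and the resync/rebuild status line goes away
      mdadm --detail /dev/md0 | grep -E 'State|Status'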



      Second, if these drives are not new, have you looked at their SMART stats? If one or more have failing stats, mdadm RAID may kick them out of the array. Under Storage, SMART, enable the SMART service, then edit each of your devices and enable SMART on them. In the attributes, look for the following:

      SMART 5 – Reallocated_Sector_Count.
      SMART 187 – Reported_Uncorrectable_Errors.
      SMART 188 – Command_Timeout.
      SMART 197 – Current_Pending_Sector_Count.
      SMART 198 – Offline_Uncorrectable.
      SMART 199 – UltraDMA CRC errors.

      With the exception of 199, errors in the above attributes are an indicator that a drive failure may be in the making and might be a reason why the array is kicking member drives out. 199 is usually a cable/connector issue but it causes I/O errors.
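
      If you prefer the command line, here's a sketch (assuming smartmontools is installed and the drives show up as /dev/sd[a-d] - adjust the device names to whatever yours actually are) that pulls just those attributes:

      Source Code

      # Print the attributes listed above for each data drive
      for d in /dev/sd[a-d]; do
        echo "== $d =="
        smartctl -A "$d" | grep -E '^ *(5|187|188|197|198|199) '
      done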
      ________________________________________________________________________________

      mawyman2316 wrote:

      but then got paranoid about redundancy
      Paranoia is a good thing - it can help you keep your data - but using an effective backup method is what matters. Keep in mind that RAID is not about "redundancy"; it's about "availability", and I can't think of a good reason to run RAID1 (or its variant, RAID10) at home.

      Provided that they're healthy, you have enough disks to achieve real backup. Solid backup is far better than the false sense of protection one might think they're getting from RAID1.
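
      As a sketch of what that could look like (hypothetical OMV mount points /srv/dev-disk-by-label-data and /srv/dev-disk-by-label-backup - substitute your own), one disk rsync'ed to another on a schedule is a real backup:

      Source Code

      # Mirror the data disk onto the backup disk; --delete removes files that no longer exist on the source
      rsync -aHv --delete /srv/dev-disk-by-label-data/ /srv/dev-disk-by-label-backup/

      OMV's Scheduled Jobs (or plain cron) can run something like that nightly.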

      Video Guides | New User Guide | Docker Guides | Pi-hole in Docker
      Good backup takes the "drama" out of computing.
      ____________________________________
      Primary: OMV 3.0.99, ThinkServer TS140, 12GB ECC, 32GB USB boot, 4TB+4TB zmirror, 3TB client backup.
      OMV 4.1.13, Intel Server SC5650HCBRP, 32GB ECC, 16GB USB boot, UnionFS+SNAPRAID
      Backup: OMV 4.1.9, Acer RC-111, 4GB, 32GB USB boot, 3TB+3TB zmirror, 4TB Rsync'ed disk


    • It could be any number of things. I have been running a NAS at my house for 10 years now.

      What I have found through trial and error and a lot of pulling my hair out:

      1) I made the mistake of buying the cheapest SATA cables and cursed my hard drives for being bad. I swapped the cables out for better-made ones and the drives worked flawlessly. Even new cables can be bad, since it seems nobody does quality control anymore to check that products work out the door.

      2) Drives, even new ones, can have bad sectors. Usually the manufacturer offers software you can download from their website to check the drives for bad sectors and zero out any data that may be on them; a Linux alternative is sketched after this list. This is a time-consuming process.

      3) Bad memory. I eventually bought the more expensive ECC memory that my motherboard manufacturer recommended. Before that, I made the mistake of throwing in memory I had lying around; it initially appeared to run fine, but it was causing data rot and corruption. If you do use non-ECC memory or something you have lying around, run it through your BIOS memory checker if available, and through software made to put memory through its paces (memtest86+, for example).
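
      For point 2, a rough Linux-side equivalent of the manufacturer tools (a sketch - the badblocks -w test is destructive and wipes the drive, so only use it on an empty disk, and replace /dev/sdX with your actual device):

      Source Code

      # Destructive write-mode surface scan; reports any bad blocks it finds
      badblocks -wsv /dev/sdX
      # Or run a long SMART self-test (non-destructive) and check the result afterwards
      smartctl -t long /dev/sdX
      smartctl -l selftest /dev/sdX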

      Joe
