RAID Disappeared - need help to rebuild

    • Sc0rp wrote:

      Re,

this looks bad - do any of the drives show SATA errors too? From the data given, I would change "sdf" first, then "sdc", then "sdb" ... and maybe use another vendor/type of hard drives ... btw, which drives are you actually using?

      Sc0rp
Thanks for the guidance. I can't check for SATA errors right now (none that I'm aware of), as the server is off at home (I figured it would be best to switch it off until I could start drive replacements). They're all Seagate Barracuda 3TB drives. I've already replaced one of the others, and I have a new spare ready and waiting. I was wondering whether I'd be better off switching to a different model for the next swaps. I'd read that Barracudas weren't intended for (or good in) always-on servers, and had seen Seagate IronWolf drives recommended. Any thoughts on that?
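(For reference, once the server is back on, those sector counters and any SATA link errors can be read per drive with smartctl from smartmontools; the device names below are only examples:)

Source Code

smartctl -H -A /dev/sdb       # overall health + attribute table; watch IDs 5, 197 and 198
smartctl -l error /dev/sdb    # the drive's own error log (read/UNC errors)
smartctl -l sataphy /dev/sdb  # SATA phy event counters (link/CRC errors)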
    • Re,

IronWolfs are relatively "young" drives, but currently it seems they are worth a try (though they run at 7.2k rpm ... more performance for more heat!) ... WD Red is the usual/common recommendation for NAS setups - they are really well suited for private/home office use (you can switch to WD Red Pro if you need more performance ...).

      Sc0rp
    • Hey guys ... work's been crazy this week, and I've only just got round to trying to progress this. The first thing I did was start backing up some of the files that I knew would be a ball-ache to replace. All was going well for a good hour or so, and then a read error was reported and the NAS disappeared from the network (I assume it was unmounted). I rebooted the NAS, and was struggling to even get OMV to boot. Next, I removed all drives from the NAS, and OMV booted no issue. So, I added the drives back in and swapped out what was sdb (the drive with the increasingly high number of pending/offline uncorrectable sectors). I'm still waiting for another 2 drives to replace sdf and sdc (in that order).

After swapping out sdb, repairing the degraded RAID seemed to go fine, and it completed with no reported issues. The RAID was mounted, and all folders/files were visible to the network again. However, I went to delete some files, and immediately faced a read error (Windows dialogue: Location is not available. W:\ is not accessible. The request could not be performed because of an I/O error).

      The OMV GUI is still showing the RAID as active/clean/mounted. mdadm --examine /dev/sd[abcdef] reports:


      Source Code

/dev/sda:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 6ff00f35:b3aa6d29:25ac1d7e:2a2b2007
    Name : openmediavault:MEDIAVAULT (local to host openmediavault)
    Creation Time : Mon Sep 17 01:03:50 2012
    Raid Level : raid5
    Raid Devices : 6
    Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
    Array Size : 14651325440 (13972.59 GiB 15002.96 GB)
    Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    State : clean
    Device UUID : 7a6cb4e9:902f3852:3c8a0119:6f74dcea
    Update Time : Sat Dec 2 07:20:36 2017
    Checksum : 2b916fbb - correct
    Events : 289173
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 5
    Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdb:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 6ff00f35:b3aa6d29:25ac1d7e:2a2b2007
    Name : openmediavault:MEDIAVAULT (local to host openmediavault)
    Creation Time : Mon Sep 17 01:03:50 2012
    Raid Level : raid5
    Raid Devices : 6
    Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
    Array Size : 14651325440 (13972.59 GiB 15002.96 GB)
    Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    State : clean
    Device UUID : 2ec934c5:f798aa47:7218e531:9d493298
    Update Time : Sat Dec 2 07:20:36 2017
    Checksum : c2bd99cb - correct
    Events : 289173
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 4
    Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdc:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 6ff00f35:b3aa6d29:25ac1d7e:2a2b2007
    Name : openmediavault:MEDIAVAULT (local to host openmediavault)
    Creation Time : Mon Sep 17 01:03:50 2012
    Raid Level : raid5
    Raid Devices : 6
    Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
    Array Size : 14651325440 (13972.59 GiB 15002.96 GB)
    Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    State : clean
    Device UUID : 5fdfd228:cccdf51e:0ee5e8e6:eaf3d87c
    Update Time : Sat Dec 2 07:20:36 2017
    Checksum : 97515b29 - correct
    Events : 289173
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 0
    Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdd:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 6ff00f35:b3aa6d29:25ac1d7e:2a2b2007
    Name : openmediavault:MEDIAVAULT (local to host openmediavault)
    Creation Time : Mon Sep 17 01:03:50 2012
    Raid Level : raid5
    Raid Devices : 6
    Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
    Array Size : 14651325440 (13972.59 GiB 15002.96 GB)
    Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    State : clean
    Device UUID : ef7151df:f61fce6e:16124dbf:c0e9d1cf
    Update Time : Sat Dec 2 07:20:36 2017
    Checksum : c905634c - correct
    Events : 289173
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 1
    Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sde:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 6ff00f35:b3aa6d29:25ac1d7e:2a2b2007
    Name : openmediavault:MEDIAVAULT (local to host openmediavault)
    Creation Time : Mon Sep 17 01:03:50 2012
    Raid Level : raid5
    Raid Devices : 6
    Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
    Array Size : 14651325440 (13972.59 GiB 15002.96 GB)
    Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    State : clean
    Device UUID : 1e70980d:77996a00:97862be2:e6522558
    Update Time : Sat Dec 2 07:20:36 2017
    Checksum : 341aba6b - correct
    Events : 289173
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 2
    Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdf:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 6ff00f35:b3aa6d29:25ac1d7e:2a2b2007
    Name : openmediavault:MEDIAVAULT (local to host openmediavault)
    Creation Time : Mon Sep 17 01:03:50 2012
    Raid Level : raid5
    Raid Devices : 6
    Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
    Array Size : 14651325440 (13972.59 GiB 15002.96 GB)
    Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    State : clean
    Device UUID : 55ad5625:0b737d46:248eaedc:cfc05460
    Update Time : Sat Dec 2 07:20:36 2017
    Checksum : 949e4b25 - correct
    Events : 289173
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 3
    Array State : AAAAAA ('A' == active, '.' == missing)
      Nothing jumps out as being odd, and all details seem to match from one HDD to the next.

      I've pasted updated logs and messages here:

      sprunge.us/UCLE
      sprunge.us/PWVK

I'm really not sure what to do next (aside from swapping out the other 2 HDDs that are showing SMART errors). I did notice something in the logs about xfs_repair - should I be attempting this now? And a basic question: is it run at the individual HDD level, or at the RAID/device level?

      As always, any insight and recommendations would be great. Thanks in advance,
      Brian
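(The swap-and-rebuild described above maps, in rough outline, onto the usual mdadm sequence; the array name /dev/md0 and the member name /dev/sdb are assumptions here - /proc/mdstat or the OMV GUI shows the real ones:)

Source Code

mdadm --manage /dev/md0 --fail /dev/sdb     # mark the failing member as failed
mdadm --manage /dev/md0 --remove /dev/sdb   # pull it from the array
# ... physically swap the disk, then add the replacement and let it resync:
mdadm --manage /dev/md0 --add /dev/sdb
cat /proc/mdstat                            # watch the rebuild progress
mdadm --detail /dev/md0                     # confirm the array state afterwards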
Hi Sc0rp, appreciate your assistance, especially when you're so busy. I've done as suggested, but the NAS seems to hang on reboot and the array isn't unmounted. I connected a monitor directly to the device so I can see the system messages. The following messages look relevant:

Turning off quotas ...
quotaoff: Cannot resolve mountpoint path /media/folder ID: Input/output error (the same is repeated for several shared folders)
...
Unmounting local filesystems ...
[213192.927608] INFO task umount:3569 blocked for more than 120 seconds.
[213912.932051] Not tainted 3.16.0-0.bpo.4-amd64 #1
[213192.936445] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message

      The same 3 messages are now repeating every 120 secs (with different numeric values at the start of the message). What now? Leave it running? Force shutdown?
I've managed to do as you suggested now (not sure how I got there, but fsck ran). I also ran xfs_repair -n, which came back with loads of issues, and eventually:

Inode allocation btrees are too corrupted, skipping phases 6 and 7
No modify flag set, skipping filesystem flush and exiting.

I guess if the RAID has been rebuilt after disks had dropped out, there are going to be inconsistencies all over the place.

Will swap out the other 2 discs when they arrive, and then maybe run xfs_repair and see what's salvageable, but I'm thinking this might just have to be given up as a bad job, and rebuild my media collection from scratch (and of course, create a back-up next time!).
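(On the earlier question of where xfs_repair runs: it works on the block device holding the XFS filesystem, which in a setup like this is the assembled array itself rather than the individual member disks. A read-only check of the kind quoted above would look roughly like this, assuming the filesystem sits directly on /dev/md0:)

Source Code

umount /dev/md0          # the filesystem must not be mounted
xfs_repair -n /dev/md0   # -n = no-modify dry run; reports problems without changing anything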
    • Re,

      brifletch wrote:

I guess if the RAID has been rebuilt after disks had dropped out, there are going to be inconsistencies all over the place.
From my experience: the inconsistencies occurred WHILE the drive was dying ... but the result remains the same ...

      brifletch wrote:

Will swap out the other 2 discs when they arrive, and then maybe run xfs_repair and see what's salvageable, but I'm thinking this might just have to be given up as a bad job, and rebuild my media collection from scratch (and of course, create a back-up next time!).
Yeah, maybe you can finally "force" xfs_repair to get to a clean state (at the least you can zero the journal ...) - just search the net for "man xfs_repair" :D

Btw, I also had many problems using XFS on top of an old Areca HW RAID controller, due to a bug in the driver (kernel module), but I never lost data ...

If you ever have the chance to rebuild your current array from scratch, consider using ZFS RAID-Z1 or RAID-Z2 instead - it's more convenient nowadays ... and as a special benefit for me, @tkaiser is then in charge :P ... uhm, just kidding ... a bit.

      Sc0rp
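(The "force"/zero-the-journal route Sc0rp is alluding to corresponds to xfs_repair's -L option. A sketch only, again assuming the filesystem lives on the unmounted /dev/md0 and that a -n dry run has been done first as shown earlier - -L discards the metadata log, so any changes that were only in the log are lost:)

Source Code

xfs_repair -L /dev/md0   # zero the corrupt metadata log, then repair; last resort only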