Issue system drive

    • OMV 1.0
    • Resolved

    • Issue system drive

      Hello,

      I have been an OMV user since v1.0 and a quiet reader of the forum. I'm here today because I need some help.

      My configuration:
      • 1 HDD for system
      • 5 HDDs for data in RAID 6
      • OMV v1.11

      For the past few days, I have had an issue with my system drive. I don't know why, because it worked for 3 months without issues.

      Sometimes it "disappears" and I lose control of my whole system. If I reboot, the drive is not detected by the system.
      I checked the power and SATA cables and they are OK, but I had a really hard time getting my system drive detected (several reboots were needed).

      3 days ago, I had this message and shortly after my drive was unavailable:

      Source Code

      Device: /dev/disk/by-id/wwn-0x50014ee2b55806bc [SAT], Read SMART Self-Test Log Failed


      Now, when my system boots, I get several error messages (see the attached picture; sorry for the quality).

      What do you think? Is the drive dying? Should I try a fresh install on this drive?

      Thanks for your help.
      Images
      • boot_error.jpg


      The post was edited 1 time, last by Haeris.

    • It looks like the drive may be dying; at least the filesystem seems damaged. Did you check the SMART values of that drive?

      smartctl -a /dev/sdX (replace X with the correct drive letter for your OS drive)

      Greetings
      David
      "Well... lately this forum has become support for everything except omv" [...] "And is like someone is banning Google from their browsers"

      Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.


      Upload Logfile via WebGUI/CLI
      #openmediavault on freenode IRC | German & English | GMT+1
      Absolutely no Support via PM!

      I host parts of the omv-extras.org Repository, the OpenMediaVault Live Demo and the pre-built PXE Images. If you want you can take part and help covering the costs by having a look at my profile page.
    • Boot a live CD of any distro you like and run the command there.

      Greetings
      David
    • Sorry for the delay.

      The result of the command:

      Source Code

      smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-lowlatency] (local build)
      Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

      === START OF INFORMATION SECTION ===
      Model Family:     Western Digital Caviar Blue (SATA 6Gb/s)
      Device Model:     WDC WD5000AAKX-08U6AA0
      Serial Number:    WD-WCC2EN0TJDFT
      LU WWN Device Id: 5 0014ee 2b55806bc
      Firmware Version: 19.01H19
      User Capacity:    500 107 862 016 bytes [500 GB]
      Sector Size:      512 bytes logical/physical
      Rotation Rate:    7200 rpm
      Device is:        In smartctl database [for details use: -P show]
      ATA Version is:   ATA8-ACS (minor revision not indicated)
      SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
      Local Time is:    Wed Feb 25 17:59:03 2015 UTC
      SMART support is: Available - device has SMART capability.
      SMART support is: Enabled

      === START OF READ SMART DATA SECTION ===
      SMART overall-health self-assessment test result: PASSED

      General SMART Values:
      Offline data collection status:  (0x84) Offline data collection activity
                                              was suspended by an interrupting command from host.
                                              Auto Offline Data Collection: Enabled.
      Self-test execution status:      (   0) The previous self-test routine completed
                                              without error or no self-test has ever
                                              been run.
      Total time to complete Offline
      data collection:                ( 8700) seconds.
      Offline data collection
      capabilities:                    (0x7b) SMART execute Offline immediate.
                                              Auto Offline data collection on/off support.
                                              Suspend Offline collection upon new
                                              command.
                                              Offline surface scan supported.
                                              Self-test supported.
                                              Conveyance Self-test supported.
                                              Selective Self-test supported.
      SMART capabilities:            (0x0003) Saves SMART data before entering
                                              power-saving mode.
                                              Supports SMART auto save timer.
      Error logging capability:        (0x01) Error logging supported.
                                              General Purpose Logging supported.
      Short self-test routine
      recommended polling time:        (   2) minutes.
      Extended self-test routine
      recommended polling time:        (  88) minutes.
      Conveyance self-test routine
      recommended polling time:        (   5) minutes.
      SCT capabilities:              (0x3037) SCT Status supported.
                                              SCT Feature Control supported.
                                              SCT Data Table supported.

      SMART Attributes Data Structure revision number: 16
      Vendor Specific SMART Attributes with Thresholds:
      ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
        1 Raw_Read_Error_Rate     0x002f   199   199   051    Pre-fail  Always       -       20
        3 Spin_Up_Time            0x0027   137   137   021    Pre-fail  Always       -       4116
        4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       34
        5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       2
        7 Seek_Error_Rate         0x002e   028   027   000    Old_age   Always       -       53681
        9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       2457
       10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
       11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
       12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       33
      192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       24
      193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3604
      194 Temperature_Celsius     0x0022   115   106   000    Old_age   Always       -       28
      196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
      197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
      198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
      199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
      200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

      SMART Error Log Version: 1
      ATA Error Count: 4
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
      Powered_Up_Time is measured from power on, and printed as
      DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
      SS=sec, and sss=millisec. It "wraps" after 49.710 days.

      Error 4 occurred at disk power-on lifetime: 531 hours (22 days + 3 hours)
        When the command that caused the error occurred, the device was active or idle.

        After command completion occurred, registers were:
        ER ST SC SN CL CH DH
        -- -- -- -- -- -- --
        04 51 01 00 00 00 00 Error: ABRT

        Commands leading to the command that caused the error were:
        CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
        -- -- -- -- -- -- -- -- ---------------- --------------------
        b0 d5 01 e1 4f c2 00 08 22d+02:20:41.830 SMART READ LOG
        b0 d5 01 e1 4f c2 00 08 22d+02:20:41.830 SMART READ LOG
        b0 d5 01 e0 4f c2 00 08 22d+02:20:41.830 SMART READ LOG
        b0 d6 01 e0 4f c2 00 08 22d+02:20:41.828 SMART WRITE LOG
        b0 d6 01 e0 4f c2 00 08 22d+02:20:41.827 SMART WRITE LOG

      Error 3 occurred at disk power-on lifetime: 505 hours (21 days + 1 hours)
        When the command that caused the error occurred, the device was active or idle.

        After command completion occurred, registers were:
        ER ST SC SN CL CH DH
        -- -- -- -- -- -- --
        04 51 01 00 00 00 00 Error: ABRT

        Commands leading to the command that caused the error were:
        CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
        -- -- -- -- -- -- -- -- ---------------- --------------------
        b0 d5 01 e1 4f c2 00 08 20d+23:45:07.020 SMART READ LOG
        b0 d5 01 e1 4f c2 00 08 20d+23:45:07.020 SMART READ LOG
        b0 d6 01 e0 4f c2 00 08 20d+23:45:07.018 SMART WRITE LOG
        b0 d6 01 e0 4f c2 00 08 20d+23:45:07.017 SMART WRITE LOG
        b0 d5 01 e0 4f c2 00 08 20d+23:45:07.017 SMART READ LOG

      Error 2 occurred at disk power-on lifetime: 504 hours (21 days + 0 hours)
        When the command that caused the error occurred, the device was active or idle.

        After command completion occurred, registers were:
        ER ST SC SN CL CH DH
        -- -- -- -- -- -- --
        04 51 01 00 00 00 00 Error: ABRT

        Commands leading to the command that caused the error were:
        CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
        -- -- -- -- -- -- -- -- ---------------- --------------------
        b0 d5 01 e1 4f c2 00 08 20d+23:40:35.663 SMART READ LOG
        b0 d5 01 e0 4f c2 00 08 20d+23:40:35.663 SMART READ LOG
        b0 d5 01 e1 4f c2 00 08 20d+23:40:35.663 SMART READ LOG
        b0 d6 01 e0 4f c2 00 08 20d+23:40:35.661 SMART WRITE LOG
        b0 d5 01 e1 4f c2 00 08 20d+23:40:35.661 SMART READ LOG

      Error 1 occurred at disk power-on lifetime: 504 hours (21 days + 0 hours)
        When the command that caused the error occurred, the device was active or idle.

        After command completion occurred, registers were:
        ER ST SC SN CL CH DH
        -- -- -- -- -- -- --
        04 51 01 00 00 00 00 Error: ABRT

        Commands leading to the command that caused the error were:
        CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
        -- -- -- -- -- -- -- -- ---------------- --------------------
        b0 d5 01 e1 4f c2 00 08 20d+23:10:17.394 SMART READ LOG
        b0 d5 01 e1 4f c2 00 08 20d+23:10:17.394 SMART READ LOG
        b0 d6 01 e0 4f c2 00 08 20d+23:10:17.392 SMART WRITE LOG
        b0 d6 01 e0 4f c2 00 08 20d+23:10:17.391 SMART WRITE LOG

      SMART Self-test log structure revision number 1
      Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
      # 1  Short offline       Completed without error       00%      2452         -
      # 2  Short offline       Completed without error       00%      2005         -
      # 3  Short offline       Completed without error       00%      1838         -
      # 4  Short offline       Completed without error       00%      1670         -
      # 5  Short offline       Completed without error       00%      1610         -
      # 6  Short offline       Completed without error       00%      1502         -
      # 7  Short offline       Completed without error       00%      1334         -
      # 8  Short offline       Completed without error       00%      1166         -
      # 9  Short offline       Completed without error       00%       998         -
      #10  Short offline       Completed without error       00%       830         -
      #11  Short offline       Completed without error       00%       662         -
      #12  Short offline       Completed without error       00%       559         -
      #13  Short offline       Completed without error       00%       541         -
      #14  Short offline       Completed without error       00%       373         -
      #15  Short offline       Completed without error       00%       205         -
      #16  Short offline       Completed without error       00%        37         -

      SMART Selective self-test log data structure revision number 1
       SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
          1        0        0  Not_testing
          2        0        0  Not_testing
          3        0        0  Not_testing
          4        0        0  Not_testing
          5        0        0  Not_testing
      Selective self-test flags (0x0):
        After scanning selected spans, do NOT read-scan remainder of disk.
      If Selective self-test is pending on power-up, resume after 0 minute delay.
    • Haeris wrote:

      7 Seek_Error_Rate 0x002e 028 027 000 Old_age Always - 53681


      Overall it doesn't look that bad, although the drive may be slowly dying, but this seek error rate is most likely your problem. I would say it's time to replace it. ;)
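
      For anyone who wants to read an attribute table like the one quoted above with a script, here is a minimal Python sketch. The names parse_attributes, find_warnings and WATCH_RAW are illustrative assumptions, not part of smartmontools or OMV, and the watchlist is only one possible choice of counters to look at. The sample rows are copied from the output posted in this thread.

```python
import re

# One row of the "Vendor Specific SMART Attributes" table, for example:
#   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always  -  2
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]{4}\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>\d+)"
)

# Assumed watchlist: counters where a non-zero raw value deserves a look.
WATCH_RAW = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   199   199   051    Pre-fail  Always       -       20
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x002e   028   027   000    Old_age   Always       -       53681
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""


def parse_attributes(text):
    """Return {attribute_name: {value, worst, thresh, raw}} with integer fields."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # header and non-table lines simply do not match
            attrs[m["name"]] = {k: int(m[k]) for k in ("value", "worst", "thresh", "raw")}
    return attrs


def find_warnings(attrs):
    """List attributes at/below their threshold or with a non-zero watched raw value."""
    notes = []
    for name, a in attrs.items():
        # A threshold of 0 means the drive never reports this attribute as failed.
        if a["thresh"] > 0 and a["value"] <= a["thresh"]:
            notes.append(f"{name}: normalized value {a['value']} at or below threshold {a['thresh']}")
        if name in WATCH_RAW and a["raw"] > 0:
            notes.append(f"{name}: raw value {a['raw']} is non-zero")
    return notes


if __name__ == "__main__":
    for note in find_warnings(parse_attributes(SAMPLE)):
        print(note)
```

      Note that this sketch does not flag Seek_Error_Rate by itself, because its threshold is 000 and it is not on the assumed watchlist. Judging a low normalized value like 028 still needs a human look, and raw values are interpreted differently by each vendor.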

      Greetings
      David
    • Well, this drive is pretty new (3 months old). Bad luck for me with this one :(

      Thanks a lot for the help David.


      I have a suggestion for a new feature: a way to save SMART values on several dates and compare them.

      It could help to see how the values evolve over time and to track hard drive issues.
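
      Until something like that exists, the idea can be sketched in a few lines of Python. This is only an illustration, not an OMV feature: the function names and the one-JSON-file-per-date layout are assumptions, and the input is assumed to be a plain {attribute name: raw value} mapping taken from the smartctl attribute table.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def save_snapshot(directory, raw_values, day=None):
    """Write {attribute: raw_value} to <directory>/<YYYY-MM-DD>.json and return the path."""
    day = day or date.today().isoformat()
    path = Path(directory) / f"{day}.json"
    path.write_text(json.dumps(raw_values, indent=2, sort_keys=True))
    return path


def load_snapshot(path):
    """Read a snapshot written by save_snapshot."""
    return json.loads(Path(path).read_text())


def diff_snapshots(old, new):
    """Return {attribute: (old_raw, new_raw)} for every attribute whose value changed."""
    changed = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changed[name] = (before, after)
    return changed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The earlier snapshot is made up for illustration only.
        first = save_snapshot(
            tmp,
            {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 0, "Power_On_Hours": 2000},
            day="2015-02-01",
        )
        # The later snapshot uses the raw values from the output posted above.
        second = save_snapshot(
            tmp,
            {"Reallocated_Sector_Ct": 2, "Current_Pending_Sector": 1, "Power_On_Hours": 2457},
            day="2015-02-25",
        )
        for name, (before, after) in diff_snapshots(load_snapshot(first), load_snapshot(second)).items():
            print(f"{name}: {before} -> {after}")
```

      Run from a daily cron job, the saved files would give a simple history, and a counter such as Reallocated_Sector_Ct that starts to climb would show up in the diff.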
    • You can create a feature request on the bugtracker for that. ;)

      Greetings
      David