S.M.A.R.T. Red light on system disk, test passed?

    • S.M.A.R.T. Red light on system disk, test passed?

      Hi,

      I just installed OMV 3.0 on an SSD instead of a normal HDD, and I like how much quicker it feels. But when I check S.M.A.R.T. in OMV the status light is red, and if I hover over it, it displays:
      "Device has a few bad sectors".

      Did a test and this is the result:

      smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.9.0-0.bpo.3-amd64] (local build)
      Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

      === START OF INFORMATION SECTION ===
      Model Family: Intel X18-M/X25-M/X25-V G2 SSDs
      Device Model: INTEL SSDSA2M080G2GC
      Serial Number: CVPO012300KP080JGN
      LU WWN Device Id: 5 001517 959270f96
      Firmware Version: 2CV102M3
      User Capacity: 80,025,280,000 bytes [80.0 GB]
      Sector Size: 512 bytes logical/physical
      Rotation Rate: Solid State Device
      Device is: In smartctl database [for details use: -P show]
      ATA Version is: ATA/ATAPI-7 T13/1532D revision 1
      SATA Version is: SATA 2.6, 3.0 Gb/s
      Local Time is: Sat Oct 21 11:26:03 2017 CEST
      SMART support is: Available - device has SMART capability.
      SMART support is: Enabled
      AAM feature is: Unavailable
      APM feature is: Unavailable
      Rd look-ahead is: Enabled
      Write cache is: Enabled
      ATA Security is: Disabled, NOT FROZEN [SEC1]
      Wt Cache Reorder: Unavailable

      === START OF READ SMART DATA SECTION ===
      SMART overall-health self-assessment test result: PASSED

      General SMART Values:
      Offline data collection status: (0x00) Offline data collection activity
                                             was never started.
                                             Auto Offline Data Collection: Disabled.
      Self-test execution status: ( 0) The previous self-test routine completed
                                       without error or no self-test has ever
                                       been run.
      Total time to complete Offline
      data collection: ( 1) seconds.
      Offline data collection
      capabilities: (0x71) SMART execute Offline immediate.
                           No Auto Offline data collection support.
                           Suspend Offline collection upon new
                           command.
                           No Offline surface scan supported.
                           Self-test supported.
                           Conveyance Self-test supported.
                           Selective Self-test supported.
      SMART capabilities: (0x0003) Saves SMART data before entering
                                   power-saving mode.
                                   Supports SMART auto save timer.
      Error logging capability: (0x01) Error logging supported.
                                       General Purpose Logging supported.
      Short self-test routine
      recommended polling time: ( 1) minutes.
      Extended self-test routine
      recommended polling time: ( 1) minutes.
      Conveyance self-test routine
      recommended polling time: ( 1) minutes.

      SMART Attributes Data Structure revision number: 5
      Vendor Specific SMART Attributes with Thresholds:
      ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
        3 Spin_Up_Time            -----K   100   100   000    -    0
        4 Start_Stop_Count        ----CK   100   100   000    -    0
        5 Reallocated_Sector_Ct   -O--CK   100   100   000    -    2
        9 Power_On_Hours          -O--CK   100   100   000    -    3389
       12 Power_Cycle_Count       -O--CK   100   100   000    -    1757
      192 Unsafe_Shutdown_Count   -O--CK   100   100   000    -    115
      225 Host_Writes_32MiB       ----CK   100   100   000    -    70654
      226 Workld_Media_Wear_Indic -O--CK   100   100   000    -    21434651
      227 Workld_Host_Reads_Perc  -O--CK   100   100   000    -    0
      228 Workload_Minutes        -O--CK   100   100   000    -    4278458468
      232 Available_Reservd_Space PO--CK   100   100   010    -    0
      233 Media_Wearout_Indicator -O--CK   099   099   000    -    0
      184 End-to-End_Error        PO--CK   100   100   090    -    0
                                  ||||||_ K auto-keep
                                  |||||__ C event count
                                  ||||___ R error rate
                                  |||____ S speed/performance
                                  ||_____ O updated online
                                  |______ P prefailure warning

      General Purpose Log Directory Version 1
      SMART Log Directory Version 1 [multi-sector log support]
      Address Access R/W Size Description
      0x00 GPL,SL R/O 1 Log Directory
      0x01 GPL,SL R/O 1 Summary SMART error log
      0x02 GPL,SL R/O 8 Comprehensive SMART error log
      0x03 GPL,SL R/O 8 Ext. Comprehensive SMART error log
      0x06 GPL,SL R/O 1 SMART self-test log
      0x07 GPL,SL R/O 1 Extended self-test log
      0x09 GPL,SL R/W 1 Selective self-test log
      0x10 GPL,SL R/O 1 SATA NCQ Queued Error log
      0x11 GPL,SL R/O 1 SATA Phy Event Counters log
      0x80-0x9f GPL,SL R/W 16 Host vendor specific log

      SMART Extended Comprehensive Error Log Version: 1 (8 sectors)
      No Errors Logged

      SMART Extended Self-test Log Version: 1 (1 sectors)
      Invalid Self-test Log index = 0x0048 (reserved = 0x00)

      SMART Selective self-test log data structure revision number 0
      Note: revision number not 1 implies that no selective self-test has ever been run
      SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
      1 0 0 Not_testing
      2 0 0 Not_testing
      3 0 0 Not_testing
      4 0 0 Not_testing
      5 0 0 Not_testing
      Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
      If Selective self-test is pending on power-up, resume after 0 minute delay.

      SCT Commands not supported

      Device Statistics (GP/SMART Log 0x04) not supported

      SATA Phy Event Counters (GP Log 0x11)
      ID Size Value Description
      0x0001 4 0 Command failed due to ICRC error
      0x0004 4 0 R_ERR response for host-to-device data FIS
      0x0007 4 0 R_ERR response for host-to-device non-data FIS
      0x0008 4 0 Device-to-host non-data FIS retries
      0x0009 4 8 Transition from drive PhyRdy to drive PhyNRdy
      0x000a 4 9 Device-to-host register FISes sent due to a COMRESET
      0x000b 4 0 CRC errors within host-to-device FIS
      0x000d 4 0 Non-CRC errors within host-to-device FIS
      0x000f 4 0 R_ERR response for host-to-device data FIS, CRC
      0x0010 4 0 R_ERR response for host-to-device data FIS, non-CRC
      0x0012 4 0 R_ERR response for host-to-device non-data FIS, CRC
      0x0013 4 0 R_ERR response for host-to-device non-data FIS, non-CRC

      The result is:
      SMART overall-health self-assessment test result: PASSED

      Why is it showing a red light?
    • Because of this line, where the raw value (last column) is 2: 5 Reallocated_Sector_Ct -O--CK 100 100 000 - 2


      See: Red light on a disk, but all green inside
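
The following is a minimal Python sketch of the check described above, not OMV's actual code. It assumes the status light turns red as soon as attribute 5 reports a non-zero raw value, which is what this thread suggests; the helper names `parse_attribute` and `has_reallocated_sectors` are made up for illustration. It also shows why the drive's own verdict can still be PASSED: the normalized VALUE (100) is far above THRESH (000).

```python
def parse_attribute(line):
    """Split one row of the smartctl attribute table into named fields."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "flags": fields[2],
        "value": int(fields[3]),   # normalized value
        "worst": int(fields[4]),
        "thresh": int(fields[5]),  # failure threshold for the normalized value
        "fail": fields[6],
        "raw": int(fields[7]),     # vendor-specific raw counter
    }


def has_reallocated_sectors(line):
    """Assumed rule: warn when attribute 5 has a non-zero raw value."""
    attr = parse_attribute(line)
    return attr["id"] == 5 and attr["raw"] > 0


row = "5 Reallocated_Sector_Ct -O--CK 100 100 000 - 2"
attr = parse_attribute(row)

print(has_reallocated_sectors(row))     # True: raw value is 2
print(attr["value"] > attr["thresh"])   # True: normalized value is above the threshold
```

Rows whose raw value contains extra text (some drives print things like temperatures with annotations) would need more careful parsing than this sketch does.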
      OMV 3.0.88 x64 on a HP T510, 8GB CF as Boot Disk & 32GB SSD 2,5" disk for Data, 4 GB RAM, CPU VIA EDEN X2 U4200 is x64 at 1GHz

      Post: HPT510 SlimNAS ; HOWTO Install Pi-Hole ; HOWTO install MLDonkey ; HOHTO Install ZFS-Plugin ; OMV_OldGUI ; ShellinaBOX ;
    • joq3 wrote:

      Okay, so something is wrong with the disk after all. It is an 80 GB drive, which I am only using for OMV. Should I replace it, or will it be fine since I am using so little of it?
      I would use it until it goes seriously bad. 2 bad sectors is not many, but it can be the start of a trend.
    • macom wrote:

      It might be a good idea to make an image of it using Clonezilla.
      Is it an HDD? Why not replace it with an SSD or a USB thumb drive? That might also save some energy.
      It is an SSD, an 80 GB Intel X25-M (older model).

      raulfg3 wrote:

      I would use it until it goes seriously bad. 2 bad sectors is not many, but it can be the start of a trend.
      I will take a Clonezilla backup from time to time. Will I get any warnings from OMV if more sectors fail?
    • joq3 wrote:

      will I get any warnings from OMV if more sectors fail?
      SSDs do not have sectors. The only SMART attribute worth a look is '233 Media_Wearout_Indicator' (on Intel SSDs; other vendors use other attributes). It counts down from 100 to 1 and is now at 99. Once it reaches 1 you can take action based on SMART data (at the current usage that will be in a few decades, or even 100+ years). In other words: forget about SMART as a health indicator, it won't work. Do backups, and if I were you I would also use a checksummed filesystem and run regular scrubs (though on an OMV system partition a periodic 'dpkg --verify' should also do the job of alerting you to silent data corruption).
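
As a rough back-of-the-envelope check of the "few decades" estimate, here is a small Python sketch. It assumes wear grows linearly with powered-on time and takes the numbers from the smartctl output earlier in the thread (Media_Wearout_Indicator at 99 after 3389 power-on hours); the function name `years_until_wearout` is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def years_until_wearout(current_value, power_on_hours, start_value=100, end_value=1):
    """Linear extrapolation of the wear indicator, in years of powered-on time."""
    used_points = start_value - current_value
    if used_points <= 0:
        raise ValueError("no wear recorded yet, cannot extrapolate")
    hours_per_point = power_on_hours / used_points
    remaining_points = current_value - end_value
    return remaining_points * hours_per_point / HOURS_PER_YEAR


years = years_until_wearout(current_value=99, power_on_hours=3389)
print(round(years, 1))  # 37.9 (years of powered-on time)
```

Because the indicator has only dropped by a single point, the result is very coarse, and a machine that is not running around the clock would take correspondingly longer in calendar time.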
      'OMV problems' with XU4 and Cloudshell 2? Nope, read this first. 'OMV problems' with Cloudshell 1? Nope, just Ohm's law or queue size.
    • tkaiser wrote:

      on an OMV system partition a periodic 'dpkg --verify' should also do the job to alert silent data corruption
      What should the output look like?
      Something like this? If I understood Google correctly, the 5 between the question marks means that the md5sum check is OK.

      root@bananas:~# dpkg --verify
      ??5?????? c /etc/cron-apt/config
      ??5?????? c /etc/cron-apt/action.d/3-download
      ??5?????? c /etc/lirc/lircd.conf
      ??5?????? c /etc/lirc/hardware.conf
      ??5?????? c /etc/watchdog.conf
      ??5?????? c /etc/default/monit
      ??5?????? c /etc/monit/monitrc
      ??5?????? c /etc/ntp.conf
      ??5?????? c /etc/collectd/collectd.conf
      ??5?????? /etc/cron.daily/log2ram
      ??5?????? /etc/update-motd.d/99-point-to-faq
      ??5?????? /etc/default/cpufrequtils
      ??5?????? c /etc/hdparm.conf
      ??5?????? c /etc/default/smartmontools
      ??5?????? c /etc/smartd.conf
      ??5?????? c /etc/cron.daily/mdadm
      ??5?????? c /etc/sysctl.conf
      ??5?????? c /etc/perl/sitecustomize.pl
      ??5?????? c /etc/avahi/avahi-daemon.conf
      ??5?????? c /etc/default/avahi-daemon
      ??5?????? c /etc/issue
      ??5?????? c /etc/default/acpid
      ??5?????? c /etc/default/collectd
      ??5?????? c /etc/default/rrdcached
      ??5?????? c /etc/init.d/rrdcached
      ??5?????? c /etc/default/openmediavault
      ??5?????? c /etc/rsyslog.conf
      BananaPi - armbian - OMV4.x | Asrock Q1900DC-ITX - 16GB - 2x Seagate ST3000VN000 - 1x Intenso SSD 120GB - OMV3.x 64bit
    • Great, I will shut off SMART on the SSD system disk then.

      I did what you said, and this is the result.
      Does it look good?

      root@nas:~# dpkg --verify
      ??5?????? c /etc/cron-apt/config
      ??5?????? c /etc/default/transmission-daemon
      ??5?????? c /etc/transmission-daemon/settings.json
      ??5?????? c /etc/watchdog.conf
      ??5?????? c /etc/default/monit
      ??5?????? c /etc/monit/monitrc
      ??5?????? c /etc/ntp.conf
      ??5?????? c /etc/collectd/collectd.conf
      ??5?????? /var/lib/smartmontools/drivedb/drivedb.h
      ??5?????? c /etc/default/smartmontools
      ??5?????? c /etc/smartd.conf
      ??5?????? c /etc/cron.daily/mdadm
      ??5?????? c /etc/default/avahi-daemon
      ??5?????? c /etc/avahi/avahi-daemon.conf
      ??5?????? c /etc/issue
      ??5?????? c /etc/default/acpid
      ??5?????? c /etc/default/collectd
      ??5?????? c /etc/default/rrdcached
      ??5?????? c /etc/default/halt
    • macom wrote:

      What should the output look like?
      Not suspicious ;)

      This command lists all files whose checksums no longer match the ones recorded when the original package was installed. While some files are expected to change (e.g. those below /etc, drivedb.h, or anything with *cache* in its name), a growing number of system files in this list is an indication of data corruption. As an example see here: Protocol "http" not supported or disabled in libcurl
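
To make that rule of thumb concrete, here is a hedged Python sketch that sorts 'dpkg --verify' lines into "expected" and "suspicious". In dpkg's default output format, a '5' in the third position marks a failed md5sum check, a '?' means the check was not performed, and a 'c' before the path marks a conffile. The filter rules (conffiles, /etc, drivedb.h, "cache") come from the explanation above; the function name `classify_verify_output` and the library path in the sample are made up for illustration.

```python
def classify_verify_output(text):
    """Sort 'dpkg --verify' lines into expected and suspicious changes."""
    expected, suspicious = [], []
    for line in text.splitlines():
        line = line.strip()
        flags, _, rest = line.partition(" ")
        # Skip prompts, blank lines and anything without a failed md5sum check.
        if len(flags) != 9 or flags[2] != "5":
            continue
        rest = rest.strip()
        is_conffile = rest.startswith("c ")
        path = rest[2:].strip() if is_conffile else rest
        if (is_conffile or path.startswith("/etc/")
                or path.endswith("drivedb.h") or "cache" in path):
            expected.append(path)
        else:
            suspicious.append(path)
    return expected, suspicious


# The last line is a hypothetical example of a changed system library.
sample = """\
root@nas:~# dpkg --verify
??5?????? c /etc/smartd.conf
??5?????? /var/lib/smartmontools/drivedb/drivedb.h
??5?????? /usr/lib/x86_64-linux-gnu/libexample.so.1
"""

expected, suspicious = classify_verify_output(sample)
print(suspicious)  # ['/usr/lib/x86_64-linux-gnu/libexample.so.1']
```

Run periodically, a list of suspicious paths that keeps growing would be the signal to investigate the disk.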