S.M.A.R.T. Red light on system disk, test passed?

    • S.M.A.R.T. Red light on system disk, test passed?

      Hi,

      I just installed OMV 3.0 on an SSD instead of a normal HDD, and I like how much quicker it feels. But when I check S.M.A.R.T. in OMV, the status light is red, and when I hover over it, it displays:
      "Device has a few bad sectors".

      I ran a test and this is the result:

      smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.9.0-0.bpo.3-amd64] (local build)
      Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

      === START OF INFORMATION SECTION ===
      Model Family:     Intel X18-M/X25-M/X25-V G2 SSDs
      Device Model:     INTEL SSDSA2M080G2GC
      Serial Number:    CVPO012300KP080JGN
      LU WWN Device Id: 5 001517 959270f96
      Firmware Version: 2CV102M3
      User Capacity:    80,025,280,000 bytes [80.0 GB]
      Sector Size:      512 bytes logical/physical
      Rotation Rate:    Solid State Device
      Device is:        In smartctl database [for details use: -P show]
      ATA Version is:   ATA/ATAPI-7 T13/1532D revision 1
      SATA Version is:  SATA 2.6, 3.0 Gb/s
      Local Time is:    Sat Oct 21 11:26:03 2017 CEST
      SMART support is: Available - device has SMART capability.
      SMART support is: Enabled
      AAM feature is:   Unavailable
      APM feature is:   Unavailable
      Rd look-ahead is: Enabled
      Write cache is:   Enabled
      ATA Security is:  Disabled, NOT FROZEN [SEC1]
      Wt Cache Reorder: Unavailable

      === START OF READ SMART DATA SECTION ===
      SMART overall-health self-assessment test result: PASSED

      General SMART Values:
      Offline data collection status:  (0x00) Offline data collection activity
                                              was never started.
                                              Auto Offline Data Collection: Disabled.
      Self-test execution status:      (   0) The previous self-test routine completed
                                              without error or no self-test has ever
                                              been run.
      Total time to complete Offline
      data collection:                 (   1) seconds.
      Offline data collection
      capabilities:                    (0x71) SMART execute Offline immediate.
                                              No Auto Offline data collection support.
                                              Suspend Offline collection upon new
                                              command.
                                              No Offline surface scan supported.
                                              Self-test supported.
                                              Conveyance Self-test supported.
                                              Selective Self-test supported.
      SMART capabilities:            (0x0003) Saves SMART data before entering
                                              power-saving mode.
                                              Supports SMART auto save timer.
      Error logging capability:        (0x01) Error logging supported.
                                              General Purpose Logging supported.
      Short self-test routine
      recommended polling time:        (   1) minutes.
      Extended self-test routine
      recommended polling time:        (   1) minutes.
      Conveyance self-test routine
      recommended polling time:        (   1) minutes.

      SMART Attributes Data Structure revision number: 5
      Vendor Specific SMART Attributes with Thresholds:
      ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
        3 Spin_Up_Time            -----K   100   100   000    -    0
        4 Start_Stop_Count        ----CK   100   100   000    -    0
        5 Reallocated_Sector_Ct   -O--CK   100   100   000    -    2
        9 Power_On_Hours          -O--CK   100   100   000    -    3389
       12 Power_Cycle_Count       -O--CK   100   100   000    -    1757
      192 Unsafe_Shutdown_Count   -O--CK   100   100   000    -    115
      225 Host_Writes_32MiB       ----CK   100   100   000    -    70654
      226 Workld_Media_Wear_Indic -O--CK   100   100   000    -    21434651
      227 Workld_Host_Reads_Perc  -O--CK   100   100   000    -    0
      228 Workload_Minutes        -O--CK   100   100   000    -    4278458468
      232 Available_Reservd_Space PO--CK   100   100   010    -    0
      233 Media_Wearout_Indicator -O--CK   099   099   000    -    0
      184 End-to-End_Error        PO--CK   100   100   090    -    0
                                  ||||||_ K auto-keep
                                  |||||__ C event count
                                  ||||___ R error rate
                                  |||____ S speed/performance
                                  ||_____ O updated online
                                  |______ P prefailure warning

      General Purpose Log Directory Version 1
      SMART Log Directory Version 1 [multi-sector log support]
      Address    Access  R/W   Size  Description
      0x00       GPL,SL  R/O      1  Log Directory
      0x01       GPL,SL  R/O      1  Summary SMART error log
      0x02       GPL,SL  R/O      8  Comprehensive SMART error log
      0x03       GPL,SL  R/O      8  Ext. Comprehensive SMART error log
      0x06       GPL,SL  R/O      1  SMART self-test log
      0x07       GPL,SL  R/O      1  Extended self-test log
      0x09       GPL,SL  R/W      1  Selective self-test log
      0x10       GPL,SL  R/O      1  SATA NCQ Queued Error log
      0x11       GPL,SL  R/O      1  SATA Phy Event Counters log
      0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log

      SMART Extended Comprehensive Error Log Version: 1 (8 sectors)
      No Errors Logged

      SMART Extended Self-test Log Version: 1 (1 sectors)
      Invalid Self-test Log index = 0x0048 (reserved = 0x00)

      SMART Selective self-test log data structure revision number 0
      Note: revision number not 1 implies that no selective self-test has ever been run
       SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
          1        0        0  Not_testing
          2        0        0  Not_testing
          3        0        0  Not_testing
          4        0        0  Not_testing
          5        0        0  Not_testing
      Selective self-test flags (0x0):
        After scanning selected spans, do NOT read-scan remainder of disk.
      If Selective self-test is pending on power-up, resume after 0 minute delay.

      SCT Commands not supported

      Device Statistics (GP/SMART Log 0x04) not supported

      SATA Phy Event Counters (GP Log 0x11)
      ID      Size     Value  Description
      0x0001  4            0  Command failed due to ICRC error
      0x0004  4            0  R_ERR response for host-to-device data FIS
      0x0007  4            0  R_ERR response for host-to-device non-data FIS
      0x0008  4            0  Device-to-host non-data FIS retries
      0x0009  4            8  Transition from drive PhyRdy to drive PhyNRdy
      0x000a  4            9  Device-to-host register FISes sent due to a COMRESET
      0x000b  4            0  CRC errors within host-to-device FIS
      0x000d  4            0  Non-CRC errors within host-to-device FIS
      0x000f  4            0  R_ERR response for host-to-device data FIS, CRC
      0x0010  4            0  R_ERR response for host-to-device data FIS, non-CRC
      0x0012  4            0  R_ERR response for host-to-device non-data FIS, CRC
      0x0013  4            0  R_ERR response for host-to-device non-data FIS, non-CRC

      The result is:
      SMART overall-health self-assessment test result: PASSED

      Why is it showing a red light?
    • Because of this line: 5 Reallocated_Sector_Ct -O--CK 100 100 000 - 2

      See: Red light on a disk, but all green inside
      OMV 3.0.96 x64 on a HP T510, 8GB CF as Boot Disk & 32GB SSD 2,5" disk for Data, 4 GB RAM, CPU VIA EDEN X2 U4200 is x64 at 1GHz

      Post: HPT510 SlimNAS ; HOWTO Install Pi-Hole ; HOWTO install MLDonkey ; HOHTO Install ZFS-Plugin ; OMV_OldGUI ; ShellinaBOX ;
      Dockers: MLDonkey ; PiHole ;
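
The difference between the "PASSED" verdict and the red light can be sketched in a few lines of Python. This is only an illustration, not OMV's actual code: the rows are copied from the smartctl output above, the threshold check is a simplified stand-in for the drive's own pass/fail verdict, and the rule "any nonzero raw Reallocated_Sector_Ct triggers the warning" is an assumption based on the answer above. The function names are made up for this example.

```python
import re

# Attribute rows copied from the smartctl output earlier in this thread.
SMARTCTL_ROWS = """\
  5 Reallocated_Sector_Ct   -O--CK   100   100   000    -    2
  9 Power_On_Hours          -O--CK   100   100   000    -    3389
232 Available_Reservd_Space PO--CK   100   100   010    -    0
233 Media_Wearout_Indicator -O--CK   099   099   000    -    0
"""

# ID, name, flags, VALUE, WORST, THRESH, FAIL, RAW_VALUE
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flags>\S+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<fail>\S+)\s+(?P<raw>\d+)"
)


def parse_attributes(text):
    """Return {attribute_id: row_dict} for every line that matches ROW."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            row = m.groupdict()
            for key in ("id", "value", "worst", "thresh", "raw"):
                row[key] = int(row[key])
            attrs[row["id"]] = row
    return attrs


def below_threshold(attrs):
    """Simplified pass/fail: a normalized VALUE at or below a nonzero THRESH."""
    return any(a["thresh"] > 0 and a["value"] <= a["thresh"] for a in attrs.values())


def has_reallocations(attrs):
    """Assumed GUI rule (a guess): nonzero raw value on attribute 5."""
    return attrs.get(5, {}).get("raw", 0) > 0


attrs = parse_attributes(SMARTCTL_ROWS)
print("threshold failure:", below_threshold(attrs))
print("reallocations seen:", has_reallocations(attrs))
```

With the values from this thread no attribute is at or below its threshold, yet attribute 5 has a raw value of 2, so a check like the second one would still raise a warning.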
    • joq3 wrote:

      raulfg3 wrote:

      because of this line: 5 Reallocated_Sector_Ct -O--CK 100 100 000 - 2

      see: Red light on a disk, but all green inside
      Okay, so something is wrong with the disk after all. It is an 80 GB drive, which I am only using for OMV. Should I replace it, or will it be fine since I am using so little of it?
      I would keep using it until it goes seriously bad. 2 bad sectors is not much, but it can be the start of a trend.
    • macom wrote:

      Might be a good idea to make an image of it using Clonezilla.
      Is it an HDD? Why not replace it with an SSD or a USB thumb drive? That might also save some energy.
      It is an SSD, an 80 GB Intel X25-M (older model).

      raulfg3 wrote:

      I use it until goes seriously bad, 2 bad sector is not so much, but can be a tendence.
      I will take a Clonezilla backup from time to time. Will I get any warnings from OMV if more sectors fail?
    • joq3 wrote:

      will I get any warnings from OMV if more sectors fail?
      SSDs do not have physical sectors the way HDDs do. The only SMART attribute worth a look is '233 Media_Wearout_Indicator' (on Intel SSDs; other vendors use other attributes). It counts down from 100 to 1 and is now at 99. Once it reaches 1 you can take action based on SMART data (at the current usage that will be in a few decades, or even 100+ years).

      In other words: forget about SMART as a health indicator here, it won't work. Do backups, and if I were you I would also use a checksummed filesystem and do regular scrubs (though on an OMV system partition a periodic 'dpkg --verify' should also do the job of alerting you to silent data corruption).
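
As a rough check on the "few decades" figure, here is a back-of-the-envelope extrapolation in Python. It assumes the wear value falls linearly with power-on hours, which is a simplification; the inputs (3389 power-on hours, wearout value 099) come from the smartctl output above, and the function name is made up for this sketch.

```python
HOURS_PER_YEAR = 24 * 365


def hours_until_floor(power_on_hours, current_value, start_value=100, floor=1):
    """Linearly extrapolate a normalized wear value down to `floor`."""
    used = start_value - current_value
    if used <= 0:
        raise ValueError("no wear recorded yet; nothing to extrapolate from")
    hours_per_point = power_on_hours / used
    return (current_value - floor) * hours_per_point


remaining = hours_until_floor(power_on_hours=3389, current_value=99)
years = remaining / HOURS_PER_YEAR
print(f"{remaining:.0f} power-on hours, about {years:.0f} years of 24/7 use")
```

With only one point consumed so far the estimate is very coarse (the value may be about to drop to 98, which would roughly halve it), and the drive's earlier workload may not resemble its use as an OMV system disk. The order of magnitude still lands at a few decades of continuous operation.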
    • tkaiser wrote:

      on an OMV system partition a periodic 'dpkg --verify' should also do the job to alert silent data corruption
      What should the output look like?
      Something like this? If I understood Google correctly, the 5 between the ? characters means that the md5sum check is OK.

      root@bananas:~# dpkg --verify
      ??5?????? c /etc/cron-apt/config
      ??5?????? c /etc/cron-apt/action.d/3-download
      ??5?????? c /etc/lirc/lircd.conf
      ??5?????? c /etc/lirc/hardware.conf
      ??5?????? c /etc/watchdog.conf
      ??5?????? c /etc/default/monit
      ??5?????? c /etc/monit/monitrc
      ??5?????? c /etc/ntp.conf
      ??5?????? c /etc/collectd/collectd.conf
      ??5?????? /etc/cron.daily/log2ram
      ??5?????? /etc/update-motd.d/99-point-to-faq
      ??5?????? /etc/default/cpufrequtils
      ??5?????? c /etc/hdparm.conf
      ??5?????? c /etc/default/smartmontools
      ??5?????? c /etc/smartd.conf
      ??5?????? c /etc/cron.daily/mdadm
      ??5?????? c /etc/sysctl.conf
      ??5?????? c /etc/perl/sitecustomize.pl
      ??5?????? c /etc/avahi/avahi-daemon.conf
      ??5?????? c /etc/default/avahi-daemon
      ??5?????? c /etc/issue
      ??5?????? c /etc/default/acpid
      ??5?????? c /etc/default/collectd
      ??5?????? c /etc/default/rrdcached
      ??5?????? c /etc/init.d/rrdcached
      ??5?????? c /etc/default/openmediavault
      ??5?????? c /etc/rsyslog.conf
      Odroid HC2 - armbian - OMV4.x | Asrock Q1900DC-ITX - 16GB - 2x Seagate ST3000VN000 (rsnapshot) - 1x Intenso SSD 120GB - OMV4.x 64bit
      :!: Backup - Solutions to common problems in OMV - Must see OMV setup videos - OMV4 Documentation :!:
    • Great, I will shut off SMART on the SSD system disk then.

      I did what you suggested. This is the result; does it look good?

      root@nas:~# dpkg --verify
      ??5?????? c /etc/cron-apt/config
      ??5?????? c /etc/default/transmission-daemon
      ??5?????? c /etc/transmission-daemon/settings.json
      ??5?????? c /etc/watchdog.conf
      ??5?????? c /etc/default/monit
      ??5?????? c /etc/monit/monitrc
      ??5?????? c /etc/ntp.conf
      ??5?????? c /etc/collectd/collectd.conf
      ??5?????? /var/lib/smartmontools/drivedb/drivedb.h
      ??5?????? c /etc/default/smartmontools
      ??5?????? c /etc/smartd.conf
      ??5?????? c /etc/cron.daily/mdadm
      ??5?????? c /etc/default/avahi-daemon
      ??5?????? c /etc/avahi/avahi-daemon.conf
      ??5?????? c /etc/issue
      ??5?????? c /etc/default/acpid
      ??5?????? c /etc/default/collectd
      ??5?????? c /etc/default/rrdcached
      ??5?????? c /etc/default/halt
    • macom wrote:

      What should the output look like?
      Not suspicious ;)

      This command lists all files whose checksums do not match the 'expected' ones (as of when the original package was installed). While some files are expected to change (e.g. those below /etc, drivedb.h, or anything with *cache* in its name), an increasing number of changed system files is an indication of data corruption. As an example see here: Protocol "http" not supported or disabled in libcurl
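
The filtering described here could be automated along these lines. In the default `dpkg --verify` output only paths that failed a check are listed; each line starts with nine result characters, where a '5' in the third position means the MD5 check failed and '?' means that particular check was not performed, followed by an optional 'c' for conffiles and then the path. The sketch below treats the "expected to change" categories from this post as a heuristic; the function names and the last sample path are hypothetical, made up for illustration.

```python
import re

# Nine result characters, optional 'c' (conffile) marker, then the path.
LINE = re.compile(r"^(?P<flags>\S{9})\s+(?:(?P<conffile>c)\s+)?(?P<path>/.*)$")


def parse_verify(output):
    """Yield (md5_failed, is_conffile, path) for each `dpkg --verify` line."""
    for line in output.splitlines():
        m = LINE.match(line.strip())
        if m:
            yield (m["flags"][2] == "5", m["conffile"] == "c", m["path"])


def unexpected_changes(output):
    """Paths that failed the MD5 check and match none of the 'expected' cases."""
    hits = []
    for md5_failed, is_conffile, path in parse_verify(output):
        expected = (
            is_conffile
            or path.startswith("/etc/")
            or path.endswith("/drivedb.h")
            or "cache" in path
        )
        if md5_failed and not expected:
            hits.append(path)
    return hits


SAMPLE = """\
??5?????? c /etc/cron-apt/config
??5??????   /var/lib/smartmontools/drivedb/drivedb.h
??5?????? c /etc/default/rrdcached
??5??????   /usr/lib/hypothetical/libexample.so.1
"""  # the last line is invented to show what a suspicious entry looks like

print(unexpected_changes(SAMPLE))
```

Run against the two listings posted in this thread, a filter like this would return an empty list, which matches the "not suspicious" verdict; only something like the invented library path would be flagged.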