[SOLVED] Disks do not spin down automatically

    • OMV 3.x
    • Resolved
    • [SOLVED] Disks do not spin down automatically

      Hi all,
      I am setting up my second OMV server.
      This time I am trying to set a spin-down time for all the data disks.

      My configuration is:
      - No RAID
      - One SSD drive for the operating system + 5 individual SATA drives.
      - Drives are 3x8TB WD RED (5400 rpm) + 1x2.0TB Seagate + 1x1.5TB Seagate
      - SMART is enabled on all the disks with the following parameters: Power Mode = STANDBY and Check Interval = 1800
      - No scheduled jobs

      So, for every disk, I have set in Storage -> Physical Disks -> Physical disk Properties:
      APM = 1 - Minimum power usage with standby (spindown)
      AAM = Disabled
      Spindown time = 60 minutes

      But the disks never spin down. They always stay on.

      I tried hdparm -C /dev/sdX and can confirm that the disks are ACTIVE/IDLE.
      I even tried to spin down the disks manually, but after a few seconds they spun up again (an exception was logged in the system log).
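
      The state check above can be repeated for several disks with a small helper. This is only a minimal sketch: it assumes the usual hdparm -C output format (a line such as " drive state is:  standby"), and drive_state and the device glob in the usage comment are hypothetical names to adapt to your own system.

```shell
# Print the power state found in `hdparm -C` output read from stdin.
# Assumes the usual output format, e.g. " drive state is:  standby".
drive_state() {
    sed -n 's/^[[:space:]]*drive state is:[[:space:]]*//p'
}

# Usage (as root; the glob is a placeholder for your own data disks):
#   for dev in /dev/sd[b-f]; do
#       printf '%s: %s\n' "$dev" "$(hdparm -C "$dev" | drive_state)"
#   done
```

      The helper only parses text, so it can be tried on saved hdparm output before running it against real disks.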

      I don't know if it is related, but using iotop, I also noticed that every 10 seconds 2 tasks come up:
      - rrdcached -l unix:/var/run/rrdcached.sock -j ~/ -B -w 900 -f 3600 -p /var/run/rrdcached.pid
      - jbd2/sda1-8

      Also in the system log there are some CRON jobs running. I haven't set any myself. Probably they are system CRON jobs.

      Any idea?
    • Hi @tkaiser,
      how can all 5 data disks have activity?
      I just installed OMV with standard parameters.
      No tweaking yet.
      No access from clients, as there are no clients yet.
      Also, one of the disks is NOT even referenced: only a file system was created, no shared folder. So nothing outside the OMV core could be accessing it.

      I will try iosnoop, but as a completely new installation there should be no IO access to data disks.

      What am I doing wrong?
    • tkaiser wrote:

      Better ask iosnoop than people far away, having no access to your system and not the slightest idea how you configured stuff.
      Hi @tkaiser,
      I have been trying to run iosnoop but I always get permission denied.
      Scripts downloaded from github.com/brendangregg/perf-tools.

      Run as root and as my local user with sudo, all these commands give me the same error: "Permission denied":
      iosnoop, ./iosnoop, sudo iosnoop, sudo ./iosnoop

      If I type sh iosnoop, sudo sh iosnoop or sudo sh ./iosnoop, then I get
      ./iosnoop: 64: ./iosnoop: function: not found

      What am I missing?
      I couldn't find help on other forums.
    • oppo1967o wrote:

      What am I missing?
      If you did not clone the repo, then most probably just the executable bit is missing. And trying to use sh instead of bash won't work. If you're in the directory where the iosnoop script is, the following should fix the access bits and functionality:

      Source Code

      chmod 755 iosnoop
      sudo ./iosnoop
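
      A note on the "function: not found" error: on Debian-based systems such as OMV, sh is normally dash, which does not understand the bash-only function keyword, so the script has to run under bash. A minimal sketch for checking which interpreter a script asks for (script_interpreter is a hypothetical helper name, not part of perf-tools):

```shell
# Print the interpreter named on the first line (shebang) of a script.
# Prints nothing if the file has no shebang line.
script_interpreter() {
    sed -n '1s/^#![[:space:]]*//p' "$1"
}

# Usage, from the directory that holds the script:
#   script_interpreter ./iosnoop    # shows which shell the script expects
#   sudo bash ./iosnoop             # run it explicitly under bash instead of sh
```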

      BTW: The major:minor numbers that iosnoop shows can be translated to the usual device names with e.g.:

      Source Code

      for device in $(ls /sys/dev/block/) ; do (source /sys/dev/block/${device}/uevent ; echo -e "${DEVNAME}\t${device}"); done
    • Hi,

      tkaiser wrote:

      If you did not clone the repo, then most probably just the executable bit is missing. And trying to use sh instead of bash won't work. If you're in the directory where the iosnoop script is, the following should fix the access bits and functionality:
      Thanks for your quick reply.
      I had already set iosnoop to 777 before launching it.
      -rwxrwxrwx 1 1000 100 9112 set 6 10:33 iosnoop

      But it doesn't work: "Permission denied".
      Even ./iosnoop -h or sudo ./iosnoop -h gives "Permission denied"

      Also running the commands logged as root gives Permission denied.

      I am puzzled...

      P.S. The second script works and gave me the major:minor translation
    • Hi Community,
      I am writing again to get support on my problem.

      As in the first post, my data disks do not spin down. I have set all the relevant parameters, but they simply always stay active.

      I have tried to get support on this issue in this thread, but I have received an answer to use an external tool to discover what is writing to the disks.
      Unfortunately the tool does not run.

      Also, there is nothing writing to the disks as there is no read/write noise coming from the disks.
      And one of the disks is also not referenced, so nothing can write to it.

      So I am back to step 0 and need to get support.

      What can I do to solve my problem?

      thanks a lot.
    • Hi,
      this is the result of smartctl --all on one of the disks (the one that is not referenced):

      Source Code

      xxxxxxx@openmediavault:~/perf-tools$ sudo smartctl --all /dev/sdd
      smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.9.0-0.bpo.3-amd64] (local build)
      Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

      === START OF INFORMATION SECTION ===
      Model Family: Western Digital Red
      Device Model: WDC WD80EFZX-68UW8N0
      Serial Number: XXXXXXXXXX
      LU WWN Device Id: 5 000cca 3b7cb57d1
      Firmware Version: 83.H0A83
      User Capacity: 8.001.563.222.016 bytes [8,00 TB]
      Sector Sizes: 512 bytes logical, 4096 bytes physical
      Rotation Rate: 5400 rpm
      Form Factor: 3.5 inches
      Device is: In smartctl database [for details use: -P show]
      ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
      SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
      Local Time is: Tue Oct 24 13:03:16 2017 CEST
      SMART support is: Available - device has SMART capability.
      SMART support is: Enabled

      === START OF READ SMART DATA SECTION ===
      SMART overall-health self-assessment test result: PASSED

      General SMART Values:
      Offline data collection status: (0x82) Offline data collection activity
                                             was completed without error.
                                             Auto Offline Data Collection: Enabled.
      Self-test execution status: ( 0) The previous self-test routine completed
                                       without error or no self-test has ever
                                       been run.
      Total time to complete Offline
      data collection: ( 101) seconds.
      Offline data collection
      capabilities: (0x5b) SMART execute Offline immediate.
                           Auto Offline data collection on/off support.
                           Suspend Offline collection upon new
                           command.
                           Offline surface scan supported.
                           Self-test supported.
                           No Conveyance Self-test supported.
                           Selective Self-test supported.
      SMART capabilities: (0x0003) Saves SMART data before entering
                                   power-saving mode.
                                   Supports SMART auto save timer.
      Error logging capability: (0x01) Error logging supported.
                                       General Purpose Logging supported.
      Short self-test routine
      recommended polling time: ( 2) minutes.
      Extended self-test routine
      recommended polling time: (1209) minutes.
      SCT capabilities: (0x003d) SCT Status supported.
                                 SCT Error Recovery Control supported.
                                 SCT Feature Control supported.
                                 SCT Data Table supported.

      SMART Attributes Data Structure revision number: 16
      Vendor Specific SMART Attributes with Thresholds:
      ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
      2 Throughput_Performance 0x0005 129 129 054 Pre-fail Offline - 124
      3 Spin_Up_Time 0x0007 209 209 024 Pre-fail Always - 183 (Average 448)
      4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 15
      5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
      7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
      8 Seek_Time_Performance 0x0005 128 128 020 Pre-fail Offline - 18
      9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 207
      10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
      12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 13
      22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
      192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 31
      193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 31
      194 Temperature_Celsius 0x0002 166 166 000 Old_age Always - 36 (Min/Max 21/44)
      196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
      197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
      198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
      199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0

      SMART Error Log Version: 1
      No Errors Logged

      SMART Self-test log structure revision number 1
      No self-tests have been logged. [To run self-tests, use: smartctl -t]

      SMART Selective self-test log data structure revision number 1
      SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
      1 0 0 Not_testing
      2 0 0 Not_testing
      3 0 0 Not_testing
      4 0 0 Not_testing
      5 0 0 Not_testing
      Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
      If Selective self-test is pending on power-up, resume after 0 minute delay.

      Just to make sure I get it right:
      You are suggesting to disable SMART from the web interface for the device and
      then activate SMART from the terminal using smartctl -s on /dev/sdX?
      What do you mean by the '-o' section? Do I have to set the '-o' flag too? To on or off?

    • oppo1967o wrote:

      You are suggesting to disable SMART from the web interface for the device and
      then activate SMART from the terminal using smartctl -s on /dev/sdX?
      No, I was suggesting to read the full -o section of the smartctl manual page to get an idea of what happens with respect to -s on/off:

      Source Code

      -o VALUE, --offlineauto=VALUE
      [ATA only] Enables or disables SMART automatic offline test, which scans the drive every four hours for disk defects. This command can be given during normal system operation. The valid arguments to this option are on and off.
      Note that the SMART automatic offline test command is listed as "Obsolete" in every version of the ATA and ATA/ATAPI Specifications. It was originally part of the SFF-8035i Revision 2.0 specification, but was never part of any ATA specification. However it is implemented and used by many vendors. You can tell if automatic offline testing is supported by seeing if this command enables and disables it, as indicated by the 'Auto Offline Data Collection' part of the SMART capabilities report (displayed with '-c').
      SMART provides three basic categories of testing. The first category, called "online" testing, has no effect on the performance of the device. It is turned on by the '-s on' option.
      The second category of testing is called "offline" testing. This type of test can, in principle, degrade the device performance. The '-o on' option causes this offline testing to be carried out, automatically, on a regular scheduled basis. Normally, the disk will suspend offline testing while disk accesses are taking place, and then automatically resume it when the disk would otherwise be idle, so in practice it has little effect. Note that a one-time offline test can also be carried out immediately upon receipt of a user command. See the '-t offline' option below, which causes a one-time offline test to be carried out immediately.
      The choice (made by the SFF-8035i and ATA specification authors) of the word testing for these first two categories is unfortunate, and often leads to confusion. In fact these first two categories of online and offline testing could have been more accurately described as online and offline data collection.
      The results of this automatic or immediate offline testing (data collection) are reflected in the values of the SMART Attributes. Thus, if problems or errors are detected, the values of these Attributes will go below their failure thresholds; some types of errors may also appear in the SMART error log. These are visible with the '-A' and '-l error' options respectively.
      Some SMART attribute values are updated only during off-line data collection activities; the rest are updated during normal operation of the device or during both normal operation and off-line testing. The Attribute value table produced by the '-A' option indicates this in the UPDATED column. Attributes of the first type are labeled "Offline" and Attributes of the second type are labeled "Always".
      The third category of testing (and the only category for which the word 'testing' is really an appropriate choice) is "self" testing. This third type of test is only performed (immediately) when a command to run it is issued. The '-t' and '-X' options can be used to carry out and abort such self-tests; please see below for further details.
      Any errors detected in the self testing will be shown in the SMART self-test log, which can be examined using the '-l selftest' option.
      Note: in this manual page, the word "Test" is used in connection with the second category just described, e.g. for the "offline" testing. The words "Self-test" are used in connection with the third category.
      And maybe it would help others who are willing to help to get an idea of your setup, e.g. all disks behind a RAID controller in JBOD mode (actively preventing disk spindown) or stuff like that...
    • Hi,
      I have been reading the smartctl manual to understand what could possibly be preventing the hard drives from spinning down.
      I disabled SMART in the general settings and disabled SMART for the single HD, but if I type smartctl --info I still get SMART support is: Enabled.
      I tried hard, but I don't understand what I should be looking for.

      I am not using RAID. I just use single disks mounted.

      Honestly I just wanted to install a NAS software, I am not an IO expert.
      I need help to debug my current setup. What tests should I perform?
      Maybe a BIOS setting?

      thanks a lot.
    • I am not sure if it is correct or not, but I was reading some remarks that WD reds are not friends with hdparm. Hdparm is the program used to spin down the drives.
      An alternative would be to use hd-idle instead.

      You can search the forum, there is an installation guide for hd-idle.
    • Hi all,
      I think that I solved the problem.
      @macom: hdparm is perfectly compatible with WD Red disks, at least with mine.

      The problem was related to the SMART check interval.
      As I explained in my first post, it was set to 1800 seconds while the spin down was set to 60 minutes.

      What is not documented is that each SMART check resets the 60-minute spin-down counter, even if there is no other disk access. With a check every 1800 seconds (30 minutes), the 60-minute idle condition is never met.

      Working settings:
      Spin down time = 60 minutes
      SMART check interval = 4000 seconds

      Hope this can help others.
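
      The arithmetic behind this fix, as a minimal sketch. It assumes, as described above, that every SMART check restarts the spin-down countdown, so the check interval has to be longer than the spin-down time. interval_allows_spindown is a hypothetical helper name, not an OMV or smartmontools command:

```shell
# Succeeds (exit status 0) when the SMART check interval is longer than the
# spin-down time, so the idle timeout can expire between two checks.
# $1 = spin-down time in minutes, $2 = SMART check interval in seconds.
interval_allows_spindown() {
    [ "$2" -gt $(( $1 * 60 )) ]
}

# The two configurations from this thread (60 minutes = 3600 seconds):
interval_allows_spindown 60 1800 || echo "1800 s: the check fires before the 60 min timeout"
interval_allows_spindown 60 4000 && echo "4000 s: the timeout can expire between checks"
```

      Any interval above 3600 seconds should satisfy the same condition for a 60-minute spin-down time; 4000 is simply the value that worked here.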