Why is my HDD waking up?

    • Hi!
      I just finished building my first NAS with a Rock64 with eMMC, one 3 TB WD Elements external hard drive, and one 4 TB WD Red with the HD622 board.
      The issue I'm having is that something wakes up my drives and I can't figure out what it is. I'm using a new installation of OMV from SourceForge with omv extras, flashmemory, shell in a box, docker (with no containers for now), smb and ssh running.
      Changes in the "Physical disk properties" seem to have no effect (I tried setting spindown time to "disabled" but the disks are spinning down anyway). I'm guessing that the USB-to-SATA controllers have spindown built in and it can't be changed in OMV. Anyway, I'm not really interested in turning it off, I just want to know why the disks spin back up for no apparent reason.
      I currently have S.M.A.R.T. enabled (check interval: 1800, power mode: standby). I tried disabling it but that didn't work. I also tried iosnoop, but nothing showed up for the hard drives, only eMMC activity:

      Source Code

      root@rock64:~# ./iosnoop
      Tracing block I/O. Ctrl-C to end.
      COMM         PID    TYPE DEV      BLOCK                BYTES   LATms
      mmcqd/0      156    WS   179,0    478968               65536    2.00
      mmcqd/0      156    WS   179,0    479096               65536    1.71
      mmcqd/0      156    WS   179,0    478968               65536    1.78
      mmcqd/0      156    WS   179,0    479096               65536    1.67
      mmcqd/0      156    WS   179,0    479096               65536    1.58
      mmcqd/0      156    WSM  179,0    3626304              32768    0.93
      mmcqd/0      156    WSM  179,0    4150592              32768    1.76
      mmcqd/0      156    WSM  179,0    3626368              245760   3.56
      mmcqd/0      156    WSM  179,0    4150656              245760   5.22
      mmcqd/0      156    FWS  179,0    18446744073709551615 0        0.15
      mmcqd/0      156    WFS  179,0    163968               4096     0.81
      mmcqd/0      156    WS   179,0    294912               4096     0.59


      I found this thread but I can't get the script to work:

      Source Code

      root@rock64:~# ./find_culprit.sh sdb
      -bash: ./find_culprit.sh: /bin/sh^M: bad interpreter: No such file or directory

      I'm new to Linux so I must be doing something wrong. What I did was: create the file "find_culprit.sh" with notepad++ on windows and move it to /root using filezilla, chmod 755 find_culprit.sh, ./find_culprit.sh sdb. Again, I don't know if this is the right way to do it, but this is what I figured out, and it took me a long time because I'm a total noob :) .
      I also found this thread where this user had the same issue I have and managed to track it down to "webmin". I have never heard of webmin and I am pretty sure I haven't installed it on my system.

      I'm looking for some tips.

      Thanks!
    • FabrizioMaurizio wrote:

      What I did was: create the file "find_culprit.sh" with notepad++ on windows
      There's a general problem since Windows uses different 'new line' characters compared to Unix/Linux: en.wikipedia.org/wiki/Newline#Representation

      Since I don't use Windows I've no idea whether your program allows you to choose Unix/Linux style (LF) instead of Windows (CR LF) when saving files.
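
If the editor can't save with LF endings, the carriage returns can also be stripped on the Linux side. A minimal sketch, assuming a POSIX shell with tr available; strip_cr is just an illustrative helper name, not something from the original script:

```shell
# Print the file given as $1 with every carriage return (CR) character deleted,
# which turns Windows CR LF line endings into plain LF.
strip_cr() {
    tr -d '\r' < "$1"
}
```

For example: strip_cr find_culprit.sh > find_culprit.fixed, then chmod 755 find_culprit.fixed and run the fixed copy.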
    • OMV uses the program hdparm to send the changed "Physical disk properties" to the disk. It seems that hdparm is not compatible with some HDDs. In some cases this shows up as the HDD unmounting.

      So first I'd try to disable everything on the page for "Physical disk properties".

      If a HDD is spun down, then running a SMART test on it may spin it up. So disable any and all SMART tests. Or possibly set a long interval between tests. Like 24 hours.

      Then try to reboot and see if things work better.
      OMV 4, 7 x ODROID HC2, 1 x ODROID HC1, 3 x 12TB, 2 x 8TB, 1 x 4TB, 1 x 2TB SSHD, 1 x 500GB SSD, GbE, WiFi mesh
    • I usually copy a script by clicking the small icon in the upper right corner of the "source code" box, then paste it into a plain text file.

      I have attached the file below for you. Please rename it (remove the .txt) and make it executable. The .sh file extension is not mandatory; the script should also work without one. Create a different directory than / (which is root) for scripts e.g. "/scripts", copy the script to this folder and change to this folder: "cd /scripts".

      I can't remember ever having problems with line feed characters. I am using WinSCP to connect to my OMV box. You could also try to create a new empty file directly in the target folder with WinSCP or any other tool that is suitable for Unix files, and then paste the content of the script directly into this file.
      Files
      • find_culprit.txt

      OMV 3.0.90 (Gray style)
      ASRock Rack C2550D4I - 16GB ECC - 6x WD RED 3TB (ZFS 2x3 Striped RaidZ1)- Fractal Design Node 304

    • tkaiser wrote:

      There's a general problem since Windows uses different 'new line' characters compared to Unix/Linux
      That was the issue. Luckily notepad++ lets you choose the new line character, so that was an easy fix.

      cabrio_leo wrote:

      Create a different directory than / (which is root) for scripts e.g. "/scripts", copy the script to this folder and change to this folder: "cd /scripts".
      Thanks for the advice!

      Adoby wrote:

      So first I'd try to disable everything on the page for "Physical disk properties".

      If a HDD is spun down, then running a SMART test on it may spin it up. So disable any and all SMART tests. Or possibly set a long interval between tests. Like 24 hours.
      Ok, so I disabled everything in "Physical disk properties", turned off SMART and rebooted. Then I ran the find_culprit script and it looked like it was working.

      Source Code

      root@rock64:/scripts# ./find_culprit sda
      Putting the disk sda into standby...
      Checking the status of sda.
      Drive is still in standby. Sleeping 45 seconds...
      Drive is still in standby. Sleeping 45 seconds...

      But when the disk woke up, the script didn't show any message; it kept saying "Drive is still in standby". I then ran hdparm -C /dev/sda and it also reported the drive as being in standby even though it was actually active.

      Source Code

      root@rock64:~# hdparm -C /dev/sda
      /dev/sda:
      drive state is: standby
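
Since hdparm -C may not be reliable behind this USB-to-SATA bridge, another way to notice activity is to watch the kernel's I/O counters for the device instead of asking the drive. A rough sketch, assuming Linux with /proc/diskstats; the function names and the 45-second default are illustrative choices, and it only sees block-layer I/O, so commands passed straight through to the drive (SMART queries, for instance) may not show up:

```shell
# Print "reads_completed writes_completed" for one block device (e.g. sda).
# Reads /proc/diskstats by default; a different stats file can be given as $2.
disk_io_counters() {
    dev=$1
    stats_file=${2:-/proc/diskstats}
    # Field 3 is the device name, field 4 reads completed, field 8 writes completed.
    awk -v dev="$dev" '$3 == dev { print $4, $8 }' "$stats_file"
}

# Poll the counters every $2 seconds (default 45) and log each change.
# Runs until interrupted with Ctrl-C.
watch_for_io() {
    dev=$1
    interval=${2:-45}
    last=$(disk_io_counters "$dev")
    while sleep "$interval"; do
        now=$(disk_io_counters "$dev")
        if [ "$now" != "$last" ]; then
            echo "$(date '+%F %T') I/O on $dev: reads/writes $last -> $now"
            last=$now
        fi
    done
}
```

Running watch_for_io sda in one terminal while iosnoop runs in another might help match a spin-up to the process that caused it.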

    • The user in this thread also had the same issue I'm having and he managed to find out what was causing it. But I didn't quite catch how he managed to do that.

      lgr37 wrote:

      Well, I was digging a little bit deeper into the system and disabled temporarily the services collectd and monit. But this wasn't the clue.
      Then I read the great how to including test script in the following thread:
      My Guide to Debugging Disk Spin-ups

      So I have found, that "parted" was checking my devices over and over again. Unfortunately I was not able to discover which process was triggering parted. I've simply disabled the system tool via renaming it inside /sbin.

      I'm still curious, what's behind this process. So if anybody has an idea I'd be glad to read it here..
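
One possible way to see who starts a program like parted is to walk up the process tree from inside a small wrapper. A sketch only, assuming Linux with /proc mounted; ancestors is a made-up helper name:

```shell
# Print "pid name" for the given PID and each of its ancestors,
# following the PPid field in /proc/<pid>/status up to the top of the tree.
ancestors() {
    pid=${1:-0}
    while [ "${pid:-0}" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        name=$(awk '/^Name:/ { print $2 }' "/proc/$pid/status")
        echo "$pid $name"
        pid=$(awk '/^PPid:/ { print $2 }' "/proc/$pid/status")
    done
}
```

A hypothetical wrapper script placed where parted used to be could append the output of ancestors $$ to a log file and then exec the renamed binary. The log and binary paths would be placeholders to pick yourself, they are not something from the thread.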
    • I think the issue is gone. I renamed "parted", disabled smb and deleted the active rsync jobs, and then the issue disappeared. I then undid each change one by one to find the culprit, but the issue never came back. So now it's gone but I don't know what caused it in the first place. I have the smb share mounted on my Nvidia Shield so it could have been that, even though I'm pretty sure there wasn't any smb activity in the iosnoop output. Oh well, at least it's gone. Thank you all for your time.
    • I need to test this more but I think SMART was causing the disk to spin up. Anyway, I need to change the default spindown time for the HD622 (which currently is 10 minutes) to something higher, like 60 minutes. I tried hdparm and hd-idle but neither of them works. Can anyone suggest a solution (replacing the HD622 with something else is also an acceptable solution)?
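
For reference, on adapters that do pass the setting through, hdparm -S does not take minutes directly: per the hdparm man page, values 1 to 240 are multiples of 5 seconds and values 241 to 251 are multiples of 30 minutes. A small sketch of that conversion; hdparm_s_value is an illustrative helper, not part of hdparm, and it won't help if the bridge firmware ignores the command:

```shell
# Convert a spindown timeout in minutes to the value expected by "hdparm -S".
# Covers 0 (disable), 1 to 20 minutes, and multiples of 30 minutes up to 330.
# The special values 252 to 255 are deliberately not handled here.
hdparm_s_value() {
    minutes=$1
    if [ "$minutes" -ge 0 ] && [ "$minutes" -le 20 ]; then
        # 1..240 are units of 5 seconds, i.e. 12 units per minute.
        echo $(( minutes * 12 ))
    elif [ $(( minutes % 30 )) -eq 0 ] && [ "$minutes" -le 330 ]; then
        # 241..251 are units of 30 minutes.
        echo $(( 240 + minutes / 30 ))
    else
        echo "unsupported timeout: $minutes minutes" >&2
        return 1
    fi
}
```

So 60 minutes would be hdparm -S 242 /dev/sdX, and the current 10 minutes would correspond to hdparm -S 120 /dev/sdX.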
    • As far as I know this tool is the successor of the older wdidle3 tool, and it increases the timer on the WD RED from a few seconds to 300 s. That means the heads are no longer moved to the parking position every few seconds, but only after 5 minutes.

      You can also query the current head parking time of the WD REDs with the wdidle3 tool.

      In my opinion the head parking "feature" is independent of spindown. This means the heads are parked while the drive is still spinning. But I could also be wrong.
    • Yeah I've read about that. I tried idle3 to query the parking time but it doesn't work. I guess the hard drive must be attached to a sata controller and not a sata to usb adapter.
      However, I've read that the low head parking time was only an issue on the early versions of the reds and greens and that WD are selling reds with 300 seconds head parking time from factory. I noticed that roughly 5 minutes after the disk wakes up, the disk makes a click noise. That might be the head parking noise. The disk continues spinning for another 5 minutes and then it goes to sleep.
      I'm gonna give the WD utility a shot for confirmation after the backup is done.
      I'm afraid that the spindown time is handled by the sata to usb adapter and it cannot be changed. Thus, I need to buy a different one. I'm also afraid I need to settle for a single sata port. Can you recommend one?
    • FabrizioMaurizio wrote:

      I guess the hard drive must be attached to a sata controller and not a sata to usb adapter.
      Most likely.

      FabrizioMaurizio wrote:

      That might be the head parking noise
      Yes.

      FabrizioMaurizio wrote:

      I'm gonna give the WD utility a shot for confirmation after the backup is done.
      Probably nothing will change if your WD Reds are working with 300s already. But having a backup is always a useful thing :)
    • Update: I've hooked up the hard drive to my desktop pc but unfortunately it can't be detected. I'm guessing it's because the pc is running Windows and the drive is formatted in ext4. I don't think it's worth wasting time trying to work around this problem since I'm pretty sure this won't solve my issue, which is raising the spindown time.

      I've asked on the pine64 forum if their usb to sata works with hdparm but I've got no answer. @tkaiser I noticed you reviewed it, can you please tell me if hdparm works as intended with it?
    • FabrizioMaurizio wrote:

      I've asked on the pine64 forum if their usb to sata works with hdparm but I've got no answer. @tkaiser I noticed you reviewed it, can you please tell me if hdparm works as intended with it?
      No idea since this always depends on the firmware of such USB-to-SATA bridges (see the spindown hassles ODROID HC1/HC2 have/had which also rely on JMS578). Maybe you need to flash a different firmware? But if I understand correctly your problem is your disks spinning up while they shouldn't (which is somewhat unrelated to hdparm issues like drives not going to sleep while they should)
    • tkaiser wrote:

      But if I understand correctly your problem is your disks spinning up while they shouldn't
      That was my initial problem but I think I figured that out. I think it was caused by SMART, not 100% sure though. My problem now is that the disk spins down after only 10min of inactivity and there seems to be no way of increasing that. I can settle for an adapter that doesn't work with hdparm but has a higher spindown time by default. Do you happen to know what is the factory spindown time for the pine64 adapter?
    • Ok, I hear you. I just realized that pine64 charges $12 for shipping so their cable is not an option anymore :) . I think I'm going to try my luck on amazon and buy something with the jms578 chip. At worst I can return it.

      Meanwhile my drive has been randomly woken up by jbd2:

      Source Code

      kworker/u8:1 14923  WM   8,16     3879733856           8192     0.78
      kworker/u8:1 14923  WM   8,16     3879733896           4096     0.82
      kworker/u8:1 14923  WM   8,16     3883927808           16384    0.90
      kworker/u8:1 14923  WM   8,16     3883927856           4096     0.91
      jbd2/sdb1-62 622    WS   8,16     3905922616           98304    0.55
      jbd2/sdb1-62 622    FWS  8,16     18446744073709551615 0       15.86
      <idle>       0      WS   8,16     3905922808           4096     0.25
      cat          14442  FWS  8,16     18446744073709551615 0       10.77
      kworker/u8:2 15047  WM   8,16     3879733576           4096     0.32
      kworker/u8:2 15047  WM   8,16     3879733592           20480    0.55
      kworker/u8:2 15047  WM   8,16     3879733640           4096     0.62
      kworker/u8:2 15047  WM   8,16     3879733680           4096     0.60
      kworker/u8:2 15047  WM   8,16     3879733720           28672    0.74
      kworker/u8:2 15047  WM   8,16     3879733808           8192     0.75
      kworker/u8:2 15047  WM   8,16     3879733880           4096     0.75
      kworker/u8:2 15047  WM   8,16     3879733928           4096     0.75
      kworker/u8:2 15047  WM   8,16     3879733984           4096     0.74
      kworker/u8:2 15047  WM   8,16     3879734008           4096     0.72
      kworker/u8:2 15047  WM   8,16     3883927864           4096     0.70
      cat          14442  WM   8,16     3883927848           4096     0.74
      jbd2/sdb1-62 622    WS   8,16     3905922816           12288    0.30
      jbd2/sdb1-62 622    FWS  8,16     18446744073709551615 0        6.14
      <idle>       0      WS   8,16     3905922840           4096     0.24
      cat          14442  FWS  8,16     18446744073709551615 0       10.77
      mmcqd/0      171    WS   179,0    6165456              65536    2.15
      mmcqd/0      171    WS   179,0    6165584              131072   2.58
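
To see at a glance which processes touched the hard drive, the iosnoop output can be filtered by device number and summed per process. A quick sketch, assuming the column layout shown above and that 8,16 is /dev/sdb (which matches the jbd2/sdb1 thread name); summarize_iosnoop is a made-up name:

```shell
# Read iosnoop output on stdin and print "total_bytes process_name" for the
# device given as "major,minor" in $1, largest total first.
# Assumes process names contain no spaces, as in the output above.
summarize_iosnoop() {
    awk -v dev="$1" '
        $4 == dev { bytes[$1] += $6 }   # column 4 is DEV, column 6 is BYTES
        END { for (c in bytes) print bytes[c], c }
    ' | sort -rn
}
```

One way to use it is to save a trace first (./iosnoop > /tmp/iosnoop.log, stopped with Ctrl-C) and then run summarize_iosnoop 8,16 < /tmp/iosnoop.log. Note that jbd2/sdb1-62 is the ext4 journal thread for sdb1, so its writes usually follow some earlier change to the filesystem rather than being the original cause.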
    • You can run a search for files accessed during the last n minutes. That might help figure out what happens. This command will list files accessed in /sharedfolders during the last two minutes:

      find /sharedfolders -amin -2

      Run the command after the hdd wakes up. Might give a clue?
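
Access times are not always updated (that depends on mount options such as noatime or relatime), so it may be worth checking modification times as well. A small sketch, assuming GNU find; recent_files is an illustrative name:

```shell
# List regular files under directory $1 that were accessed OR modified
# less than $2 minutes ago.
recent_files() {
    find "$1" -type f \( -amin "-$2" -o -mmin "-$2" \) -print
}
```

For example: recent_files /sharedfolders 2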