S.M.A.R.T Temperature Alarms

  • I have S.M.A.R.T enabled and I started seeing these logs for one of my drives. According to the S.M.A.R.T page, the temperature of the drive is at 47 deg C.


    smartd[2857]: Device: /dev/disk/by-id/usb-WD_My_Book_25EE_574343374B37444E58523048-0:0 [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 105 to 106


    Is there something really wrong with the drive, or am I getting false notifications?

  • Where are you seeing 47c? 105/106f is 40/41c.


    47c is way too hot. 40/41c isn’t great either, but let’s look at why the temp is that high.
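For reference, the conversion quoted above can be checked with a couple of lines of shell. This is only a sketch: it assumes the logged 105/106 really are Fahrenheit readings, which later replies in this thread dispute, and `f_to_c` is a made-up helper name.

```shell
# Hypothetical helper: convert Fahrenheit to Celsius, rounded to one decimal.
# Assumption: the logged numbers are Fahrenheit readings (disputed later in the thread).
f_to_c() {
    awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 32) * 5 / 9 }'
}

f_to_c 105   # prints 40.6
f_to_c 106   # prints 41.1
```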


    Here are the questions that everyone in your position should run through when in a similar situation:


    What’s the ambient temp?
    Is this a warm time of year where you are, i.e. is this rise out of the norm?
    What case do you use, i.e. does it have adequate cooling?
    What model HDD is it?
    If there are other drives, what do they read at the same time and under the same conditions?
    How long has the HDD been powered on?
    Is the drive under load?


    You could also check smart for:


    How many power on hours has the drive endured?
    What is the load cycle count of the drive?
    How high (temp) has the drive been in the past?


    Also check other smart values to see if any are showing signs of failure.
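The SMART checks listed above can be narrowed down to a few attribute rows. A minimal sketch, assuming the drive reports them under the usual smartctl names `Power_On_Hours`, `Load_Cycle_Count` and `Temperature_Celsius` (names can differ per drive); `smart_history_rows` is a made-up helper name.

```shell
# Hypothetical helper: keep only the attribute rows that answer the questions above.
# Reads a smartctl attribute table on stdin; the attribute name is the second column.
smart_history_rows() {
    awk '$2 == "Power_On_Hours" || $2 == "Load_Cycle_Count" || $2 == "Temperature_Celsius"'
}

# Usage on a live system (assumption: the disk is /dev/sdb behind a SAT bridge):
#   smartctl -d sat -A /dev/sdb | smart_history_rows
```

The past temperature extremes are usually not in this table; where the drive supports it, they show up in the SCT temperature section of the `smartctl -x` output instead.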


    If, for example, you have decent cooling, low ambient temps, are in a cooler part of the world, the drive wasn’t under load, hasn’t been powered on for long and you have 3 other identical disks in identical conditions with temps 10c lower, then 40/41c isn’t great - replace the drive.


    If, however, you’re somewhere warm, you have a budget chassis with mediocre cooling that sits next to a radiator (a really bad idea anyway), you’ve just copied a terabyte of data to it, and it’s been on for a day/week/month, then it’s not as alarming. It’s probably not in imminent danger of croaking, but you need to change the environment for the HDD; I would ideally want that temperature to come down by AT LEAST 5/6c. My HDDs sit in the 20s most of the time. On a really hot day, low 30s MAX. When the cold weather hits they drop to the teens.


    Now a more specific answer in your case:


    You are using an external HDD in an airflow-restricted case without any cooling (unless one of the My Book series has a fan, but I don’t think so). Manufacturers do this because it’s a consumer product and consumers don’t like those pesky, noisy, troublesome fans. Quiet HDDs are not cool, happy hard drives - fact. Passively cooled HDDs are a con. Manufacturers also love plastic because it’s cheap, but this encases your data in a thermally non-conductive tomb - forget that the HDD may be in contact with metal inside, the outer layer is plastic and plastic does not have good thermal conductivity.


    I’m not surprised your drive is at 41c; it’s not great, but it’s understandable. It’s most likely totally environmental. It could go on for years... it might not, though. If you Google “WD My Book” you get a lot of “recertified” units... heat death?


    As a really quick experiment, try pointing a desk fan so that it blows across the vent on the top edge while slightly hitting that edge, so some air goes down into it - or just point it at the entire unit and experiment. In an ideal world you NEED to get some air over the surface of those drives. If it’s a model with an openable top, get that open and direct air down into it. This is by no means ideal airflow, but it might help you identify whether the plastic tomb is the culprit, which it probably is.




    Sent from my iPhone using Tapatalk

  • Where are you seeing 47c? 105/106f is 40/41c.

    SMART temperature readouts should be in Celsius by definition? A full smartctl -x (not -a) would be the first thing to ask for in my opinion. Those external USB WD drives are known to ship with a broken firmware...


    @utamav Can you please log in locally or via SSH and, assuming your data drive is /dev/sdb, provide the output from


    Code
    smartctl -q noserial -d sat -x /dev/sdb

    (you may need to adjust 'sdb' of course)
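If it is unclear which `sdX` name to use, the by-id path from the smartd log line can be resolved to its device node. A minimal sketch, assuming GNU coreutils `readlink -f` is available; `resolve_disk` is a made-up helper name.

```shell
# Hypothetical helper: resolve a /dev/disk/by-id/... symlink to its kernel device node.
# Assumption: GNU coreutils `readlink -f` is available (typical on Linux installs).
resolve_disk() {
    readlink -f -- "$1"
}

# Usage with the by-id path from the smartd log line:
#   resolve_disk /dev/disk/by-id/usb-WD_My_Book_25EE_574343374B37444E58523048-0:0
```

Whatever path it prints is the one to pass to smartctl in place of /dev/sdb.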

  • SMART temperature readouts should be in Celsius by definition? A full smartctl -x (not -a) would be the first thing to ask for in my opinion. Those external USB WD drives are known to ship with a broken firmware...


    Very true, if that’s the case - I suspect that’s a normalised value or, like you say, broken firmware or a firmware quirk. I don’t use WD and know nothing about their firmware/issues, but it could be similar to some of the “outrageous” Seagate SMART values that we sometimes see. IIRC, some Seagate drives had/have a ridiculously high raw read error count (like, millions on a brand new drive) but it’s not an indication of drive health. It’s just Seagate being monkeys. :)





  • I don’t use WD and know nothing about their firmware/issues

    That's mostly related to their USB3 disks and problems with UAS (USB Attached SCSI). But 'smartctl -x' should provide more detailed output, and WD disks are usually able to record thermal history. See here for an example in the code block: https://forum.odroid.com/viewt…?t=26016&p=188210#p186635 (so the next thing to ask for would maybe be 'smartctl -l scttemp /dev/sdb')


    on seagate, some of the drives had/have a ridiculously high raw read error count (like, millions on a brand new drive) but it’s not an indication of drive health. It’s just seagate being monkeys.

    Nope, that's just users doing wrong things: 'Interpreting' SMART raw values in a wrong way. Those Seagate numbers when viewed in hexadecimal instead of decimal start to make sense since Seagate uses single attributes to transport more than one value: http://www.users.on.net/~fzabk…Seagate_SER_RRER_HEC.html (TL;DR: version here)
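To illustrate the hexadecimal reading: the sketch below assumes the layout commonly described for Seagate's Raw_Read_Error_Rate, i.e. a 48-bit raw value whose upper 16 bits count errors and whose lower 32 bits count operations. That layout is an assumption here, not something confirmed by Seagate, and `split_seagate_raw` is a made-up name.

```shell
# Hypothetical helper: split a 48-bit raw value into two packed fields.
# Assumption: upper 16 bits = error count, lower 32 bits = operation count.
split_seagate_raw() {
    raw=$1
    printf 'hex=%012x errors=%d operations=%d\n' \
        "$raw" "$(( raw >> 32 ))" "$(( raw & 0xFFFFFFFF ))"
}

# Made-up raw value for illustration (not taken from this thread):
split_seagate_raw 123456789   # prints hex=0000075bcd15 errors=0 operations=123456789
```

Read as one decimal number the value looks alarming; split this way, the assumed error field is zero.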

  • Thank you. The relevant part of the 'smart -x' output as follows:

    So currently your HDD is at either 46°C or 47°C, which is fine. 'Min/Max Temperature Limit' is obviously bogus, or how did you manage to operate your HDD at -41°C? I wouldn't trust it. The recorded maximum temperature (AFAIK over the whole lifetime) is 85°C, which is not that great.


    Wrt the emails you get. It seems the notification feature chooses to focus on 'VALUE' and not 'RAW VALUE' (which would be correct for temperature)


    Code
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
    194 Temperature_Celsius     -O---K   106   099   000    -    46
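The difference between the two columns can be pulled out with a short awk filter. A minimal sketch, assuming the brief column layout shown above (ID, name, flags, VALUE, WORST, THRESH, FAIL, RAW_VALUE); `smart_attr` is a made-up helper name.

```shell
# Hypothetical helper: print the normalized VALUE and the RAW_VALUE of one attribute.
# Assumes the brief layout: ID, name, flags, VALUE, WORST, THRESH, FAIL, RAW_VALUE.
smart_attr() {
    awk -v id="$1" '$1 == id { printf "normalized=%s raw=%s\n", $4, $8 }'
}

# The two lines quoted above, used as sample input:
printf '%s\n' \
    'ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE' \
    '194 Temperature_Celsius     -O---K   106   099   000    -    46' |
    smart_attr 194   # prints normalized=106 raw=46
```

On a live system something like `smartctl -d sat -A -f brief /dev/sdb | smart_attr 194` should produce the same kind of line (adjust the device name as needed).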


    So my personal recommendation is to switch off these notifications since they are only misleading. If your disk is operated at a constant temperature level, that's fine. Some people might disagree and quote some studies done on the subject. My personal opinion is that it's better to ignore them since the authors were biased. Better to focus on vibrations than on temperature (as long as the latter is constant).


    If you want details see my discussion with 'nobe' in this comment thread: https://www.cnx-software.com/2…enchmarks/#comment-543191

    • Official post
    Code
    Current Temperature:                    46 Celsius
    Power Cycle Min/Max Temperature:     46/47 Celsius
    Lifetime    Min/Max Temperature:     20/53 Celsius
    Under/Over Temperature Limit Count:   0/0

    These should be the actual measured temperatures.
    The other ones are limits from the specification, I think.

  • The other ones are limits from the specification, I think

    This depends on the disk in question or let's better say sometimes also on the firmware used inside USB3 enclosures (some firmwares are crap and interfere with the disk information).


    In @utamav's example with his WD Green in a WD My Book enclosure, only this is available:


    Code
    Min/Max recommended Temperature:      0/60 Celsius
    Min/Max Temperature Limit:           -41/85 Celsius

    The first line is the recommendation, the 2nd line is some values that have been recorded (and BS too, since -41°C is somewhat impossible).
    In your example (which disk BTW?) it's way better since it differentiates between 'Power Cycle' and 'Lifetime' thermal limits. So you know that since the last power cycle your HDD has operated at a constant temperature (great!) and that nothing went wrong over the entire lifetime either (both the thermal values and the counters for exceeding the thermal limits are fine).

  • Sorry, was not clear in my post: the lines I posted are the lines 248-251 from the file utamav posted.

    Ah. Now I get it. Thank you. Wrote BS above and should really learn to get some coffee first prior to entering the internet ;)


    @utamav: Everything good with your thermal readouts. Even the lifetime thermal history looks perfectly fine.


  • I personally would not be happy with my disks running in the mid to high 40’s. But as you’ve said, there’s a difference in opinion on this subject. :)





  • I personally would not be happy with my disks running in the mid to high 40’s. But as you’ve said, there’s a difference in opinion on this subject.

    Well, feelings and opinions... https://en.wikipedia.org/wiki/…ilure#Metrics_of_failures (there is no evidence that warm HDDs die earlier than colder ones. This often-cited study provides numbers that even show the opposite. But as usual: doing statistics correctly and differentiating between correlation and causation is difficult)


    If the drive vendor tells you 0-60°C are ok I would be fine with constant 46-47°C :)
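That rule of thumb (current temperature inside the vendor's recommended range) can be checked mechanically. A minimal sketch, assuming the two line formats quoted earlier in this thread from the `smartctl -x` output; `temp_within_recommended` is a made-up helper name.

```shell
# Hypothetical helper: compare the current temperature with the recommended range.
# Reads SCT temperature lines on stdin and prints "within", "outside" or "unknown".
temp_within_recommended() {
    awk '
        /^[ \t]*Current Temperature:/              { cur = $3 }
        /^[ \t]*Min\/Max recommended Temperature:/ { n = split($4, lim, "/") }
        END {
            if (cur == "" || n != 2) { print "unknown"; exit }
            if (cur + 0 >= lim[1] + 0 && cur + 0 <= lim[2] + 0) print "within"
            else print "outside"
        }'
}

# The values quoted in this thread, used as sample input:
printf '%s\n' \
    'Current Temperature:                    46 Celsius' \
    'Min/Max recommended Temperature:      0/60 Celsius' |
    temp_within_recommended   # prints within
```

This only says the reading is inside the vendor's stated range; whether that range is comfortable enough is exactly the difference of opinion above.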
