On Linux, you can use smartmontools to read the SMART data of a drive; see this good introduction to reading SMART data.
Note that if your device is behind a USB controller, you must not use the “-d ata” argument that is normally needed for SATA disks accessed through libata (as per “man smartctl”). If you add “-d ata”, the tool complains “Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)” and neither reads any health information nor changes any setting, even with “-T verypermissive”.
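As a minimal sketch of that rule, the following Python helper builds the smartctl command line and leaves out “-d ata” for USB-attached devices. The function names and the `behind_usb` flag are made up for illustration; only the smartctl options “-a” and “-d ata” come from the text above. Actually running the command requires smartmontools to be installed and usually root privileges.

```python
import subprocess


def smartctl_command(device, behind_usb):
    """Build the argument list for a full SMART report (smartctl -a).

    behind_usb is a hypothetical flag the caller has to set; this sketch
    does not detect USB bridges on its own.
    """
    cmd = ["smartctl", "-a"]
    if not behind_usb:
        # "-d ata" is only meant for SATA disks accessed through libata.
        # Behind a USB controller it leads to the
        # "Device Read Identity Failed" complaint described above.
        cmd += ["-d", "ata"]
    cmd.append(device)
    return cmd


def read_smart_report(device, behind_usb=False):
    """Run smartctl and return its text output (needs smartmontools)."""
    result = subprocess.run(
        smartctl_command(device, behind_usb),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For a USB enclosure that shows up as /dev/sdb (an example device name), `smartctl_command("/dev/sdb", behind_usb=True)` yields the plain `smartctl -a /dev/sdb` call.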
Interpreting SMART attributes in general
Each SMART attribute has several columns, as shown by “smartctl -a <device>”:
- ID: The ID number of the attribute, good for comparing with other lists like Wikipedia: S.M.A.R.T.: Known ATA S.M.A.R.T. attributes because the attribute names sometimes differ.
- Name: The name of the SMART attribute.
- Value: The current, normalized value of the attribute. Higher values are always better (except for temperature on hard disks of some manufacturers). The range is normally 0-100, for some attributes 0-255 (so that 100 or 255, respectively, is best and 0 is worst). There is no standard for how manufacturers convert their raw value to this normalized one: as the normalized value approaches the threshold, it may do so linearly, exponentially, logarithmically or in any other way, meaning that a doubled normalized value does not necessarily mean “twice as good”.
- Worst: The worst (normalized) value that this attribute had at any time while SMART was enabled. There seems to be no mechanism to reset current SMART attribute values, but this column still makes sense because some SMART attributes, for some manufacturers, fluctuate over time, so keeping the worst value ever seen is meaningful.
- Threshold: The threshold below which the normalized value will be considered “exceeding specifications”. If the attribute type is “Pre-fail”, this means that SMART thinks the hard disk is about to fail. This will “trigger” SMART, changing the overall status from “SMART test passed” to “SMART impending failure” or similar.
- Type: The type of the attribute. Either “Pre-fail” for attributes that are said to indicate impending failure, or “Old_age” for attributes that just indicate wear and tear. Note that one and the same attribute can be classified as “Pre-fail” by one manufacturer or for one model and as “Old_age” by another or for another model. This is the case, for example, for attribute Seek_Error_Rate (ID 7): seek errors are a widespread phenomenon on many disks and not considered critical by some manufacturers, but Seagate classifies the attribute as “Pre-fail”.
- Raw value: The current raw value that was converted to the normalized value above. smartctl shows all of them as decimal values, but some attribute values of some manufacturers cannot be reasonably interpreted that way (for details, see Wikipedia: S.M.A.R.T.: Known ATA S.M.A.R.T. attributes).
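To make the columns concrete, here is a small Python sketch that parses an attribute table of the kind smartctl prints into one dictionary per attribute. The sample rows are invented for illustration and are not from a real drive, and the function names are my own. The check treats a normalized value at or below the threshold as exceeded, which is how I read the smartctl man page; take that as an assumption. A threshold of 0 is treated as “can never trigger”.

```python
# Invented sample in the layout of the smartctl attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   075   060   030    Pre-fail  Always       -       34403857
194 Temperature_Celsius     0x0022   069   055   000    Old_age   Always       -       31 (0 14 0 0 0)
"""


def parse_attributes(text):
    """Return a list of dicts, one per attribute row in the table."""
    attrs = []
    for line in text.splitlines():
        # Split into at most 10 fields so a raw value with spaces stays whole.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header line or unrelated output
        attrs.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "threshold": int(fields[5]),
            "type": fields[6],
            # Kept as a string: raw values are not always plain integers.
            "raw": fields[9].strip(),
        })
    return attrs


def threshold_exceeded(attr):
    """True if the normalized value has reached the threshold (assumed <=)."""
    return attr["threshold"] > 0 and attr["value"] <= attr["threshold"]


for attr in parse_attributes(SAMPLE):
    status = "EXCEEDED" if threshold_exceeded(attr) else "ok"
    print(attr["id"], attr["name"], attr["type"], status)
```

Looking attributes up by ID rather than by name follows the advice above, since attribute names differ between manufacturers and lists.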
Reacting to SMART values
It is said in this good introduction to interpreting SMART values that a drive that starts getting bad sectors (attribute ID 5) or “pending” bad sectors (attribute ID 197; they most likely are bad, too) will usually be trash in 6 months or less. The only exception is when the bad sector count increases once but then stays stable for a long time, like a year or more. For that reason, one normally needs a tool that logs or graphs SMART values over time. Many admins will replace the hard drive as soon as it gets reallocated sectors (ID 5) or sectors “under investigation” (ID 197) [source].
For the more detailed meaning of the individual attributes, see again Wikipedia: S.M.A.R.T.: Known ATA S.M.A.R.T. attributes. Also, there are many peculiarities of SMART attribute interpretation, depending on manufacturer, series and model, so if you have a strange case, search for SMART attribute values from the same and similar disks that others have posted on forums. People even say that it takes a good deal of “intuition” to interpret them.
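The “stable for a year or more” exception can only be checked against a log of past readings. As a rough sketch, assuming you have recorded the raw value of ID 5 or ID 197 together with the date of each reading, the following Python function classifies such a history. The function name, the labels and the 365-day default are my own choices, not part of any SMART tool.

```python
from datetime import date, timedelta


def bad_sector_trend(history, stable_days=365):
    """Classify a log of (date, raw_count) readings, sorted by date.

    Returns "no data", "clean" (count is 0), "stable" (nonzero count
    unchanged for at least stable_days) or "changed recently".
    """
    if not history:
        return "no data"
    latest_date, latest_count = history[-1]
    if latest_count == 0:
        return "clean"
    # Walk backwards to find since when the count has had its current value.
    since = latest_date
    for day, count in reversed(history):
        if count != latest_count:
            break
        since = day
    if latest_date - since >= timedelta(days=stable_days):
        return "stable"
    return "changed recently"


# Invented readings: 8 reallocated sectors appeared once, then nothing more.
log = [
    (date(2023, 1, 1), 0),
    (date(2023, 2, 1), 8),
    (date(2024, 3, 1), 8),
]
print(bad_sector_trend(log))
```

A result of “changed recently” is the case where, following the advice above, many admins would already replace the drive.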
The most important thing to know about SMART values is probably this conclusion from a Google Labs team:
“We find, for example, that after their first scan error [any kind of bad sector error, caused by read instability or media damage], drives are 39 times more likely to fail within 60 days than drives with no such errors. First errors in reallocations, offline reallocations, and probational counts are also strongly correlated to higher failure probabilities. Despite those strong correlations, we find that failure prediction models based on SMART parameters alone are likely to be severely limited in their prediction accuracy, given that a large fraction of our failed drives have shown no SMART error signals whatsoever.” [Failure Trends in a Large Disk Drive Population, p. 13]