
A few months back, LA Computer Company (whose website is still up as I write these words, although it may no longer be when you read them), a retailer from whom I’d purchased a number of products over the years, announced that it was closing up shop and “fire sale-ing” its remaining inventory. I subsequently purchased several items from the company, one of which was a “Refurbished 4-Bay Portable Tower Enclosure 12TB (4x3TB)” further described as a “Refurbished 4 bay Thunderbolt 2 enclosure with 4 x 3TB hard drives installed.” The product photo (no longer available, alas) was generic, but the price was compelling, so I took a chance.
What arrived was a cosmetically imperfect but still functional AKiTiO Thunder2 Quad enclosure:
This last stock photo is particularly apropos, as the initial computer I intend to tether the enclosure (and HDDs inside) to is my own “trash can” Mac Pro:
And after the Mac Pro exits Apple’s supported-products stable, I’ll still be able to use the AKiTiO external storage device with newer Macs (along with Thunderbolt-supportive Windows systems) in conjunction with an Apple adapter:
What of those HDDs inside the enclosure? They’re 7,200 RPM Seagate ST3000DM001 3.5” drives (here’s a PDF spec sheet for the entire Barracuda product family generation, code-named “Grenada”), with 6 Gbps SATA interfaces and 64 Mbyte RAM caches onboard. This particular variant integrated three 1 TByte platters, each with two associated read/write heads (one on either side), and also came in fewer-platter and lower-capacity versions.
I was initially surprised when Google search results on the product code revealed a Wikipedia page dedicated to the ST3000DM001, but all became clear when I started reading it. Suffice it to say that going with the “industry’s first 1TB-per-disk hard drive technology” more than a decade ago may have incurred at least some long-term usage risk for Seagate and its customers, in contrast to the product family’s generally positive initial review results. Specifically, Backblaze, a well-known cloud storage company that uses lots of mass storage devices (both rotating and solid-state) and regularly publishes data on various drives’ reliability, found the ST3000DM001 exhibiting atypically high failure rates. Quoting from the company’s April 2015 report:
Beginning in January 2012, Backblaze deployed 4,829 Seagate 3TB hard drives, model ST3000DM001, into Backblaze Storage Pods. In our experience, 80% of the hard drives we deploy will function at least four years. As of March 31, 2015, just 10% of the Seagate 3TB drives deployed in 2012 are still in service.
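To put those two percentages on a common footing, here is a rough Python sketch that converts a "fraction still in service after N years" figure into the constant yearly attrition rate that would produce it. The function name is my own, and the assumptions are loud ones: attrition is treated as constant from year to year, "still in service" is treated as "hasn't failed" (drives can also be retired proactively), and the 3.25-year window pretends that every drive was deployed in January 2012, which the quote does not claim.

```python
def implied_annual_attrition(surviving_fraction: float, years: float) -> float:
    """Constant yearly loss rate that leaves `surviving_fraction` after `years`.

    Solves (1 - rate) ** years == surviving_fraction for rate.
    """
    if not 0.0 < surviving_fraction <= 1.0:
        raise ValueError("surviving_fraction must be in (0, 1]")
    if years <= 0:
        raise ValueError("years must be positive")
    return 1.0 - surviving_fraction ** (1.0 / years)


# Backblaze's stated norm: 80% of deployed drives last at least four years.
typical = implied_annual_attrition(0.80, 4.0)

# The ST3000DM001 figure: 10% still in service as of March 31, 2015.
# ASSUMPTION: all drives went in during January 2012 (about 3.25 years earlier).
seagate_3tb = implied_annual_attrition(0.10, 3.25)

print(f"Typical fleet:  {typical:.1%} per year")
print(f"ST3000DM001:    {seagate_3tb:.1%} per year")
```

Under those assumptions the typical fleet loses roughly 5% of its drives per year, while the 3TB Seagate population shrinks by roughly half each year. That gap is large, but the calculation says nothing about why the drives left service.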
Root cause? Here’s one working theory, according to German data recovery company Datenrettung (who was specifically discussing the drives’ usage in Apple’s 5th-gen Time Capsule):
The parking ramp of this hard drive consists of two different materials. Sooner or later, the parking ramp will break on this hard drive model, installed in a rather poorly ventilated Time Capsule. The damage to the parking ramp then causes the write/read unit to be destroyed and severely deformed the next time the read/write unit is parked. When the Time Capsule is now turned on again or wakes up from hibernation, the data disks of the Seagate hard drive are destroyed because the deformed read-write unit drags onto it.
Is Datenrettung right? Maybe. Some of my skepticism comes from the brutally honest “rather poorly ventilated Time Capsule” observation in the company’s comments. Apple has long been all about sleek, svelte, quiet, and otherwise boundary-pushing system design, and this isn’t the first time that a propensity for overheating has been the end result. Take my G4 Cube, for example. Or my first-generation MacBook Air. Or, more germane to this particular conversation, my own 3rd-gen Time Capsule, which also exhibited overheating-induced functional compromise but used an older, lower-capacity drive from an unknown manufacturer.
My skepticism further increased when I came across an excellent dissection at Tom’s Hardware:
By its own admission, Backblaze employed consumer-class drives in a high-volume enterprise-class environment that far exceeded the warranty conditions of the HDDs. Backblaze installed consumer drives into a number of revisions of its own internally developed chassis, many of which utilized a rubber band to “reduce the vibration” of a vertically mounted HDD.
The first revision of the pods had no fasteners for securing the drive into the chassis. As shown, a heavy HDD is mounted vertically on top of a thin multiplexer PCB. The SATA connectors are bearing the full weight of the drive, and factoring the vibration of a normal HDD into the non-supported equation creates the almost perfect recipe for device failure.
Backblaze has confirmed it still has all revisions of its chassis installed in its datacenters and that it replaced failed drives into the same chassis the original drive failed in. This could create a scenario where replacement drives are repeatedly installed into defective chassis, thus magnifying the failure ratio.
Backblaze developed several revisions of the custom chassis due to its admitted vibration problems with the early models, and the company shared the designs with the public. However, Backblaze did not indicate which type of enclosures each drive failed within, leaving speculation that the chassis may be the real root of the problem (among others).
The bolded emphasis in this last paragraph is mine:
The Backblaze environment employed more drives per chassis and featured much heavier workloads (both of which accelerate failure rates tremendously) than the vendors designed the client-class HDDs for. This ultimately helped Backblaze save money on their infrastructure. The Seagate 3 TB models failed at a higher rate than other drives during the Backblaze deployment, but in fairness, the Seagate drives were the only models that did not feature RV (Rotational Vibration) sensors that counteract excessive vibration in heavy usage models — specifically because Seagate did not design the drives for that use case.
So, to save cost, Backblaze went with HDDs that weren’t designed for this particularly demanding application. And when those HDDs failed at higher rates than those that were designed for that particularly demanding application, the company questioned the reliability of the HDDs instead of questioning its own procurement criteria (which, as Tom’s Hardware noted in February 2016, “was borne of necessity; it began during the Thailand floods when HDDs were excessively high priced”).
Supposedly, said Tom’s Hardware, “Backblaze issued numerous disclaimers about the applicability of the findings outside of its own unique (and questionable) use case.” Candidly, I’m not sure where those disclaimers appeared; I sure don’t see them within the report itself. Regardless, “the damage from the information dealt Seagate an almost immeasurable blow in the eyes of many consumers.” And that, I’ll frankly proffer, is profoundly unfair. The courts, which tossed out a class-action lawsuit subsequently filed by one complainant, apparently concurred.
For what it’s worth, all four of my Seagate 3TB HDDs are seemingly working just fine so far. They came pre-configured, formatted HFS+ and in a clever performance-plus-reliability RAID combo:
- Each pair configured RAID 0 “striped” (for performance), with
- Both pairs then combined via RAID 1 “mirrored” (for reliability)
Undoing all this upfront configuration (which admittedly did have the advantage of relying solely on the software RAID 0/1 facilities already built into macOS) was a bit tricky, but I accomplished it. I’ve now got an APFS-formatted, RAID 5-configured array via SoftRAID (now owned by Other World Computing, who coincidentally also acquired AKiTiO a few years ago). And although the intermediary Thunderbolt-to-quad-SATA translation hardware would normally make it infeasible to assess HDD health via ongoing S.M.A.R.T. monitoring, SoftRAID neatly manages this bit (maybe, more accurately instead worded, “these bits”?), too.
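For anyone weighing the same switch, the capacity trade-off between the as-shipped layout and RAID 5 is simple arithmetic. The following is a minimal Python sketch using the textbook capacity rules for each RAID level; the function names are mine, and it ignores filesystem overhead and the decimal-versus-binary terabyte distinction, so treat the results as nominal figures rather than what Disk Utility or SoftRAID will actually report.

```python
from typing import Sequence


def raid0_capacity(members: Sequence[float]) -> float:
    """Striping: every member contributes, limited by the smallest one."""
    return len(members) * min(members)


def raid1_capacity(members: Sequence[float]) -> float:
    """Mirroring: usable space equals the smallest member."""
    return min(members)


def raid5_capacity(members: Sequence[float]) -> float:
    """Single distributed parity: one member's worth of space holds parity."""
    if len(members) < 3:
        raise ValueError("RAID 5 needs at least three members")
    return (len(members) - 1) * min(members)


DRIVES_TB = [3.0, 3.0, 3.0, 3.0]  # the four 3 TB drives in the enclosure

# As shipped: two striped pairs, then the pairs mirrored against each other.
as_shipped = raid1_capacity(
    [raid0_capacity(DRIVES_TB[:2]), raid0_capacity(DRIVES_TB[2:])]
)

# As reconfigured: one four-drive RAID 5 set.
as_rebuilt = raid5_capacity(DRIVES_TB)

print(f"Striped pairs, mirrored: {as_shipped:.0f} TB usable")
print(f"Four-drive RAID 5:       {as_rebuilt:.0f} TB usable")
```

Out of 12 TB raw, the original layout yields a nominal 6 TB and RAID 5 yields 9 TB. Both arrangements tolerate the loss of any single drive, so the extra 3 TB comes at the cost of parity calculations rather than reduced single-failure protection.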
HDDs are, as my own teardown showcases, complicated pieces of hardware-plus-software. That they work at all, much less reliably for many years, validates my August 2022 observation that they’re “amazing engineering accomplishments”:
- One or (usually) multiple platters, spinning at speeds up to 15,000 RPM. Each platter mated to one or (usually) two read/write heads, hovering over one or both sides of the rapidly rotating platter only a few nanometers away, and tasked with quickly accessing the desired track- and sector-stored details.
- Low-as-possible power consumption and high-as-possible ruggedness and reliability, in contrast to other contending design considerations.
- And ever-more data squeezed onto each platter, thanks to PRML (partial-response maximum-likelihood) sensing and decoding and now-mainstream PMR (perpendicular magnetic recording), next-generation SMR (shingled magnetic recording) and emerging successor HAMR (heat-assisted magnetic recording) storage techniques.
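As one small illustration of why spindle speed matters to that "quickly accessing" goal, here is a short Python sketch of average rotational latency, the time a head waits (on average, half a revolution) for the desired sector to come around. The function name is my own, and the figure covers rotation only; it excludes seek time, controller overhead, and caching effects.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for a sector to rotate under the head: half a revolution."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_revolution / 2.0


# 7,200 RPM matches the ST3000DM001; 15,000 RPM is the upper bound cited above.
for rpm in (7_200, 15_000):
    latency = average_rotational_latency_ms(rpm)
    print(f"{rpm:>6,} RPM: {latency:.2f} ms average rotational latency")
```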
But, in order for them to work reliably for many years, they need to be used as intended. Backblaze seemingly didn’t do so. Was an inherent compromise in Seagate’s design at least partly to blame? Maybe. Reiterating what I said earlier, the ST3000DM001 and its product-family siblings marked Seagate’s initial entry into the 1 TByte-per-platter domain. Ironically, the Hitachi HUS724030ALE641 HDD I tore apart nearly two years ago, which dated from April 2013, was also a 1 TByte/platter design.
But that wasn’t the Hitachi HDD that Backblaze compared the Seagate ST3000DM001 against. It was the much older HDS5C3030ALA630, which not only required 5 platters (and 10 read/write heads) to achieve that same total-capacity metric, but also only ran at 5,940 RPM rotational speeds. When you unwisely try to compare apples and oranges, you undoubtedly encounter variances. And in summary, I guess that’s my guidance to all of you: be wise. Don’t be fooled by sensationalist clickbait, whether related to technology, politics, or anything else, that presents you with a cherry-picked subset of the total applicable dataset in attempting to persuade you to accept a distorted conclusion. Question your own assumptions? Yes. But also question others’ assumptions. As well as their underlying motivations. I welcome thoughts in the comments!
—Brian Dipert is the Editor-in-Chief of the Edge AI and Vision Alliance, and a Senior Analyst at BDTI and Editor-in-Chief of InsideDSP, the company’s online newsletter.
Related Content
- Peeking inside an HDD
- Question your assumptions: Diagnosing computer disruptions
- HDDs vs SSDs: It’s all about the random speeds
- Power vs energy: SSD and HDD case studies
The post Fairly evaluating HDD reliability appeared first on EDN.