
Makosuke

macrumors 604
Original poster
Aug 15, 2001
6,826
1,565
The Cool Part of CA, USA
I had something weird happen with my Mac's external SSD storage that isn't of particularly broad value, but even though it's a long story I thought it was at least interesting enough to document in case someone else also found it interesting.

Setup is a last-gen Intel iMac with an OWC Express 4M2 containing four 1TB SSDs hooked to it. The SSDs are two Samsung 970 Evos that are about a year old and two WD Blacks that are about two years old. I've got them in a RAID 4 config using SoftRAID, which was at version 7.0 when all this happened.

The external lives in a cabinet near the computer to reduce annoying noise, with the feet sitting on styrofoam spacers so the vibration doesn't resonate with the shelf. I carefully monitored temperatures inside the cabinet when I set this up; they're a little warmer than the room, but everything is comfortably within tolerances and it's been fine for a while like this. The drive sensors usually report 35-40C when under load, never much higher than that.

...until, over the course of a year, the styrofoam gradually compressed itself and ended up blocking the fan intake on the case (in hindsight, I should have had it vertical to eliminate the chance of this). This, coupled with doing a full-drive clone in prep for a long trip, overheated the SSDs (I think they sat at around 80C for a while during the full read). Normally I'd expect the SSDs to either work, work slowly, or straight-up fail, but the partial data corruption that actually happened at that point was really unexpected.

Both of the Samsung drives started throwing read fail errors in SoftRAID, while both of the WDs seemed to be fine. After a bunch of fiddling around and rebuild/scrub attempts, the array could not be recovered from parity data (two bad drives), but it was mostly still readable albeit mounting read-only due to directory corruption. I had backups and also copied the corrupt array to another disk just in case, then figured the drives were toast and hoped I could replace them under warranty.
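For anyone wondering why two bad drives were too many for the parity to handle, here's a toy sketch of single-parity XOR, the general idea behind RAID 4. This is not SoftRAID's actual code or on-disk layout; the block contents and the xor_blocks helper are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data drives plus one dedicated parity drive (made-up contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One unreadable block: XOR the survivors with parity to get it back.
rebuilt = xor_blocks([data[0], data[2], parity])

# Two unreadable blocks in the same stripe: the survivors only yield the XOR
# of the two missing blocks, which isn't enough to separate them again.
leftover = xor_blocks([data[2], parity])
```

Here rebuilt matches data[1], while leftover is only data[0] XOR data[1]. So any stripe where both Samsungs returned bad reads is past what a single parity drive can fix, even though stripes with just one bad block should still be recoverable.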

I did a full hash compare of all files on the corrupt array clone versus my two backups, and very little data was actually lost. Two folders with a few dozen files each ended up empty, and about a half dozen files (out of hundreds of thousands) were slightly corrupted on both the bad array and the more recent backup. The corrupt versions were all video and image files, and surprisingly were still usable, albeit with slight glitches. None of them had been written in months and all were fine in the older backups, so the corruption was definitely due to the overheating, and definitely of already-written data.
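In case it's useful, a hash compare like this can be sketched in a few lines of Python. This isn't the tool I described above, just a minimal version of the idea; the function names and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path

def file_hash(path, chunk_size=1 << 20):
    """SHA-256 of one file, read in chunks so large videos don't fill RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def hash_tree(root):
    """Map each file's path (relative to root) to its hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_hash(p)
        for p in root.rglob("*")
        if p.is_file()
    }

def compare_trees(suspect_root, backup_root):
    """Report files missing from, extra in, or different in the suspect copy."""
    suspect = hash_tree(suspect_root)
    backup = hash_tree(backup_root)
    return {
        "missing": sorted(set(backup) - set(suspect)),
        "extra": sorted(set(suspect) - set(backup)),
        "changed": sorted(
            name for name in suspect.keys() & backup.keys()
            if suspect[name] != backup[name]
        ),
    }
```

Run once against each backup, "missing" would catch something like the emptied folders and "changed" the handful of glitched files. A file that shows as changed against the older backup but not the newer one means the corruption already made it into the newer backup. Files that were legitimately edited between backups also show up as changed, so the list still needs a manual look.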

What surprised me though was what happened when I went to gather evidence of the drive failure to try and do a warranty claim: I wiped all four SSDs then did a multi-pass read-write certification on them using SoftRAID. I expected at least the two that were throwing read errors to give errors on the certify... but they came up perfectly healthy. I re-ran the tests, and even with multiple random-data full-drive write-read cycles, absolutely no problems storing data, no SMART errors, full speed, full functionality.

As best as I can figure, the overheating caused small data errors in a few blocks on the SSDs, but didn't actually damage the SSDs themselves, so once they had been erased (which probably also took some bad blocks out of use), they were fine. I did not realize that could happen, much less from overheating conditions. I was also surprised that, once the corrupt data was erased, the drives themselves considered everything absolutely fine.

If I were paranoid I'd take them out of circulation, and they're probably going to suffer from reduced lifespan, but I've decided that for my use case and with the backup strategy I have it's probably fine. In any case, I'm not looking for advice on what to do, just figured I'd share the experience.
 