Yeah, it's not the same - I agree with that. But is it more or less? There's good math to support both ideas. The only constant is that if a single drive fails catastrophically, you're likely to lose more data in a RAID 0 set, simply because a RAID 0 set is typically larger. Yet the inverse holds if we compare a RAID 0 of four 250GB drives (or three 300GB drives) against a single 1.5 or 2.0 TB drive, so even that isn't a hard and fast fact.
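Just to put rough numbers on both sides of that - a quick sketch where the per-drive failure probability is a made-up placeholder, not any real drive's figure:

```python
# Rough sketch with made-up numbers, showing both sides of the argument:
# a 4 x 250 GB RAID 0 set versus a single 2 TB drive, each drive assumed to
# have the same independent annual failure probability p (hypothetical).
p = 0.03                      # assumed annual failure probability per drive
raid0_drives, raid0_drive_gb = 4, 250
single_drive_gb = 2000

# Chance of a data-loss event in a year.
p_loss_raid0 = 1 - (1 - p) ** raid0_drives   # any one failure kills the set
p_loss_single = p

# Data lost *if* that event happens.
loss_raid0_gb = raid0_drives * raid0_drive_gb    # the whole 1 TB set
loss_single_gb = single_drive_gb                 # the whole 2 TB drive

print(f"RAID 0 : P(loss)={p_loss_raid0:.3f}, loses {loss_raid0_gb} GB per event")
print(f"Single : P(loss)={p_loss_single:.3f}, loses {loss_single_gb} GB per event")
```

So the set is more likely to lose data, but the bigger single drive loses more when it does - which is exactly why the math supports both ideas.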
Now you're getting into the realm that drives me nutz: platter density and platter count making up a drive. UBE ratings haven't kept pace with today's capacities, so too many drives, too many platters, or too-high density can all become problems. Ye olde multiple-failures bit, especially failures that occur during a rebuild.
RAID 5, and now even 6, is affected by this more substantially with each capacity increase. Most of what I see simplifies it down to capacity, such as the recommendation of keeping a RAID 5 at a max capacity of ~10TB or so. It's a little too complicated to relate it to drive count alone. As you noted, the drives themselves do matter, and typically consist of 1 to 4 platters. Then there's the density: 250GB or 333GB per platter is common now. Thinking about it can cause headaches.
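The back-of-the-envelope version of why that capacity ceiling shows up looks something like this - the 1-in-10^14 UBE rating and the drive sizes here are assumptions, not anybody's actual spec sheet:

```python
# Back-of-the-envelope sketch of the rebuild problem: probability of hitting
# at least one unrecoverable bit error (UBE) while reading the surviving
# drives of a degraded RAID 5. All figures below are assumed, not quoted.
ube_rate = 1e-14          # assumed rating: 1 unrecoverable error per 1e14 bits read
drive_size_tb = 2.0       # assumed drive capacity
surviving_drives = 3      # e.g. a 4-drive RAID 5 rebuilding onto a spare

bits_to_read = surviving_drives * drive_size_tb * 1e12 * 8
p_ube_during_rebuild = 1 - (1 - ube_rate) ** bits_to_read
print(f"P(at least one UBE during rebuild) ~ {p_ube_during_rebuild:.2f}")
```

With those placeholder numbers it works out to roughly a one-in-three chance of tripping over an unreadable sector mid-rebuild, and the figure climbs with every capacity bump - hence the "keep it under ~10TB" rules of thumb.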
Yeah, in fact, when you actually do the math and work through the real statistics, you realize early on that it's such an inaccurate model that it really doesn't even apply. We might as well just pick a number out of a hat.
Seems that way to me as well.
Yeah, that would work. But then, unless we use the exact same parts under exactly the same conditions, it kinda kills the usefulness of doing it at all, right? I mean, the whole thing is about trying to come up with a predictive model so that we can compensate in advance. After the fact is a little late.
Experimentation is handy, but waiting for a failure to occur is too late to use it as a predictive means for replication. By the time it's done, the equipment may not even be made anymore.
And for creating a model, it wouldn't be accurate either, as there are too many hardware and software variations. Perhaps it would work as a rule of thumb, but it won't hold for long, as the technology is always changing.
Yup. Hehehe, that's the size of the margin of error we're dealing with.
Not a very good predictive model is it?
But yeah, if we get past the first 3 to 6 months, then we're typically safe for 3 years. Over the following 2 years it becomes increasingly dangerous.
It's all we've got though.
The last 2 years are a gamble. Less so with SAS, IMO, but the risk isn't eliminated. Hmm... maybe that's the reason for implementing replacement policies.
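Roughly the shape we keep describing, with completely invented numbers just to illustrate it:

```python
# Toy piecewise "bathtub" profile of annual failure probability.
# Every figure here is invented purely for illustration.
def annual_failure_rate(age_months: int) -> float:
    if age_months <= 6:        # infant mortality: early defects shake out
        return 0.08
    elif age_months <= 36:     # useful life: roughly flat and low
        return 0.02
    else:                      # wear-out: climbs year over year
        return 0.02 + 0.03 * ((age_months - 36) // 12 + 1)

for year in range(1, 6):
    print(f"year {year}: ~{annual_failure_rate(year * 12):.0%} assumed failure rate")
```

Replace a drive before the wear-out slope kicks in and you never have to gamble on those last couple of years.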
I assumed this is what you meant, but wasn't absolutely sure. So I played it safe.
So if I create a stat that counts the number of R/W operations over the life of the drive, it will vary greatly as I vary the data size. An extreme example would be a 32KB file as opposed to a 320GB file. Mean R/W operations before failure, basically.
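Quick illustration of what I mean, with made-up totals:

```python
# How much the "operations before failure" number depends on the size of
# each operation - same total data over the drive's life, sizes are made up.
total_bytes = 10 * 10**12                    # assume 10 TB written over the drive's life
for io_size in (32 * 1024, 320 * 10**9):     # 32 KB vs 320 GB per operation
    ops = total_bytes // io_size
    print(f"{io_size:>15,} bytes per op  ->  {ops:,} operations")
```

Same drive, same lifetime data volume, and the stat swings from a few dozen operations to hundreds of millions.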
Don't forget the stripe size either.
Then you've got the CRC implementation to consider, firmware, feedback circuits... is there no end?
(I had to include some hardware here somewhere). Couldn't help myself.
If I write as many of each as I can in 1,000,000 hours, the wear will be profiled differently. Additionally, the same comparative operation distributed over 4 drives means that each drive gets used less, given the same model of drive is used in all instances.
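Sketching that out with placeholder figures - the throughput is invented, and real striping won't split things this evenly:

```python
# Placeholder figures only: how the 1,000,000-hour comparison splits up
# when the same workload is spread across 4 identical drives.
hours = 1_000_000
throughput_mb_s = 100                      # assumed sustained transfer rate
total_mb = hours * 3600 * throughput_mb_s  # data moved if writing flat out

for io_mb in (0.032, 320_000):             # 32 KB ops vs 320 GB ops (in MB)
    ops_single = total_mb / io_mb
    ops_per_drive = ops_single / 4         # idealized even split over 4 drives
    print(f"{io_mb:>9} MB per op: {ops_single:.2e} ops on one drive, "
          f"{ops_per_drive:.2e} per drive across four")
```

So the operation count shifts by orders of magnitude with the I/O size, and then each drive in the 4-drive set only sees a quarter of whatever that number turns out to be.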
Usage patterns. Say it isn't so.