
chambord

macrumors member
Original poster
Aug 28, 2012
77
1
Let's pretend that you want to reliably back up 10 TB of data. How would you build such a system? Speed is somewhat important, but redundancy and reliability matter most.

I have been considering something along these lines:

2 x 5 drive, RAID5 arrays with 2TB WD Caviar Black drives in them.
The first raid5 array will self-duplicate to the second raid5 array.

That way, the raid5 array has redundancy in that if one drive fails, the array can self recover. And, if the entire first (or second) raid5 array fails, then you still have a copy of it. So this gives redundancy within the raid5 array itself, and it gives redundancy between the raid arrays.
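For reference, a minimal sketch of the usable-capacity arithmetic for one such array. It assumes decimal terabytes and the usual RAID 5 layout in which one drive's worth of space holds parity; the function name is made up for illustration.

```python
def raid5_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Usable capacity of one RAID 5 array, in TB.

    Assumption: one drive's worth of space is consumed by parity,
    so usable space is (n - 1) * drive size.
    """
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drive_count - 1) * drive_tb


usable = raid5_usable_tb(5, 2.0)
print(f"Usable per array: {usable} TB")  # prints "Usable per array: 8.0 TB"
```

Under these assumptions each 5 x 2TB array holds about 8 TB, so a full 10 TB would need larger drives or more bays (five 3TB drives would give 12 TB).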

However, this does not protect against a catastrophic event because there is no offsite storage. Also, it does not protect against bit corruption (bit decay?)

To actually do this on a Mac, I would use a Thunderbolt -> eSATA adapter, and I would use 2 5-drive RAID5 enclosures that have the raid controller and port multipliers built in. So the Mac would see two drives, and that would be that. An alternative way would be something like two thunderbolt pegasus raid5 setups, but that's pretty expensive. I believe it comes out to around twice the cost of my proposed setup, but it's really good hardware.

Examples of proposed hardware:
From OWC:
http://eshop.macsales.com/shop/hard-drives/RAID/Rack_Mount/
From DatOptic:
http://www.datoptic.com/hardware-raid-five-5-bay-1u-rackmount-esata-interface-rm5-r5.html

And of course the LaCie Thunderbolt -> eSATA adapter.

But... if you were tasked to do this, how would you do it? How would you then solve the offsite backup and bit corruption problems? Things like Crashplan+ would take almost a year to back up to, and they only provide 1TB seed drives.

Just curious for some input!
 
Let's pretend that you want to reliably back up 10 TB of data. How would you build such a system? Speed is somewhat important, but redundancy and reliability matter most.

I have been considering something along these lines:

2 x 5 drive, RAID5 arrays with 2TB WD Caviar Black drives in them.
The first raid5 array will self-duplicate to the second raid5 array.

WDC says Black drives aren't suitable for RAID, and is now pushing people with small arrays to Red drives and people with larger arrays to RE series drives.

This is a backup system? What does the primary data storage system look like?

That way, the raid5 array has redundancy in that if one drive fails, the array can self recover. And, if the entire first (or second) raid5 array fails, then you still have a copy of it. So this gives redundancy within the raid5 array itself, and it gives redundancy between the raid arrays.

With RAID 5 on either large disks or a large number of disks (and 2TB x 5 is close to the line on both counts), you actually have a good chance of losing the entire array after a single drive failure. It will take somewhere around 6-12 hours to rebuild the array, depending on whose RAID implementation you use. During that time either another drive could die or, much more likely, a sector read error or parity error turns up (either lost sectors, or corrupted data that the drive's ECC failed to catch or reconstructed incorrectly).
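To put rough numbers on that risk, here is a back-of-the-envelope sketch. It assumes an unrecoverable read error rate of 1 per 10^14 bits (a figure often quoted on consumer drive spec sheets), independent errors, and a sustained rebuild rate of 100 MB/s. All three are assumptions for illustration, not measurements of any particular drive or controller.

```python
import math


def rebuild_read_error_probability(surviving_drives: int,
                                   drive_tb: float,
                                   ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error during a rebuild.

    A rebuild has to read every bit on every surviving drive.
    Assumes independent errors at a constant per-bit rate.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to avoid rounding problems
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


def rebuild_hours(drive_tb: float, mb_per_s: float = 100.0) -> float:
    """Best-case time to rewrite one replacement drive at a steady rate."""
    return drive_tb * 1e12 / (mb_per_s * 1e6) / 3600


# One of five 2TB drives has failed, so four must be read in full.
p = rebuild_read_error_probability(surviving_drives=4, drive_tb=2.0)
print(f"Chance of a read error during rebuild: {p:.0%}")
print(f"Best-case rebuild time: {rebuild_hours(2.0):.1f} hours")
```

With those assumed figures the chance of hitting at least one read error comes out a little under one in two, and the best-case rebuild time is between five and six hours before any controller overhead. Real drives often do better than their spec-sheet error rate, so treat this as a pessimistic estimate.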

However, this does not protect against a catastrophic event because there is no offsite storage. Also, it does not protect against bit corruption (bit decay?)

You want NAS4Free or NexentaStor for the primary storage, and have it build clones. Corruption mitigation needs to happen upstream, on the primary data; resilient storage is of much less benefit if you're using it as a backup for data that may already be corrupt.
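The idea of catching corruption on the primary copy can be illustrated at the file level with a checksum manifest. This is only a sketch of the principle, not how those storage systems work internally (checksumming filesystems verify data per block and can repair from redundancy). The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(root, manifest):
    """Return files whose current digest differs from the manifest.

    Files that have gone missing are reported too. New files that are
    not in the manifest are ignored.
    """
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

Build the manifest on the primary storage while the data is known to be good, store it alongside the backup, and re-run `find_mismatches` before each sync. A plain checksum cannot tell deliberate edits from silent corruption, so this fits best for archives that are not supposed to change.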


To actually do this on a Mac, I would use a Thunderbolt -> eSATA adapter, and I would use 2 5-drive RAID5 enclosures that have the raid controller and port multipliers built in. So the Mac would see two drives, and that would be that. An alternative way would be something like two thunderbolt pegasus raid5 setups, but that's pretty expensive. I believe it comes out to around twice the cost of my proposed setup, but it's really good hardware.

No, I would not do this. You are talking about a large array that will take hours to days to rebuild should anything go wrong, and DAS increases the risk of problems. This storage should be NAS based. If you need more speed, pay for 10GigE. Or create a smaller RAID 0 from smaller but fast HDDs, or use an SSD, for fast local storage, and keep most of the data on network storage.

But... if you were tasked to do this, how would you do it? How would you then solve the offsite backup and bit corruption problems? Things like Crashplan+ would take almost a year to back up to, and they only provide 1TB seed drives.

If you want off-site you have to pay for it in one form or another. A duplicate NAS isn't that expensive and you can do the initial sync locally which will be much faster. And then locate the NAS elsewhere and sync with rsync.
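A quick sketch of why the initial sync is worth doing locally. The link speeds are assumptions: 2.5 Mbit/s is roughly the upload rate that makes 10 TB take "almost a year", and 1000 Mbit/s stands in for a gigabit LAN with protocol overhead ignored.

```python
def transfer_days(size_tb: float, megabits_per_s: float) -> float:
    """Days needed to move size_tb (decimal TB) over a link of the given speed."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (megabits_per_s * 1e6)
    return seconds / 86400


# Assumed figures: a slow home uplink versus a local gigabit network.
print(f"10 TB over a 2.5 Mbit/s uplink: {transfer_days(10, 2.5):.0f} days")
print(f"10 TB over a gigabit LAN:       {transfer_days(10, 1000):.1f} days")
```

On those assumptions the upload takes about a year and the local copy about a day, after which only the changes have to cross the slow link.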
 
Do you particularly want to build your own RAID NAS? I would just buy 2 decent off-the-peg units and populate them with 3TB drives.
 
Already done.
My Synology NAS is 10TB in RAID 5, which is backed up onto the expansion unit directly connected to it.
The most important data is also backed up to external disk.
 
I have been considering something along these lines:

2 x 5 drive, RAID5 arrays with 2TB WD Caviar Black drives in them.

No, no and no.

You shouldn't really rely on identical drives, that will significantly increase the likelihood of multiple failures within a short timespan. If you insist on using the same models (performance reasons, convenience, economy, etc.), at least make sure the individual drives are from different production batches.
 
No, no and no.

You shouldn't really rely on identical drives, that will significantly increase the likelihood of multiple failures within a short timespan. If you insist on using the same models (performance reasons, convenience, economy, etc.), at least make sure the individual drives are from different production batches.

I had this problem with some 1TB Samsung units. I solved it by getting drives from distributors in 3 different countries.
 
No, no and no.

You shouldn't really rely on identical drives, that will significantly increase the likelihood of multiple failures within a short timespan. If you insist on using the same models (performance reasons, convenience, economy, etc.), at least make sure the individual drives are from different production batches.

That's how Promise sell all their Pegasus stuff. You will get the same brand of drive, but the batch serials are always different. I have Hitachis in mine now, after one Seagate failed. I just asked them for a replacement drive, but they insisted on a full replacement. No issues since.
 
I had this problem with some 1TB Samsung units. I solved it by getting drives from distributors in 3 different countries.

Good solution, but it will unfortunately decrease one's chances of getting nice deals, won't it?

That's how Promise sell all their Pegasus stuff. You will get the same brand of drive, but the batch serials are always different. I have Hitachis in mine now, after one Seagate failed. I just asked them for a replacement drive, but they insisted on a full replacement. No issues since.

Good to hear! Always nice when a company does things right for a change. :p
 
Good solution, but it will unfortunately decrease one's chances of getting nice deals, won't it?

Not really. As it turned out, the price was pretty much the same. I had thought of buying from Amazon in 3 different countries to avoid shipping costs, but they all draw on a single set of stock, so I avoided this.
 