Raid10 Drive Selection

Discussion in 'Mac Pro' started by sn1p3r845, Aug 19, 2012.

  1. sn1p3r845 macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #1
    I'll keep it short:

    Building an 8-bay RAID 10, and I can't decide what drives to fill it with. The new WD Reds seem tempting, but I haven't been able to find much information on how they compare to other suitable drives.

    What drives would you use today (budget ~$200/drive, 2TB or 3TB)?

    Western Digital Red 3TB SATA3 64MB Cache 3.5IN Internal Hard Disk Drive HDD
    http://ncix.com/products/?sku=74269

    Alt Option: Western Digital 2TB RE4
    http://ncix.com/products/?sku=52000&vpn=WD2003FYYS&manufacture=Western Digital WD

    Workflow is video editing in Red Epic 5K RAW format.
     
  2. jav6454, Aug 19, 2012
    Last edited: Aug 19, 2012

    jav6454 macrumors P6

    Joined:
    Nov 14, 2007
    Location:
    1 Geostationary Tower Plaza
    #2
    I would recommend the RE drive from WD then. Although the 2TB or 3TB are pricey.

    Here is a link to ---->

    WD RE4-GP: Link for $229/piece

    WD-RE4: Link $219/piece

    WD-Red: Link for $169/piece

    **** Note: These are for 2TB drives, 3TB drives will be much more expensive obviously. The difference between RE4-GP and RE4 is that the -GP drive is greener and more energy efficient. ****

    I would suggest the RE drives as those drives are specifically designed, geared and tested for RAID array usage.

    Amend - Although a NAS usually also runs RAID, a NAS is external storage attached to the system, whereas RAID here means an array inside the computer. In essence, all that varies is the main purpose (backup/storage vs. speed and reliability) and the placement (inside/outside) in a computer system.
     
  3. sn1p3r845 thread starter macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #3
    See, the thing is, the RE4s are 3.0Gb/s whereas the Reds are 6.0Gb/s.
     
  4. jav6454 macrumors P6

    Joined:
    Nov 14, 2007
    Location:
    1 Geostationary Tower Plaza
    #4
    I'll tell you this much, even under RAID 10, you won't make a mechanical drive break 1.5Gb/s in transfer speeds. It is like having a one way 8 lane highway when the traffic only needs 2 lanes the same way.

    What makes speeds pick up is having more drives look simultaneously for the same information. That will pick up speed. But a faster interface will not affect that; unless all drives are SSDs...
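
    A rough sketch of the interface arithmetic behind this point, in Python. SATA links use 8b/10b encoding, so every byte of data costs 10 bits on the wire. The 138MB/s drive speed is an assumption borrowed from the per-disk figure quoted later in this thread, not a measured value for either drive model.

```python
# SATA uses 8b/10b encoding, so 10 bits on the wire carry 1 byte of data.
SATA_LINE_RATE_GBPS = {"SATA 1.5Gb/s": 1.5, "SATA 3Gb/s": 3.0, "SATA 6Gb/s": 6.0}

# Assumption: sustained rate of one 7200rpm drive, taken from the
# 138MB/s per-disk figure quoted later in this thread.
DRIVE_MB_PER_S = 138


def payload_mb_per_s(line_rate_gbps):
    """Usable MB/s of a SATA link after 8b/10b encoding overhead."""
    return line_rate_gbps * 1000 / 10


def link_report(drive_mb_per_s=DRIVE_MB_PER_S):
    """Map each SATA generation to (payload MB/s, fraction used by one drive)."""
    report = {}
    for name, gbps in SATA_LINE_RATE_GBPS.items():
        payload = payload_mb_per_s(gbps)
        report[name] = (payload, drive_mb_per_s / payload)
    return report


if __name__ == "__main__":
    for name, (payload, used) in link_report().items():
        print(f"{name}: {payload:.0f} MB/s payload, one drive uses {used:.0%}")
```

    Under that assumption a single drive fills less than half of a 3.0Gb/s port, so the port speed only starts to matter when several drives share one link.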
     
  5. sn1p3r845 thread starter macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #5
    That's what I kinda figured. Is there any solid reason not to get the Reds over the RE4s, other than what you already said? The Reds are also "specifically designed" for RAID.
     
  6. jav6454 macrumors P6

    Joined:
    Nov 14, 2007
    Location:
    1 Geostationary Tower Plaza
    #6
    From what I have read so far, Reds are more user-configurable and have better power-efficiency technologies. So in the end it is more a question of "do I want to have certain control over my drives" or just "buy, install, run".

    Anandtech Article

    They have a nice review of Red drives and do recommend them for NAS. Do remember, NAS is just RAID HDDs... So the real question is now, do you want an internal RAID array or a NAS style set up in your Mac Pro?
     
  7. sn1p3r845 thread starter macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #7
    I have an 8 bay external enclosure that'll connect via Mini-SAS
     
  8. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #8
    RE4 is the better way to go (they're also faster than the Reds). Reds are better suited to backup/archival purposes (speed is on par with Greens, but they are more stable than Greens).

    It's not even possible to saturate 3.0Gb/s SATA ports with mechanical drives, so don't worry about it.

    You'll need to stick to 2TB models.
     
  9. deconstruct60, Aug 20, 2012
    Last edited: Aug 20, 2012

    deconstruct60 macrumors 604

    Joined:
    Mar 10, 2009
    #9
    But are the 5K RAW files the archives?
    The beginning of the workflow are RAW files but it seems very unlikely the end of the workflow will be 5K RAW files. In a non-destructive editing context, the inputs are effectively an archive.


    Perhaps it is a different workflow, but it seems likely that the RAW files will effectively be a WORM ("write once, read many") set of files. All the mutations will be done on intermediary forms, and the output will be different still.
    Those intermediary and final files could be sent somewhere else.

    For example with a 8 bay enclosure:

    One set of 4 3TB REDs in RAID 10 ==> 6 TB usable storage
    One set of 4 2TB REs in RAID 10 ==> 4 TB usable storage

    Total across both: 10 TB usable storage. Could even short-stroke the REs, or give up some capacity and go to VelociRaptors (or some other 10K rpm drive), since bulk capacity isn't the issue anymore.
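
    The capacity figures for this split layout can be checked with a small Python sketch. The function name is made up for illustration; it only encodes the rule that a RAID 10 set of two-way mirrors exposes half of its raw capacity.

```python
def raid10_usable_tb(drive_count, drive_tb):
    """Usable capacity of a RAID 10 set: half the drives hold mirror copies."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return drive_count // 2 * drive_tb


# The split layout from the post: two independent 4-drive sets in one 8-bay box.
red_set = raid10_usable_tb(4, 3)  # 4x 3TB Reds
re_set = raid10_usable_tb(4, 2)   # 4x 2TB REs

if __name__ == "__main__":
    print(f"Red set: {red_set} TB, RE set: {re_set} TB, total: {red_set + re_set} TB")
```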

    Another example: a 3 stripe RAID 10 set

    Half of the 8 bay on path 1: 3 active in RAID 10 striping + hot backup
    Half of the 8 bay on path 2: 3 active in RAID 10 striping + hot backup

    And the intermediate and final output sent elsewhere. With 2TB REDs all around that's still 6 TB usable but faster and incrementally more bullet-proof.



    For a single drive. For multiple 10K drives, yes it is. If multiple drives feed into a single connection (RAID controller), then the aggregated output is more. For smooth latency you probably don't want 6Gb/s on one side of the controller and 3Gb/s on the other. There are times in the process of transferring data when the drive controllers are talking to the SATA controllers and the data on disk isn't the primary issue.
     
  10. wonderspark macrumors 68040

    Joined:
    Feb 4, 2010
    Location:
    Oregon
    #10
    Just a note to mention my set of eight RE-4 HDDs via two mini-SAS to Areca 1880ix-12 in my Mac Pro achieves throughput rates of 1101MB/sec in RAID 0 both read and write, and 816MB/sec write, 714MB/sec read in RAID 6.

    I'd be curious to see what the same setup would get from eight 3TB Reds, but one more point in favor of RE-4 is the 5-year warranty over the 3-year on the Reds.
     
  11. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #11
    I suspect this will be for the primary volume, which will share both 5k and finished files (post edit content). Rather common in general for editors in my experience.

    So it would need to be capable of high speed and reliability. As wonderspark mentioned, the Reds don't have as long a warranty, which, generally speaking, tends to mean a bit less reliability in the real world (they're also slower). No idea if the OP can keep up with a 36-month replacement schedule for reliability purposes, even if capacity isn't an issue (money usually is, so equipment may be left in service longer than it should be).

    Now there's been no mention of a proper RAID controller either, so the OP may actually intend to use nothing more than SATA cards and PM boxes, hence the mention of 10 rather than a parity based level.

    In such a case, the speed will be throttled due to the card/s (tend to use less than 1x PCIe lane per SATA port) & PM chips, so the Reds might be an acceptable solution in such a configuration, particularly if this ends up only being as an archival pool.

    But if it's for primary storage use (RAW + edited content), it wouldn't be up to the throughputs necessary.

    So I suspect a proper RAID card will be needed, and if it's not a true 1:1 ratio for disks to ports (i.e. Areca's 188x series for example <8 ports native, SAS Expander on-board used past 8>, it's still capable of very high throughputs suitable for the task). Others, such as ATTO, still offer a true 1:1 on the card itself (SAS Expanders in the configuration would be external to the card <separate box or as part of the HDD enclosure>).

    This is possible, but I truly suspect that the storage pool will be used for both (RAW + edited files, intermediate through final), unless the OP states otherwise (usually the result of cost constraints).

    If it's using standard SATA ports (1:1 ratio), then throttling per disk won't be an issue. If on a RAID card, it won't be an issue either.

    IF Port Multipliers are involved however, Yes there will be a significant reduction in speed (i.e. cheap controllers, assuming 2x, each on a 1x PCIe lane, then it would be good for no more than ~500MB/s). Which wouldn't be usable for the intended purpose.
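
    A back-of-the-envelope Python sketch of that ~500MB/s ceiling. Two things here are assumptions rather than anything stated in the post: that the cheap cards are PCIe 1.x (roughly 250MB/s per lane per direction), and that each drive sustains about 138MB/s.

```python
# Assumptions (not stated in the post): the cards are PCIe 1.x, which moves
# roughly 250MB/s per lane per direction, and each drive sustains ~138MB/s.
PCIE1_LANE_MB_PER_S = 250
DRIVE_MB_PER_S = 138


def array_ceiling_mb_per_s(cards, lanes_per_card, drives):
    """Best-case throughput: the lower of the bus limit and the drive limit."""
    bus_limit = cards * lanes_per_card * PCIE1_LANE_MB_PER_S
    drive_limit = drives * DRIVE_MB_PER_S
    return min(bus_limit, drive_limit)


if __name__ == "__main__":
    # Two single-lane cards feeding eight drives through port multipliers.
    print(array_ceiling_mb_per_s(cards=2, lanes_per_card=1, drives=8))
```

    With two single-lane cards the bus, not the drives, sets the limit; give the same eight drives more lanes and the drives become the bottleneck again.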

    But I'm giving the OP both the benefit of the doubt (that PM boxes won't be used), and time to respond to the thread for additional specifics.

    The point of this particular post, is to give the OP something to consider that may not have been, so it's a worthwhile discussion IMHO. ;)
     
  12. sn1p3r845 thread starter macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #12
    I'll be running the Raid10 with an ARC-1223X card.
    http://www.arecaraid.com/product_info.php?cPath=&products_id=110

    The raid will house all my RAW files and editing files, basically functioning as my working drive.
     
  13. deconstruct60, Aug 20, 2012
    Last edited: Aug 20, 2012

    deconstruct60 macrumors 604

    Joined:
    Mar 10, 2009
    #13
    If the software requires everything to be in one huge pile, then you should probably be looking at better software. If the software doesn't require this, then it is an assumption worth challenging, especially now that folks are moving to non-destructive editing systems where the "originals" pile is an ever-growing storage capacity problem.


    If you are constantly dumping 5K RAW files into the same volume, it is somewhat doubtful you will still be under the capacity limits of the drives in 36 months. If you are going to need new drives for the additional 2-3TB/year piling up, a 5-year warranty isn't going to make a difference. You will be buying new, denser storage drives anyway.


    If the target price is $200, then 4 drives at $160 allow you to buy a different set of 4 drives at $240 for the same price as 8 drives at $200. For example, there is about an $80 swing between the 3TB Red and the 2TB Red. $280 can get you a 1TB VelociRaptor.
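
    The budget swap works out as follows in a minimal Python check. The helper name is hypothetical and the prices are the ones quoted in the post.

```python
def total_cost(*groups):
    """Sum the cost of (drive_count, price_per_drive) groups."""
    return sum(count * price for count, price in groups)


uniform = total_cost((8, 200))          # eight drives at the $200 target
mixed = total_cost((4, 160), (4, 240))  # four cheaper plus four pricier drives

if __name__ == "__main__":
    print(uniform, mixed)
```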


    Reading a stream and concurrently writing a derivative stream back to the same set of disks will perform far more like random I/O than like streaming a single large sequential file. Some people try to solve that by making the RAID stripe wider. Another approach is to just read/write the streams to different places. The root of the solution is using more disk spindles. They don't all have to be in the same set.
     
  14. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #14
    Not unexpected. :)

    I have to ask though, why RAID 10 on a proper RAID controller?

    You might want to take a closer look at parity levels, which that card would be really good at (faster than 10, and make more capacity available to you).

    It's not a limitation of the software, but rather the budget (insufficient funds for separate pools).

    If you look at the post above yours, you'll notice that the OP responded with what I expected; that the storage pool will be used for both RAW + edited content.

    Due to the cost issues, there's a limit as to how many spindles will be possible, in this case, the OP is interested in an 8 member volume.

    With more money, multiple volumes constructed of separate drives would be far better. But budget restrictions usually tend to prevent this (i.e. separate RAID volumes on the same card, each with its own physical members).

    As per capacity growth rate, it can be reduced by deleting RAW data that's no longer in use. But unless the final edits can be deleted (ideally archived to a separate pool, then deleted), then the available capacity will shrink as a result of increased archival storage on the primary pool.

    Again, budgets. Sucks, but it's something that cannot be ignored. :(
     
  15. sn1p3r845, Aug 20, 2012
    Last edited: Aug 20, 2012

    sn1p3r845 thread starter macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #15
    What raid level do you recommend then?

    I'll be offloading projects onto other drives for backup after they're complete.

    I was originally going to do raid5 but ended up changing my mind
     
  16. thekev macrumors 604

    Joined:
    Aug 5, 2010
    #16
    Some people have already commented, but in your situation either drive type should be a good match in that they're both quality drives and fully appropriate for RAID configurations. RE4 drives are considered enterprise grade. They are appropriate for your uses from a reliability standpoint. I don't see any reason why they'd be inferior to RED there. I would mention that RED is just branding here.


    I am envious of your rig.
     
  17. deconstruct60 macrumors 604

    Joined:
    Mar 10, 2009
    #17
    Parity RAID is not faster than 10. What is typically faster is going to a wider stripe. With a fixed number of disks typically folks compare a RAID 5 (or 6) with a wider stripe width to a RAID 10 with one less. What is being measured is the difference in stripe width not the RAID encoding.


    It isn't budget as much as organization.


    There is nothing that requires a storage pool to be a single volume.


    Which typically means shuffling off to more disks and another volume. There can still be a split between "near line" archiving and "storage locker" archiving. RAWs are awkward to work with (largely due to their great bulk) , but don't want them too far away (may need to reprocess them to get to a new set of intermediates ).

    The nearline archive in RAID 10 (largely for faster rebuild speed if necessary) and an intermediate volume in RAID 5 or 6 with more expensive (faster) components would work alongside that.

    Part of the hidden agenda here is to make all of the drives the same kind so that you could possibly rejigger them into different encodings later. That's fine, but it won't optimize speed, expense, and/or capacity.
     
  18. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #18
    5, 6, or 50 would be faster than 10 using 8x drives (worst case, gives you 6x disks' worth of speed, best is 7x, while 10 is only 4 disks' worth, particularly during writes, which you'll be doing a lot of). There are variations on redundancy, so read up on the levels carefully (6 would be the safest <can handle 2x disk failures without losing the data>, followed by 50 <can take 1x disk per level 5 and still retain data, but if 2x go on either 5, your data's done>, then 5 for the entire set <2x disks go, and your data is gone>).
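
    Those trade-offs for an 8-drive set can be tabulated with a short Python sketch. It assumes RAID 50 is built from exactly two RAID 5 legs and RAID 10 from two-way mirrors; a given controller may offer other layouts. "Failures always survived" means the worst case, not the lucky one.

```python
def raid_summary(level, drives=8):
    """Return (data_disks, failures_always_survived) for a simple layout.

    Assumes RAID 50 is two RAID 5 legs and RAID 10 is two-way mirrors.
    """
    if level == "5":
        return drives - 1, 1
    if level == "6":
        return drives - 2, 2
    if level == "50":
        # One parity disk per leg; a second failure in the same leg is fatal.
        return drives - 2, 1
    if level == "10":
        # A second failure is fatal only if it hits the mirror partner.
        return drives // 2, 1
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    for level in ("5", "6", "50", "10"):
        data, survived = raid_summary(level)
        print(f"RAID {level}: {data} data disks, always survives {survived} failure(s)")
```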

    If you go and look at controllers since say 2006 (performance testing), you'll notice that with the same disk count and stripe size, 5 is faster than 10.

    It's not that hard to fathom IMHO, as the throughput has more drives to provide it via increased parallelism vs. 10, where half of the members are "lost" to mirroring; especially so during writes (the silicon on the controllers has drastically increased its speed, so the parity calculations can now be performed fast enough that the additional parallelism can be leveraged).

    Where 10 will have an advantage, is in rebuild times. But that's not nearly as necessary in this case as the day to day performance increase that parity based arrays can offer for the same member count is more important in this instance.

    For this particular industry, I've found the budget to be the dictating factor over organization.

    Of course not. But the budget and thought patterns of such users tend to make it a reality.

    Specifically,
    1. They tend not to have the funds for multiple pools on a hardware controlled RAID system of sufficient throughput for their needs (independents, SOHO, or very small SMB's that don't have much capital - think about the budget in SAN thread we've posted in recently).
    2. They also tend to still think like consumer users until/unless they're shown other possibilities. Unfortunately, even if they realize this and how it can benefit them, the budget limitations tend to prevent this on a suitable scale for their specific needs (do see it with say an inexpensive SSD used for Photoshop scratch space, but not a pair of hardware controlled RAIDs, one for RAW files + separate pool for working data).

    Ideally I agree.

    Unfortunately, this would take 2x the storage system that's already been mentioned, which nearly doubles the budget (IIRC, the ARC-1223 cannot handle SAS Expanders, so a different card would be needed).

    So compromises have to be made, which usually works out that the best compromise is to share the pool for both RAW files and working data. Not ideal, but workable within the budget in most cases I've seen.
     
  19. wonderspark macrumors 68040

    Joined:
    Feb 4, 2010
    Location:
    Oregon
    #19
    Given eight identical disks that yield 138MB/sec each in a RAID 10 vs RAID 6, which is what we're talking about here, I say RAID 10 will max out at ~550MB/second, because you can only be writing to four of the eight disks at once.

    My RAID 6 already proves faster than that in both reads and writes, and ANY TWO disks can fail without data loss. Further, I have 12TB of usable space, whereas RAID 10 only gives 8TB using the same hardware.

    I really don't understand why people still think RAID 10 is better. Less space, less speed, less redundancy. I've pulled two of my eight drives while editing DSLR footage to test the system, and it kept on going fine. Pull both drives in RAID 1 of the RAID 10 set and you've lost all your data.
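
    The figures in this post follow from simple per-disk scaling, sketched here in Python. Linear scaling with the number of data disks is an assumption; real controllers lose some throughput to parity work and overhead.

```python
def ceiling(data_disks, disk_mb_per_s=138, disk_tb=2):
    """Return (write ceiling in MB/s, usable TB) for a given data-disk count.

    Assumes throughput scales linearly with the number of data disks.
    """
    return data_disks * disk_mb_per_s, data_disks * disk_tb


raid10 = ceiling(8 // 2)  # four mirrored pairs      -> (552, 8)
raid6 = ceiling(8 - 2)    # six data + two parity    -> (828, 12)

if __name__ == "__main__":
    print("RAID 10:", raid10)
    print("RAID 6: ", raid6)
```

    The 828MB/s estimate for RAID 6 sits close to the 816MB/s write speed reported for the Areca setup earlier in the thread.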
     
  20. sn1p3r845, Aug 21, 2012
    Last edited: Aug 21, 2012

    sn1p3r845 thread starter macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #20
    Alright

    I ordered 9x WD RE4 2TB's yesterday, and I'll do Raid5 or Raid6 (is raid6 faster?).

    Thanks for the help guys :)

    oh btw, this is my enclosure.
    http://store.sansdigital-shop.com/totr8baysass8.html
     
  21. JavaTheHut macrumors 6502

    Joined:
    Aug 15, 2010
    #21
    Note: Those Red drives have only been tested in 5-bay NASes, if that makes any difference? Please post back your findings; I was thinking about trying these out soon in an 8-bay as well.

    GdLk
     
  22. sn1p3r845 thread starter macrumors regular

    Joined:
    Feb 9, 2012
    Location:
    Vancouver, BC
    #22
    ****, typo..

    I meant to say WD RE4's.
     
  23. wonderspark macrumors 68040

    Joined:
    Feb 4, 2010
    Location:
    Oregon
    #23
    I have the same enclosure, but bought a different RAID card separately.

    RAID 6 is like RAID 5 with one additional parity drive, so it allows two drives to fail instead of only one. With RAID 5 x8 disks, you effectively have seven disks of data space and speed, with one parity disk. With RAID 6, you have six disks of data space and speed, with two parity disks. In both, parity is written to all eight disks, so any of them can fail, and the replacement will rebuild. I prefer the extra redundancy of a second disk, and the speed meets my needs, so I chose RAID 6.
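
    A minimal Python sketch of the single-parity idea, for anyone who wants to see why any one disk can be rebuilt. It shows XOR parity as used by RAID 5 only; RAID 6 adds a second, differently computed parity block that this sketch does not cover. The rotation order in `parity_disk` is a made-up example, since the actual layout depends on the controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe, disks):
    """Disk holding parity for a stripe; it rotates so every disk takes a turn."""
    return (disks - 1 - stripe) % disks


if __name__ == "__main__":
    data = [b"RAW1", b"RAW2", b"RAW3"]  # three data blocks in one stripe
    parity = xor_blocks(data)           # the block written to the parity disk
    # Lose the second block, then rebuild it from the survivors plus parity.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
    print([parity_disk(s, 8) for s in range(8)])
```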
     
  24. deconstruct60 macrumors 604

    Joined:
    Mar 10, 2009
    #24
    There is nothing in RAID 10 that limits you to four. You have to use more disks, but you can have a higher stripe level and still be a RAID 10 implementation. What is being touted as "faster" has nothing to do with RAID. A JBOD array with 6 volumes and 6 pulls from each of those volumes would aggregate to a larger number than both of those.


    Unless you select all the disks from the same manufacturing batch, that is a relatively rare occurrence. Usually, it is about as likely as a failure of some other component in the system, for example when having 4 or more disks on a single connector.

    It also becomes a factor when the rebuild times are extremely long... which is exactly this case, since with a 6+2 setup you need to pull data from 6 drives to reconstruct the 2. At 2TB apiece, that is on the order of 12TB of data to grab.
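
    A simplified Python sketch of the rebuild volumes being compared. It follows the post's own model (a mirror rebuild copies one twin, a 6+2 rebuild reads six drives) and assumes an idle array running at a hypothetical 138MB/s per disk, so treat the time as a lower bound.

```python
DISK_TB = 2


def rebuild_read_tb(level, disks=8, disk_tb=DISK_TB):
    """TB read from surviving disks during a rebuild, per the post's model."""
    if level == "10":
        return disk_tb                # copy the surviving mirror twin
    if level == "6":
        return (disks - 2) * disk_tb  # read as many disks as there are data disks
    raise ValueError(f"unsupported level: {level}")


def copy_hours(tb, mb_per_s=138):
    """Hours to stream `tb` terabytes at a sustained rate (decimal units)."""
    return tb * 1_000_000 / mb_per_s / 3600


if __name__ == "__main__":
    for level in ("10", "6"):
        print(f"RAID {level}: read {rebuild_read_tb(level)} TB to rebuild")
    print(f"Copying one {DISK_TB}TB disk takes about {copy_hours(DISK_TB):.1f} hours")
```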


    System-wise, RAID 5/6 also has less redundancy. As I pointed out, you can put the mirrored components of a RAID 10 array on different paths. If the RAID controller supports dual components, that is an additional redundancy you don't have if you layer the single set across two (or more) connection paths.

    RAID 10 can either checksum the read data on the fly or "cheat" to squeeze out faster writes. In the first case, both mirrored blocks are read to confirm they are the same before being sent on. Parity RAID can't do that at "real time" speeds. In that state the data is constantly being checked for bitrot (similar to ZFS and other systems that put a premium on data maintenance).

    In the second case, for long streaming sequences (and long random request queues) a smart controller can effectively double the stripe width by reading ahead by looking in the other half of the content. For highly read weighted loads they

    On a different dimension, the rebuild times are faster. In the single-disk case (especially if you have a hot spare) it will very likely be rebuilt before another failure comes along (all the while under full load). Since it is basically a disk copy, it goes faster. Also, if you have dual paths with mirroring divided across the paths, you basically have an "express highway" for making that happen. Whereas coordinating multiple reads (all remaining in the set )

    RAID 10 doesn't necessarily have the "write hole". There are additional expenses in keeping a parity array powered on all the time.



    It keeps going, but speed will have dropped significantly.

    There is a reason why high OLTP loads go for RAID 10 in a significant number of cases before the parity options. It is a question of service levels under diminished capacity.


    Not really for RAID 10. You have to very carefully select the two-drive failure for it to be a loss of service. Disks don't choose their partner in failure. The paired twin could fail, but with 7 disks remaining it is only a 1/7 (14%) chance of a rare co-occurrence. In other words, you have a small probability already and you're multiplying it by 14%. That's going to be an even smaller probability.
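
    The 1/7 figure can be confirmed by brute force in Python: enumerate every possible pair of failed disks in an 8-disk RAID 10 and count the pairs that take out both halves of one mirror. The disk numbering (disks 2k and 2k+1 form mirror k) is a hypothetical convention chosen for the sketch.

```python
from fractions import Fraction
from itertools import combinations


def fatal_two_disk_fraction(disks=8):
    """Fraction of two-disk failure pairs that destroy a RAID 10 of two-way mirrors.

    Assumes disks 2k and 2k+1 form mirror pair k (a hypothetical numbering).
    """
    pairs = list(combinations(range(disks), 2))
    fatal = [p for p in pairs if p[0] // 2 == p[1] // 2]
    return Fraction(len(fatal), len(pairs))


if __name__ == "__main__":
    share = fatal_two_disk_fraction()
    print(f"{share} of double failures are fatal ({float(share):.1%})")
```

    For comparison, no two-disk combination is fatal in RAID 6, which is the point the other side of the argument is making.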


    Extremely wide stripe widths (>= 4 ) are in the dubious zones. That's why there are RAID 50 , 60 , 100. Another layer of nesting often makes more sense.

    Superwides are good for crotch grabbing, smack talking, drag racing rigs. 24/7/365, highly critical data storage isn't run that way.
     
  25. deconstruct60 macrumors 604

    Joined:
    Mar 10, 2009
    #25
    Holding both disk count and stripe size constant, it is going to be rather difficult to compare across the major RAID levels. Unless you are talking about using various subsets of the total disk count, that's one of the major trade-offs between the major families.


    Actually it is hard to fathom because with 5 there are the same two "parity" disks as in 10 and with 6 there are actually more disks to write to than in 10 (and 5). Being "afraid" of the parity calculation has nothing to do with selecting RAID 10. Well in the normal mode that is. In diminished capacity and build phase, it makes a huge difference.




    Unless you are on a deadline and have to get the latest dailies out to your customers in the next couple of hours. ........ but the RAID array is thrashing away at a super-wide rebuild.


    It doesn't. The RAW files soak up the same amount of storage. The only reason you are pushing toward 2x is that you want to boost the width of the "intermediate" array. As I pointed out, you don't necessarily have to boost the width of that one if you can use faster components. If you are using drives that are 30-85% faster, then you don't need to appeal to stripe width as much. There are more ways of tackling rotational disk latency than just adding more spindles. Dogmatic adherence to "as big a uniform set of disks as possible" often leads to suboptimal solutions given the modern range of components that can be engaged.
     
