Partitioning large RAIDs

Discussion in 'Mac Pro' started by jdv, Aug 15, 2010.

  1. jdv macrumors newbie

    Joined:
    Oct 18, 2007
    #1
    Hi,

    What are the considerations when partitioning large external RAIDs? I've just put 2TB drives in my three RAID 5 boxes, so I'm now thinking about how best to organise them. I'm a photographer and filmmaker working with large files in Photoshop and Final Cut. This is what I've got now:

    Boot/application drive,
    6TB internal RAID 0 for fast still image work,
    18TB SAS run off an Areca 1680x mainly for fast video processing work,
    7.5TB and 20TB eSATA boxes run off a HighPoint 2314 for backup and archiving.

    Should I partition? If so, in what way?

    Cheers,

    Jonathan
     
  2. Sean Dempsey macrumors 68000

    Sean Dempsey

    Joined:
    Aug 7, 2006
    #2
    Partitioning makes no sense in your scenario.

    Unless you need to run Boot Camp, or want to be able to format only a portion of your RAID, there's no point.

    I could maybe see someone partitioning if they wanted to be able to defragment every few weeks, but your files are so large it doesn't matter.


    So no, don't partition.
     
  3. matteusclement macrumors 65816

    matteusclement

    Joined:
    Jan 26, 2008
    Location:
    victoria
    #3
    Holy mother of god. That is a LOT of storage!!
    I thought my 12TB was over the top....
     
  4. dknightd macrumors 6502

    Joined:
    Mar 7, 2004
    #4
    I think you should partition - not the RAID itself, but where the RAID is and how it's used.
    A backup disk on a local machine is a smart thing. Even better, if possible, is a second backup on a different machine (ideally in a different location).
     
  5. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #5
    Please be aware that the HighPoint 2314 is not adequate for RAID 5. It's a Fake RAID controller (software + system resources manage the array), and it can't actually handle the write hole issue associated with parity-based arrays, as it has no NVRAM solution or the recovery options found on proper cards (separate cache and processors are a major clue; these typically start in the $300 range, such as the ARC-1210). This means your data is at risk, and you're going to get burnt at some point if you continue to do this (a question of when, not if), even when used with a proper UPS system. And the problem gets worse the larger the member count. So please keep this in mind.
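    To make the write hole concrete, here's a toy sketch (illustrative Python, not how any particular controller works): a RAID 5 stripe keeps XOR parity of its data blocks, and if power dies after a data block has been rewritten but before the parity has been updated, a later rebuild from that stale parity silently produces garbage.

```python
# Toy model of a 3-disk RAID 5 stripe: two data blocks plus XOR parity.
# Illustrates the "write hole" -- a simplified sketch, not any real
# controller's on-disk layout.

def xor_parity(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

# Initial consistent stripe.
d0 = b"AAAA"
d1 = b"BBBB"
parity = xor_parity(d0, d1)

# Update d0, but "crash" before the parity is rewritten (the write hole).
d0 = b"CCCC"
# parity = xor_parity(d0, d1)  # <-- never happens: power was lost here

# Later, disk 1 dies. Rebuild d1 from d0 and the (stale) parity:
rebuilt_d1 = xor_parity(d0, parity)
print(rebuilt_d1 != b"BBBB")  # True -- the "rebuilt" data is corrupt
```

    An NVRAM-backed card can replay or discard the half-finished stripe update after power returns; a software implementation has nowhere safe to record it, so the mismatch goes unnoticed until a rebuild.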

    With that controller, you'd be better off running a JBOD to minimize the system resource needs (multiple level 10 arrays would increase the CPU % required to manage them, but may be another option if you can deal with the increased load).

    Otherwise, if you wish to stick with RAID 5, you'd either have to add it to the existing Areca (SAS expander capable enclosures; they still work with SATA disks), or a separate card. Adds cost either way (including enterprise grade disks), but is at least secure (you'd still want to break up the sets into smaller member counts though; say 8 members, 12 max).

    As per the PM enclosures, what are these disks actually used for?
    Primary storage or backup? Or both (separate arrays)?

    Now for the more general bit...

    You can partition an array, but this can affect performance (it's done intentionally for a boot volume or other high-performance requirement, to keep that data on the outermost tracks). But if you're using all the partitions for data, it slows down as you go inwards (even if the volume is the entire capacity). Worse, simultaneous access will slow down the avg. throughput of any of those partitions, as the data is still on the same disks.

    Without further information, there's not much more that can be offered ATM. :(
     
  6. jdv thread starter macrumors newbie

    Joined:
    Oct 18, 2007
    #6
    Hi,
    Thanks for your comments. Your expertise is very much appreciated.
    A bit more info:
    My primary 'Pictures' folder is on the internal RAID 0 in my Mac Pro.
    My primary 'Video' folder is on the external SAS RAID 5 controlled by the Areca.
    The 10 drive (20TB) external PM box is used for back-up from these two sources.
    The 5 drive (7.5TB) external PM box is used for off-site back-up of key folders of the same material. I never have the two PM boxes plugged in at the same time.

     
  7. jdv thread starter macrumors newbie

    Joined:
    Oct 18, 2007
    #7
    So,
    For the record:
    Writing a 7GB file to the Areca SAS RAID required an average of approximately 5% CPU usage, whereas writing the same file to the PM RAID 5 via the HighPoint 2314 card took 15% CPU usage (both on a 2.8GHz 2008 8-core Mac Pro). The file was just dragged and dropped in the Finder.

    The write hole issue remains, but the CPU load isn't too disastrous for the HighPoint.
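    For anyone wanting to repeat a test like this, here's a rough sketch (Python; the path and size are placeholders): it writes a large file sequentially and reports throughput plus the CPU time of the writing process itself. Note this only measures the writer process, not the kernel/driver time included in Activity Monitor's overall CPU figure, so numbers won't match a Finder drag-and-drop exactly.

```python
import os, time

# Point TARGET at a path on the array under test; use a much larger
# SIZE_MB for a realistic run (the original test used a 7GB file).
TARGET = "/tmp/raid_write_test.bin"
SIZE_MB = 256
CHUNK = b"\0" * (1 << 20)  # 1 MiB per write

cpu0 = time.process_time()
t0 = time.monotonic()
with open(TARGET, "wb") as f:
    for _ in range(SIZE_MB):
        f.write(CHUNK)
    f.flush()
    os.fsync(f.fileno())     # make sure the data actually hit the array
elapsed = time.monotonic() - t0
cpu = time.process_time() - cpu0

print(f"{SIZE_MB / elapsed:.1f} MB/s, "
      f"{100 * cpu / elapsed:.1f}% CPU (this process only)")
os.remove(TARGET)
```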
     
  8. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #8
    Given the specifics, this setup makes me nervous.

    Now let me explain:
    Pictures:
    RAID 0 isn't the safest thing for primary data, but it can be acceptable if you've a proper backup solution (not just a source for it, but also the interval at which backups are made). The reason is, some data is going to be lost if you were working on something when it failed (the time between the last completed backup and the array failure). This can be acceptable if you've the time to deal with fixing the array, restoring the data you have on the backups, and redoing the lost work.

    Where the problem is for me, is that your backup solution is software-based RAID 5. Since in this case the data is written far more than it will be read, you run the risk of that data being corrupted during a write. This will render the data useless to you - a significant loss, and not acceptable to my way of thinking (I think of the R in RAID, which means Redundant). A level 5 is redundant in its basic premise, but it's not secure enough when done by a software implementation.

    That's what I was getting at with the HighPoint 2314. You can reduce this risk with a good UPS system (MP, monitor, and both PM enclosures; you should be running one for the Areca anyway). But it's still not as secure as the Areca you have (since it contains an NVRAM solution to the write hole issue).

    Video:
    This seems fine, but what are the particulars?
    • Model of Areca card
    • Enclosures used and cabling (particularly the drive count)

    I ask, as I'm wondering if the Areca could be better utilized (i.e. run a separate SATA array in level 5 for the Primary if there's sufficient ports, or possibly via SAS expanders <they will run SATA disks as well>), and change the PM enclosures to a JBOD configuration.

    Please note, that with the Areca, the drives must be enterprise models for the recovery timings (consumer units are set at 0,0 and enterprise at 0,7 <read/write respectively, and values are in seconds>).

    Backups:
    You need to get these swapped out of a software implementation of level 5, as the write hole issue nullifies the presumed redundancy. Corrupted data is useless, so why make sure the corruption survives if a disk dies?

    You'd be best served IMO by a JBOD configuration, as the risk is that of a single disk. The "downside" is it also has the performance of a single disk, so the time needed to complete the backup will be increased. But as speed isn't the primary concern here, it's an acceptable trade-off for your needs.

    This is the primary problem with your setup as I see it. If you change this over, your primary arrays will have sufficient protection. But not as your backup is currently configured.

    BTW, what's the OS location?

    See above.

    RAID 5 is fine for reads, but it's the writes that are the danger. Since the primary usage of the backup system is to write data to it, your risk that corruption will occur during a backup is higher than with other forms (1/10/JBOD).

    Unfortunately, you won't know it either, and won't discover it until you've had a catastrophic failure of one of the primary arrays, and restore the data. At that point, it's too late. Some or all of the data is garbage = gone. :eek: :(

    I'm under the impression you've a second copy (duplicated to each PM enclosure), and you're betting on the same data not being corrupted twice over. That's better than a single source under certain conditions (backups run separately). But if both backups are written simultaneously (which can be set in backup software I'm familiar with), the corruption will be duplicated. :eek: Quite an ugly situation to find yourself in.
     
  9. advres Guest

    advres

    Joined:
    Oct 3, 2003
    Location:
    Boston
    #9
    Is any of your backup off-site? Think fire and water damage. Without a complete off-site backup you're taking a big risk, especially since this is your career.
     
  10. jdv thread starter macrumors newbie

    Joined:
    Oct 18, 2007
    #10
    Okay, a very important point.
     
  11. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #11
    This would work, and no need to buy anything (SAS expanders and/or additional enclosures; you can find enclosures with the SAS expanders included internally).

    The box and card are fine, but the disks = not so much. There's more to them than just the TLER values, as they've additional sensors and better specifications. The sensors monitor the mechanics in order to prevent the heads from physically hitting the platters. The ratings are important as well, such as an MTBF of 1.2M hrs (most consumer units are ~ 800k hrs) and an Error Rate of 1E15 (most consumer disks are 1E14).
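    To put that Error Rate spec in perspective, here's the back-of-envelope math (illustrative Python; the 8 x 2TB array below is just an example, not your exact config). An unrecoverable read error (URE) rate of 1 in 1E14 bits works out to roughly one bad sector per 12.5TB read, which matters because a rebuild has to read every surviving disk end to end.

```python
# Expected unrecoverable read errors (UREs) during a full rebuild,
# comparing typical consumer vs enterprise spec sheets.

URE_CONSUMER = 1e14    # bits read per error (typical consumer spec)
URE_ENTERPRISE = 1e15  # bits read per error (typical enterprise spec)

def tb_per_error(ure_bits: float) -> float:
    """Terabytes readable, on average, before hitting one URE."""
    return ure_bits / 8 / 1e12

def expected_errors(read_tb: float, ure_bits: float) -> float:
    """Expected URE count when reading read_tb terabytes."""
    return read_tb * 8e12 / ure_bits

# Rebuilding a failed disk in an 8 x 2TB RAID 5 means reading the
# 7 surviving disks (14 TB) in full:
print(tb_per_error(URE_CONSUMER))           # ~12.5 TB per error
print(expected_errors(14, URE_CONSUMER))    # ~1.12 expected errors
print(expected_errors(14, URE_ENTERPRISE))  # ~0.11 expected errors
```

    In other words, with consumer-spec disks a large rebuild is more likely than not to hit a read error somewhere; the enterprise rating pushes that an order of magnitude lower, on top of the TLER and sensor differences.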

    In simple terms, the enterprise drives are made to be abused in high availability conditions (operational 24/7/365). I've been under the assumption that you're earning a living with this system (given the Areca card), so it would be in your own best interest to swap out the disks (do it one at a time by installing it physically, declaring it a hot spare, pulling a disk, and let it rebuild; repeat until all drives have been replaced).

    And BTW, I've not had good luck with Hitachi's consumer disks, especially in RAID (different P/N's <1TB units>; they didn't work with the Arecas when the firmware revision was 1.45). Nor did they work under CalDigit's RAID card (what a pile of junk). They just weren't stable under RAID, and they died rather quickly when re-configured (single disk or software implementation of JBOD). Hitachi's support was also very unhelpful with the CalDigit situation.

    So I just don't have much faith in their consumer disks. Their enterprise SAS disks are another story.

    See above. ;)

    It's a big deal.

    Take a look here, and see if it helps you understand what's going on. :)

    You recover the data stored on the remaining drives, which is still intact, with recovery software after connecting them separately (i.e. remove the JBOD from the controller/system, access each disk as a single disk, then run the software).

    Just adding an empty drive won't do anything with JBOD implementations, as there's no automatic recovery (no place to get/rebuild the lost data from). That's what 1/10/5/6/50/60 do. But for parity (5/6) and nested parity (50/60), you need to be running a proper RAID card, not the 2314 you've been using. You can try to add it to the existing JBOD, but the data that was on the DOA disk is gone, unless it's in a separate location.

    But since it's intended as a backup location (primary data on the Areca), it's sufficient (the JBOD set appears as a single large volume), as it's not a primary set and isn't accessed nearly as often (which reduces wear-and-tear on the drives in the set).

    This is fine if you wish, and will have much better random access times for loading applications (particularly any libraries they may require).

    General info:

    This will help, but it's still not the same as running a level 5 on the Areca (reason is the parity is built off of the data, and if that data is corrupt, the parity information created would result in corrupt data if it's used). This isn't a minor difference that you can ignore.

    Simply put, software RAID 5 implementations CANNOT deal with the write hole issue. Period.​

    Now, based on the idea that your PRIMARY DATA remains on a stripe set:

    If you continue to use the 2314 for RAID 5, you're going to get burnt (presuming the data is absolutely critical = getting paid to produce it, not just personal value = DATA loss is NOT an OPTION). Seriously. Stop using the 2314 for RAID 5.

    If you don't mind losing data that may have taken years to produce, go ahead and stick with what you're doing. :eek: :p

    Now if you place all the PRIMARY DATA on the Areca's RAID5:
    You can leave the backup system as is. :eek: :D

    The reason is that the PRIMARY DATA is on a proper RAID 5 installation (it has the NVRAM solution to the write hole), and the likelihood of a total failure of both (data cannot be recovered) is very low. During a full backup, any corruption on the software RAID 5 backups will usually be corrected by the Areca (the data fed to the 2314 is correct; this won't happen during an incremental though).

    Leaving the current backup situation as is, allows that system to rebuild the data on the disk that died, unlike a JBOD (and assuming only one disk failed).

    But understand, to do this, the PM enclosures as well as the one used for the Areca must be on a UPS (as well as the MP and monitor), and the parity check enabled on the 2314.​

    This last method is also rather easy to do (less work).

    But you need to get enterprise disks on the Areca, and make sure the MP, monitor, and all the external enclosures are running on a good UPS (ideally an Online type; more expensive, but well worth it - it gives better protection, as there's no switching involved). The next step down (and the bare minimum) is a Line Interactive unit (switches between battery and wall power, but should have what's called a step transformer that pulls the output voltage up under low wall-voltage conditions). As the Online type always runs off the battery and inverter, this isn't an issue for it - it always outputs the spec voltage as long as it's not overloaded.

    Anything under that, avoid it like the proverbial plague. :eek:
     
  12. jdv thread starter macrumors newbie

    Joined:
    Oct 18, 2007
    #12
    Hey Nanofrog,

    That's an absolutely amazing response. Really very much appreciated.

    I'll take your advice and get a UPS. Will wait and see on the enterprise drives for financial reasons (as far as I can tell from the storage forums and from Areca's own drive list, only the 2TB Hitachi enterprise drives are reliable in RAID and they cost £225/$350 each).

    By the way, probably old news for you but did you see this:
    http://www.tomshardware.com/reviews/hdd-reliability-storelab,2681.html

    Thanks once again. I definitely owe you but I can't think of how to repay.

    Cheers,

    Jonathan
     
  13. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #13
    :cool: NP. :)

    The WD2003FYYS works too on the SAS cards (RE-4 = 2TB @ 7200rpm, not the RE4-GP, which is the green version). Span.com (link is to the drive, not the main page) sells them for £207.98 (VAT included), which is less than the Hitachis. :D

    BTW, here's the link to the HDD Compatibility List (.pdf file), and I'm glad you knew to check.

    I've seen the link, but note that the information covers consumer models, not enterprise - which renders it moot in this situation IMO. And even they themselves state there are limitations to the results (small sample sizes, and not all models covered).
     
  14. slater-k macrumors regular

    Joined:
    Jan 13, 2008
    Location:
    London
    #14
    You can also get them at scan.co.uk for a tad less.

    And I would definitely go with nanofrog on the enterprise drives, because if you do have a problem it's one less thing to investigate, what with there being so many crucial parts to the system (I had a problem with an enclosure that only showed itself every month or so and was very hard to track down).

    Cheers
     