Looking for advice on an NAS RAID1 solution

Discussion in 'Mac Accessories' started by sportsfrk214, Apr 19, 2015.

  1. sportsfrk214, Apr 19, 2015
    Last edited: Apr 19, 2015

    sportsfrk214 macrumors 6502a

    Joined:
    Sep 18, 2007
    #1
    I'm upgrading from Windows to my first Mac in a couple months, and I figured in the meantime I might as well upgrade my storage solution. I need some real advice though because I'm new to the idea of an NAS solution and I could use some real help.

    I currently use two 2TB external hard drives to back up my data. Every month or so, I make identical changes on both of them so that they mirror each other. I understand this is a time-consuming way of doing things, but I'm paranoid about my data and I never knew about RAID so this is how I'm doing it.

    I've recently learned about RAID, and RAID1 specifically. For my usage, it makes a lot of sense for me to have a setup with two hard drives (at least 4TB each) running a RAID1 configuration.

    So basically, I'm looking for advice on what the best device to use is. Obviously, it needs to be compatible with Mac. It doesn't need any fancy backup software; I prefer to manage what gets backed up on my own. I want something where I can drag and drop onto the hard drive from my MacBook. The only requirement software-wise is that I need to be able to password-protect the entire drive, or at least parts of it.

    The other thing that does confuse me is this concept of NAS. I understand that this means I can read/write to the hard drive over the internet, but is this a time efficient process? I have approximately 20 Mbps internet connection in my house, and I feel like I wouldn't be able to transfer data wirelessly as quickly as I would with a wire. I wouldn't be transferring data often (every month or so), so I would be perfectly fine with a device that allowed me to plug my laptop into it when I wanted to do major data transfers. I'm guessing USB3?

    Finally, do these devices constantly run? I'd like one that I'd be able to turn off because I may go weeks at a time without using it.

    Would appreciate any advice anyone could give on what I should buy. I've seen brands like Synology and WD, but it's very overwhelming for a newbie. Thanks!
     
  2. Mikael H macrumors 6502

    Joined:
    Sep 3, 2014
    #2
    If I understand your post correctly, your use case may be easiest to fulfill by using TimeMachine with two different (and sufficiently large) USB drives. TimeMachine is an integrated part of the operating system and is very uncomplicated to set up: basically you assign an external (or network) drive as a backup target, and then you choose what directories, if any, you want the program to avoid when backing up your computer.

    A NAS would be overkill in this case. Add to this that RAID is not the same as backup. RAID 1, which you write about, is a lot safer than RAID 5, but storing backup data on separate volumes is the safest method to ensure restorability.

    An alternative, if you want a more convenient (and therefore more regular) backup procedure, would be to purchase one USB drive and one TimeCapsule. Your computer would take regular backups to the TimeCapsule every time it was connected to your wireless network and to a charger, and then you could use the USB drive for backups at larger intervals at your convenience.
    A similar setup can be had using some NAS-boxes, but the user friendliness and the reliability of these third-party solutions varies a bit.
     
  3. ColdCase, Apr 19, 2015
    Last edited: Apr 19, 2015

    ColdCase macrumors 68030

    Joined:
    Feb 10, 2008
    Location:
    NH
    #3
    Yeah, one of your apps could mess up the file system and corrupt or overwrite the storage, and because you are mirroring with RAID1, both drives will be corrupted/overwritten. For backup, RAID1 doesn't buy you anything. It does buy some 24/7 uninterrupted data availability, as the user can keep working during recovery from a drive failure.

    As suggested, it's better to use the built-in OS X Time Machine capability and two drives. Set up Time Machine to alternate backups between them. Another option is to use something like CCC (Carbon Copy Cloner) to back up your data. I use the built-in Time Machine to back up my internal drives and CCC to back up external drive data to other external drives.

    If you don't want the cable clutter, remember that only Time Capsules, the AEBS (AirPort Extreme Base Station), and OS X Server are approved for Time Machine backups, so a NAS is not going to help you unless you use a TC. Use the TC's internal drive for one backup, then plug a USB drive into the TC and use it for your second.

    If you want to spend a little more money for a cleaner network storage solution that is also compatible with time machine, think about a refurb/new/used mini (or any mac computer) running OSX server and attached drive(s).
     
  4. glenthompson macrumors 68000

    Joined:
    Apr 27, 2011
    Location:
    Virginia
    #4
    As stated, RAID is not a backup, it is for fault tolerance. Ideally, you should have at least 3 copies of your important data with one of them being off-site. If you have the bandwidth, an on-line service like CrashPlan or similar is the easiest way to implement an off-site plan.

    I use TM, CCC, and CrashPlan to provide different backups with different capabilities.
     
  5. sportsfrk214 thread starter macrumors 6502a

    Joined:
    Sep 18, 2007
    #5
    Thanks for all the replies, I do appreciate the information. I'm new to this storage stuff so forgive me if some of the things I say sound dumb.

    Maybe someone can explain to me why RAID isn't backup? As I said, right now I'm taking all my files and manually transferring them to 2 separate external hard drives. I'm essentially mirroring my data by myself. The thinking is that if one of my external drives fails, I'll still have the other and then I can buy a new drive to replace the old one.

    As I understood, if you have a two hard drive setup, RAID1 allows you to have each drive mirror their data. So if I transfer a file to my external setup, it'll transfer it identically to each hard drive in the setup. This would seem to be a more efficient way of doing what I already do.

    So what am I missing? What I'm saying must be wrong, I'm just trying to understand why. Thanks ;)
     
  6. ColdCase, Apr 19, 2015
    Last edited: Apr 20, 2015

    ColdCase macrumors 68030

    Joined:
    Feb 10, 2008
    Location:
    NH
    #6
    For your use case you are simply constructing a RAID1 and using it as a backup device. Your data is on your computer and you are backing up to the RAID1 box.

    That's the way we used to back up, but now that Time Machine supports multiple destination disks, we don't usually set up a RAID1 mirror for the backup destination. We just configure TM to use the two (or three) drives. One fails and you have the other with your data. When you replace the failed unit, just configure TM for it and perhaps do a full backup. But I have to admit that on one machine I have a RAID1 mirror set and use CCC to routinely back up my working data libraries to it... belts, suspenders... whatever.

    Just another way to do it, either way should work fine for you.

    There is a tendency here for a knee-jerk reaction when folks mention RAID1, as usually they are thinking of mirroring their working drive as the backup strategy. That's the scenario that's not a good backup strategy... but that's not what you are doing.

    By the way, a NAS is network attached storage (wireless/ethernet), whereas a DAS is directly attached storage (USB, Thunderbolt). I think you are asking about DAS. If you are indeed talking about a NAS with RAID1 as a Time Machine destination, that's not going to work in a configuration approved by Apple, although there have been quite a few here that have had luck doing it anyway. I'd use something like CCC for that, however.
     
  7. Mikael H, Apr 21, 2015
    Last edited: Apr 21, 2015

    Mikael H macrumors 6502

    Joined:
    Sep 3, 2014
    #7
    RAID1 is not backup in and by itself, because what it does is simply to ensure that the same changes are made across all disks in the RAID set. This means that not only backup data but also any destructive file system changes are replicated.

    Effectively, compared to your current setup, you'd go from [main storage + two backups] to [main storage + one highly available backup], and the latter simply isn't as safe as the former, even though it does account for some failure conditions.
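
    To make the mirror-versus-backup distinction concrete, here is a minimal toy sketch. The `Mirror` class, the file name and the dictionary "disks" are all made up for illustration; real RAID works on blocks, not files. It only shows that a mirror survives a dead disk but faithfully replicates a deletion, while a separate copy does not.

```python
import copy


class Mirror:
    """Toy RAID1: every change is applied to all surviving member 'disks'."""

    def __init__(self):
        self.disks = [{}, {}]

    def write(self, name, data):
        for disk in self.disks:
            if disk is not None:
                disk[name] = data

    def delete(self, name):
        # A deletion is just another change, so it is mirrored too.
        for disk in self.disks:
            if disk is not None:
                disk.pop(name, None)

    def fail_disk(self, index):
        # Simulate a hardware failure of one member.
        self.disks[index] = None

    def read(self, name):
        for disk in self.disks:
            if disk is not None and name in disk:
                return disk[name]
        return None


mirror = Mirror()
mirror.write("photos.zip", "v1")

# A separate backup is a point-in-time copy kept apart from the mirror.
backup = copy.deepcopy(mirror.disks[0])

mirror.fail_disk(0)
after_disk_failure = mirror.read("photos.zip")  # served by the surviving disk

mirror.delete("photos.zip")                     # accidental deletion
after_delete = mirror.read("photos.zip")        # gone from the mirror
from_backup = backup.get("photos.zip")          # still present in the backup

print(after_disk_failure, after_delete, from_backup)
```

    The mirror handles the hardware failure, but only the separate copy helps after the deletion.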


    Essentially: To keep the high level of data security you have today, you'll still want to make backups to two separate backup targets. OS X and some optional Apple-produced hardware (namely the TimeCapsule) allow you to do so more or less automatically with very little administrative overhead and without resorting to third-party backup solutions. You can also go the cheaper route of connecting separate hard drives to your computer at somewhat regular intervals, or, as I mentioned in my previous post, a combination, with one TimeCapsule (or other network-connected) backup target and one or more USB disk(s) (which may be stored at a separate location when not in use).


    For reference: I myself run a Linux server as a TimeMachine target, in a technically similar but more flexible manner compared to running backups to a NAS box. While this has worked flawlessly for me, this is not something I would recommend to J Random End-user: The support simply isn't there from Apple's perspective, and maintaining a Unix-like server does require some more knowledge than one can reasonably expect an end-user to have.
     
  8. glenthompson macrumors 68000

    Joined:
    Apr 27, 2011
    Location:
    Virginia
    #8
    If you accidentally delete a file, it's deleted from both drives. If the power supply in the NAS craps out and destroys both drives, you have no backup. If someone steals the NAS, same story. If it burns up in a fire? Get the idea?

    I have a NAS that I have set up as RAID 1, but it is for fault tolerance. I take the unit with me in my motorhome, and recovering from a single drive failure on the road can be problematic. The NAS stores all my media files as well as backups for our two Macs. All the data on it is backed up at home and to CrashPlan.

    When designing a backup strategy you need to identify the possible failure modes, assess the risk, and determine how you can recover from them. One common scenario in corporate disaster recovery testing is to recover from a data center loss that also includes certain key people being unavailable. Can you get things running if your main DB/2 expert is dead?
     
  9. AFEPPL macrumors 68020

    Joined:
    Sep 30, 2014
    Location:
    England
    #9
    Raid 1 is safer than raid 5 - that's a new one!!!!

    Raid 1 is a mirror set: both drives operate as a single unit and the guest stores the data over both drives. The downside of raid 1 is it has to write everything twice, so the write performance goes down compared to other versions of Raid. On the flip side, Raid 5 uses a parity stripe to provide data protection, so does not have the same write penalty. Read performance of a Raid set is directly related to the number of drives in the set. With any Raid it's important to use a dedicated Raid controller and not software raid: software raid puts load on the device, it's not as sophisticated, and some of the better hardware solutions have battery-backed cache too.

    Both Raid 1 and 5 have the SAME level of protection, they can both survive a single disk failure.
    Raid does not protect against deletion or damage to the unit for example.

    A NAS solution is the direction to go. You can have power-up times set in the device so it powers down overnight and back up in the morning if you wish. NAS doesn't mean internet. A NAS device can serve data to the internet if you want, but a NAS device is installed behind the router, so your internet speed has no relevance to the internal network.
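
    As a rough illustration of why the link type matters more than the internet plan, here is a back-of-the-envelope sketch. The 500 GB size, the 80% efficiency factor and the nominal link rates are assumptions chosen for the example, not measurements, and in practice the drives themselves are often the bottleneck on the faster links.

```python
def transfer_hours(size_gb, link_mbps, efficiency=0.8):
    """Estimate hours needed to move size_gb gigabytes over a link.

    link_mbps is the nominal link rate in megabits per second.
    efficiency is an assumed fudge factor for protocol overhead.
    """
    size_megabits = size_gb * 1000 * 8  # decimal gigabytes -> megabits
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 3600


# Nominal rates only; all three figures are illustrative assumptions.
links = {
    "20 Mbps internet connection": 20,
    "Gigabit Ethernet LAN": 1000,
    "USB 3.0 (nominal 5 Gbps)": 5000,
}

for name, mbps in links.items():
    print(f"{name}: about {transfer_hours(500, mbps):.1f} h for 500 GB")
```

    A NAS on the home network is limited by the local Ethernet or Wi-Fi link, not by the 20 Mbps internet connection.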
     
  10. sportsfrk214 thread starter macrumors 6502a

    Joined:
    Sep 18, 2007
    #10
    Thanks for all the great info everyone, I definitely learned a lot that I didn't know! Much thanks :cool: I'll probably start off with Time Capsule in some function since it's designed to work with Apple products.
     
  11. Mikael H, Apr 24, 2015
    Last edited: Apr 24, 2015

    Mikael H macrumors 6502

    Joined:
    Sep 3, 2014
    #11
    Sorry, but I need to correct you on a couple of points:
    First of all: OK, I'll grant you that if you compare a 2-drive RAID1 and a 3-drive RAID5, then they can each lose one disk. If you build anything larger, though (which effectively turns the RAID1 into a RAID10), then a RAID5 by definition still can only afford to lose one drive no matter how large you make the disk set, while the RAID1 set will be able to potentially lose up to 50% of its drives with no data loss.
    And remember that the smallest possible RAID5 set has 3 disks that can potentially break compared to the 2 disks of a RAID1, but the RAID5 can still only afford to lose 1 drive.
    Ergo: RAID1/RAID10 is safer than RAID5. Much safer.

    Second, you're dead wrong in claiming that RAID5 also is faster than RAID1 (and RAID10). Due to the parity stripe, RAID5 has a 4 I/O operations penalty (read data, read parity, write data, write parity) compared to the 2 I/O operations penalty of a RAID1 (write data, write copy of data). Add to this the potential overhead on the CPU for calculating the RAID5 parity if you don't have a dedicated controller card for the drive solution of your choice.
    As soon as you've saturated your cache, a RAID5 set will lose in I/O capacity compared to a RAID10 set with an equal number of disks.

    The only reason for using RAID5 over RAID1 or RAID10 is because of the relatively low cost of purchase: You only lose the equivalent of one drive worth of storage space when running RAID5, while RAID10 will cost you 50% of your disk capacity in redundancy data.
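
    The capacity trade-off can be put into a few lines. This is a minimal sketch assuming equal-sized drives and the textbook definitions of the two levels; the function names are made up for the example.

```python
def usable_drives(level, n):
    """Drives' worth of usable space in an n-drive set of the given level."""
    if level == "RAID5":
        if n < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return n - 1          # one drive's worth goes to parity
    if level == "RAID10":
        if n < 4 or n % 2:
            raise ValueError("RAID10 needs an even number of drives, at least 4")
        return n // 2         # half the drives hold mirror copies
    raise ValueError(f"unsupported level: {level}")


def max_survivable_failures(level, n):
    """Best-case number of failed drives the set can survive."""
    usable_drives(level, n)   # reuse the validation above
    return 1 if level == "RAID5" else n // 2


for n in (4, 8):
    for level in ("RAID5", "RAID10"):
        print(level, n, "drives:",
              usable_drives(level, n), "usable,",
              "survives up to", max_survivable_failures(level, n), "failure(s)")
```

    Note that the RAID10 figure is the best case (one failure per mirror pair); the guaranteed tolerance is still a single drive.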

    http://www.miracleas.com/BAARF/
     
  12. AFEPPL, Apr 24, 2015
    Last edited: Apr 24, 2015

    AFEPPL macrumors 68020

    Joined:
    Sep 30, 2014
    Location:
    England
    #12
    I'm sorry but you are completely off base, so i will correct your corrections. You need to be careful what you read on the internet it's not always correct or correctly interpreted....!

    Firstly RAID 10, or 1+0 as it should be correctly stated is a stripe of mirrors
    Size/number of disks has no relevance in terms of changing between 1 and 10, they are NOT the same thing and never have been the same. RAID 1 is not striped across disks, it's just mirrored blocks over a pair of disks!!!

    I don't claim RAID 5 is faster "due to a parity stripe", it's faster due to "data BEING striped" over more disks, so more spindles are serving the data (potentially). You are making something really simple complex, and adding things i didn't say or imply.

    You are making huge rookie assumptions with RAID 1. The key point to remember is that RAID 1 does NOT stripe data over all the disks in the set. It's a one to one mirror between a PAIR of disks only. The fault tolerance should be stated as n-1 for RAID 1. So if you have a RAID 1, 4 disk RAID set, it's two pairs or two distinct mirrors. Disk 1 and 2 are a pair, as are 3 and 4 (can be any mix of the drives stated, 1:3, 1:4 etc etc). In the first mirror, if disk 1 fails, you are cool, you have the other half of the mirror. If disk 2 was then to fail too you are screwed - it's that simple, you can't "lose 2 disks". You need to understand how data is striped, or not, as is the case. It's true you "could" lose 2 disks in a 4 disk mirror set, but it would have to be two disks from different mirror pairs (or RAID groups if you want the correct term - 4 disks would be 1+1 and 1+1, i.e. 2 groups).

    RAID 1 is better for some profiles or workloads and RAID 5 is better for others; we were not given that info, and i made a huge generalisation. In terms of performance, is the requirement large datafiles or small ones? Are the reads/writes random or sequential? In "general" RAID 5 would be favoured for DBs etc, i.e. heavy reads (not sure how i put "not" in my previous post - a case of not reading what i was writing, or thinking ahead and of other things at the same time); that was wrong, my bad and not what i meant to say.

    Stated simply, RAID 5 is better for heavy read operations. In general RAID 1 would be better for heavy write operations, but with cache on the controllers, all bets are off depending on how and what it's set to and the type of data. Cache on write for example would help with Raid 5 and some workloads, but it won't change the fundamental differences. The other point i was making with RAID, and it's not a written rule, more of a rule of thumb, is that for a given RAID set performance will increase by 50% if you double the number of disks within the set, but a mirror is still a mirror regardless!

    Don't go to I/O operations as that could easily expand to much more complex things. The data for the parity is calculated at write, you would need to know the number of disks in the set first, what's the data size? What's the block size? Unless you know that you can't even start to think about the number of write operations (it's linked to the number of disks for anything other than RAID 1). Is the cache set to Write? or Read ahead? To write data it DOES NOT matter if it's "parity" or "real data" for want of a phrase (it's a block of data at a given size). Parity is calculated on the fly and written as an "additional operation", so at the highest possible level you are on the right road (hence why writes are not as fast generally). Turn to software Raid and everything changes again as you can't offload operations to the storage controller.

    The only reason to use "anything" is when you understand what you are wanting to achieve. i "could" use my TV to heat a room for example, it would/could work, maybe not the best choice however. Or i could use my R8 as a school bus - might be limited on the number of passengers i can carry... it would work however.

    There is "no" only use Raid x for this or that. RAID 5 provides fast reads and better space efficiency while providing fault tolerance. Yes, you need a min number of drives for each of the RAID implementations, and not all are the same: RAID 0 - 2, RAID 1 - 2, RAID 5 - 3, RAID 5E - 4, RAID 1+0 (10) - 4. RAID 5(E) with a spare can survive a double disk failure; you lose the capacity of another disk, however.

    The levels of RAID exist for different reasons, so have their own penalties.
    RAID 1+0 provides great write and read performance; the downside is you need twice the number of disks. RAID 0 has no protection, but great space efficiency and performance as it's striped over multiple disks. RAID 1, compared with RAID 10, has no striping and lacks raw performance as data is being serviced by fewer spindles. RAID 5 is a halfway house: great reads due to striping, not so great writes due to parity.

    This might be an easy way to view the reality of RAID levels. Let's take the same 4 disks, each of which can support 69 IOP/s (typical for SATA 7200RPM writing random 64K blocks), and a read:write profile of 50%:50%. This is what the performance profile would look like.

    RAID 0 (Stripe set)
    Number of drives per RAID group = 4
    Single RAID group performance = 276 IOP/s

    RAID 1 (Mirror)
    Total number of drives = 4 (2)
    Single RAID group performance = 92 IOP/s

    RAID 5 (Stripe set with parity)
    Total number of drives = 4
    Single RAID group performance = 110 IOP/s

    RAID 10 (Striped mirrors)
    Total number of drives = 4
    Single RAID group performance = 184 IOP/s

    If you/we change the read write mix for the workload, this is where you see the numbers flip and WHY you need to know what the purpose is of the RAID set.

    So 80:20 reads?

    RAID 1 (Mirror)
    Total number of drives = 4 (2)
    Single RAID group performance = 115 IOP/s

    RAID 5 (Stripe set with parity)
    Total number of drives = 4
    Single RAID group performance = 173 IOP/s

    Change the number of disks in the RAID set and let's see what happens?

    RAID 1 (Mirror)
    Total number of drives = 8 (2)
    Single RAID group performance = 115 IOP/s

    RAID 5 (Stripe set with parity)
    Total number of drives = 8
    Single RAID group performance = 345 IOP/s

    So 20:80 reads?

    RAID 1 (Mirror)
    Total number of drives = 4 (2)
    Single RAID group performance = 77 IOP/s

    RAID 5 (Stripe set with parity)
    Total number of drives = 4
    Single RAID group performance = 81 IOP/s
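
    The figures above are consistent with a common rule-of-thumb formula: raw IOP/s divided by (read share + write share x write penalty), with a write penalty of 1 for RAID 0, 2 for RAID 1 and RAID 10, and 4 for RAID 5. That this is how the numbers were derived is an assumption, as the post does not state its method, and the RAID 1 rows only match when a single 2-drive mirror group is counted. A minimal sketch under those assumptions:

```python
# Assumed back-end I/O operations per host write for each level.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4}


def functional_iops(level, drives, read_pct, per_drive_iops=69):
    """Rule-of-thumb host-visible IOP/s for one RAID group.

    read_pct is the read share of the workload as an integer percentage.
    Integer percentages keep the arithmetic exact before rounding.
    """
    write_pct = 100 - read_pct
    raw = drives * per_drive_iops
    value = raw * 100 / (read_pct + write_pct * WRITE_PENALTY[level])
    return int(value + 0.5)  # round half up


# (level, drives counted in the group, read percentage)
cases = [
    ("RAID0", 4, 50), ("RAID1", 2, 50), ("RAID5", 4, 50), ("RAID10", 4, 50),
    ("RAID1", 2, 80), ("RAID5", 4, 80), ("RAID5", 8, 80),
    ("RAID1", 2, 20), ("RAID5", 4, 20),
]
for level, drives, read_pct in cases:
    print(f"{level:6} {drives} drives, {read_pct}% reads: "
          f"{functional_iops(level, drives, read_pct)} IOP/s")
```

    The model ignores controller cache, sequential versus random access and full-stripe writes, so it is a planning estimate rather than a benchmark.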
     
  13. AFEPPL macrumors 68020

    Joined:
    Sep 30, 2014
    Location:
    England
    #13
    Most NAS drives have Time Machine support built in.
    Take a look at the Synology 215 or 415

    https://www.synology.com/en-uk/
     
  14. Mikael H, Apr 25, 2015
    Last edited: Apr 25, 2015

    Mikael H macrumors 6502

    Joined:
    Sep 3, 2014
    #14
    See? I knew that you either didn't grasp what you were talking about or were oversimplifying in the post I first replied to. I'm glad we got that out of the way and that it turned out to be the latter. Now we're pretty much agreed. :)
    (And no, you shouldn't believe everything you read on the Internet, of course, but I found the page pretty funny and semi-relevant to the discussion.)

    Again: The only use case I would agree with you on regarding RAID5 would be large sequential I/O operations with a focus on reads, in a situation where RAID0 isn't an option for some reason or another.
    I would argue that most database- and file server-like traffic can be said to consist of relatively small random reads and writes which cannot be guaranteed to overlap exactly with the stripe size and -count, which in turn is problematic for a RAID5 set (this is where the full I/O penalty comes into play).

    But if you create multiple RAID1 sets (that is: use two or more pairs of drives in RAID1 configuration), then the data is striped over the sets unless your controller (or software) is stupid - and that is what is generally called RAID10.

    If write operations are a substantial part of what you do on your RAID set, RAID10 generally kills RAID5 in performance for a comparable number of member disks. My benchmarks at work told me that I'd need almost twice the number of disks in a RAID5 set to reach the kind of I/O capacity a RAID10 set gave me on a typical workload for our company, and then latency on individual transactions would still be "horrible", relatively speaking.
    Again: RAID5 is OK for large, sequential writes (preferably in chunks that are easily divisible into stripe size and -number), and is outright better than RAID10 for sequential reads. For pretty much any other situation, RAID10 should be used if you can afford it.

    Concretely speaking, I use RAID10 sets for typical file servers, database servers, etc, where I can expect multiple concurrent I/O operations and less than ordered write operations.
    RAID5 is mostly used by me in situations where price is a bigger issue than performance and data safety, or where the situation plays to its advantage (for example a backup-to-disk solution that, you guessed it, outputs its data sequentially in nicely ordered 4GB archives to multiple on- and off-site storage solutions.)


    When it comes to data safety: It's very simple maths. For a RAID5 of n drives to fail, it's enough that 2 drives die for you. For a RAID10 of n drives to fail, you would need two drives in the same pair to die. Already in the smallest possible RAID10 (4 drives) that should be statistically less likely to happen than the failure of any two drives in a RAID5. Ever had a bad batch of disks delivered? I have...
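
    The "simple maths" can be checked by brute force. This sketch counts, for every possible pair of failed drives, whether the array is lost. It assumes every two-drive combination is equally likely, which ignores correlated failures such as a bad batch of disks, and it assumes drives (0,1), (2,3), ... form the RAID10 mirror pairs; the function name is made up for the example.

```python
from itertools import combinations


def fatal_pair_fraction(level, n):
    """Fraction of all two-drive failure combinations that lose the array."""
    pairs = list(combinations(range(n), 2))
    if level == "RAID5":
        # Single parity cannot cover two missing drives, whichever they are.
        fatal = len(pairs)
    elif level == "RAID10":
        # Fatal only when both failed drives belong to the same mirror pair.
        fatal = sum(1 for a, b in pairs if a // 2 == b // 2)
    else:
        raise ValueError(f"unsupported level: {level}")
    return fatal / len(pairs)


for n in (4, 8):
    print(f"{n} drives: RAID5 {fatal_pair_fraction('RAID5', n):.2f}, "
          f"RAID10 {fatal_pair_fraction('RAID10', n):.2f}")
```

    Under these assumptions, a double failure is always fatal for RAID5, while for a 4-drive RAID10 only 2 of the 6 possible pairs are fatal, and the fraction shrinks as pairs are added.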
     
  15. haddy macrumors regular

    Joined:
    Nov 5, 2012
    #15
    Guys... I have 2 x Synology DiskStations....DS215j (10TB) and DS414 (16TB) on my LAN.

    Both are RAID 0 ....... I want the fastest throughput. Not interested in data mirroring or recovery.

    From RAID Wikipedia:
    "Correlated failures
    In practice, the drives are often the same age (with similar wear) and subject to the same environment. Since many drive failures are due to mechanical issues (which are more likely on older drives), this violates the assumptions of independent, identical rate of failure amongst drives; failures are in fact statistically correlated.[11] In practice, the chances for a second failure before the first has been recovered (causing data loss) are higher than the chances for random failures. In a study of about 100,000 drives, the probability of two drives in the same cluster failing within one hour was four times larger than predicted by the exponential statistical distribution—which characterizes processes in which events occur continuously and independently at a constant average rate. The probability of two failures in the same 10-hour period was twice as large as predicted by an exponential distribution.[64]"
     
