Max .tar/.tgz size?

Discussion in 'macOS' started by splitpea, May 31, 2010.

  1. splitpea macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #1
    In order to move several dozen gigabytes of files between systems, while preserving permissions, I'm considering dumping the lot into a tarball, transferring that, and then chowning the contents after extraction.

    Is there a limit to the size of a .tar or .tgz that OS X can create or extract? The older machine is running 10.4.11, if that makes a difference.

    Thanks!
     
  2. lee1210 macrumors 68040

    lee1210

    Joined:
    Jan 10, 2005
    Location:
    Dallas, TX
    #2
    I don't know the answer off the top of my head, but I'd suggest that if both systems are Macs, a DMG might be easier. I know you can certainly create huge ones... gzip might be better for compression than what Apple provides, if compression is a must. I'd say go uncompressed and hook up via FireWire Target Disk Mode. At that point, I might skip the container altogether, though.

    -Lee
     
  3. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #3
    Agreed, a compressed DMG or sparseimage created through Disk Utility. Also keep in mind that if you are going to put this on a FAT32-formatted disk, you cannot have a single file larger than 4GB.
     
  4. splitpea thread starter macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #4
    I didn't think copying off a target / external / imaged disk preserved permissions?

    Both systems are Macs, formatted HFS+. One is running 10.4.11 and the other 10.6.x.

    Another option I'm now considering after some research is piping a tarball over SSH... dunno how useful that is, or if it just creates the complete tarball, transfers it, and then extracts it?
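
    For reference, the streaming variant usually looks something like the sketch below (the hostname and paths are placeholders, not anything from this thread). Because both ends operate on a pipe, no complete tarball is ever written to disk on either machine, so archive size limits never come into play:

```shell
#!/bin/sh
# Sketch of the "pipe a tarball over SSH" idea. tar writes the archive to
# stdout, ssh forwards the stream, and the remote tar extracts as the data
# arrives -- the complete tarball never exists on either disk.
# "user@newmac.local" and the paths are placeholder examples:
#
#   tar -czpf - -C /Users/me projects | ssh user@newmac.local 'tar -xzpf - -C /Users/me'
#
# The same streaming mechanics, demonstrated locally without ssh:
src=$(mktemp -d); dst=$(mktemp -d)
echo hello > "$src/f.txt"
chmod 770 "$src/f.txt"                           # non-default mode to verify
tar -czpf - -C "$src" . | tar -xzpf - -C "$dst"  # -p preserves the mode bits
ls -l "$dst/f.txt"
```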
     
  5. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #5
    disk images DO preserve permissions. A disk image formatted HFS+ is the same as a physical disk formatted HFS+ in every important way once mounted.
     
  6. splitpea thread starter macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #6
    But when I drag the files off the disk image and onto the hard drive, don't the permissions reset?
     
  7. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #7
    So you're talking about chown'ing and piping over SSH, but you're not going to use something like ditto -rsrc to copy the files?
     
  8. Sydde macrumors 68020

    Sydde

    Joined:
    Aug 17, 2009
    #8
    As you know, tar stands for "Tape ARchive" (or is it "TApe aRchive"?), so its format is designed to lay out flat: it is really only limited by the file system. However, it only preserves rwx-style permissions, not ACL metadata permissions (10.4 primarily relies on rwx, IIRC, but 10.4.11 might have some ACL action going on). If you really need compression, you might also investigate xar, which preserves almost as much metadata as a DMG.
     
  9. splitpea thread starter macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #9
    Well, now that you've alerted me to its existence, maybe I will! My command-line knowledge is a little spotty: enough to get around in most situations, but also enough to hang myself by, so to speak -- and a little bit Linux-centric.

    The reason I'd go for compression is simply that it might be a heck of a lot faster to compress 60GB to 10GB, transfer it, then decompress it, than to transfer it directly.

    By ACL permissions, do you mean owner & group, or something else?
     
  10. lee1210 macrumors 68040

    lee1210

    Joined:
    Jan 10, 2005
    Location:
    Dallas, TX
    #10
    Are you positive? Firewire Target Disk Mode is pretty fast. This is a rough estimate, but at 1-2 min. / GB you're talking a difference of 50-100 minutes of transfer time. Most compression is going to be single-threaded, so you're going to have one core trying to compress (and subsequently decompress) 60GB of data. Just decompression will probably take at least half an hour.

    Just playing devil's advocate, if you have a fast connection between the machines, compression is probably more trouble than it's worth.

    -Lee
     
  11. Sydde macrumors 68020

    Sydde

    Joined:
    Aug 17, 2009
    #11
    Leopard+ uses a dual-layer permission scheme. If there are only the classic owner/group/other permissions, it uses those, otherwise it uses an Access Control List meta-data scheme that allows specific permissions for selected users or groups (look at permissions under Finder info in 10.6, you will see that they may not match what you see in terminal). These are stored in meta-tags, which Tiger may or may not implement, I cannot remember. tgz does not preserve these, but dmg (which can also have compression) does. The meta-tags also store some spotlight info, and some newer applications write to them.
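
    A quick way to see both layers from Terminal is sketched below. The ACL lines are macOS-specific and assume a user named "alice" (a placeholder), so they are shown commented out:

```shell
#!/bin/sh
# The classic layer is the familiar mode bits:
f=$(mktemp)
chmod 644 "$f"      # owner rw, group r, other r
ls -l "$f"          # prints something like -rw-r--r--
# The ACL layer sits on top of those bits. On macOS it is managed with
# chmod's +a/-a flags and listed with ls -le ("alice" is a placeholder):
#
#   chmod +a "alice allow read" "$f"
#   ls -le "$f"     # lists ACL entries underneath the mode bits
```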
     
  12. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #12
    A 6:1 compression ratio is insane. A lot of people don't realize this, but files like photo/video/audio files are almost ALWAYS compressed formats already. You can't take 100 JPG files and "compress" them in a zip file and expect any appreciable savings.

    I somehow doubt you have 60GB of flat .txt files or something you need to transfer.
     
  13. BertyBoy macrumors 6502

    Joined:
    Feb 1, 2009
    #13
    You're only limited by the size of the tape / disk you put the file onto.

    They start to get slow to search (obviously) when they get that large. I've used them with Oracle backups up to about 300GB before, on real Unix of course, not Mac.
     
  14. splitpea thread starter macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #14
    Well, I currently have neither a FW800 cable, nor a FW400 port on the second machine. So it's USB2 or ethernet, neither of which exactly screams (or 802.11g, but even I'm not quite stupid enough to try to do this over wifi).

    I assume I can afford to lose and re-generate the Spotlight info? Dual sets of permissions sound crazy... I can't even begin to think of all the weird ways things could go wrong...

    Is this additional metadata something that I'll miss on the other end? Most of what I'm concerned about permissions for are UNIXy things like running a web server; the rest of the files are just run-of-the-mill working files that need owner read/write permissions (r/w/x for directories) plus whatever permissions the system needs for Spotlight indexing and such.

    Yes, I realize that most media is already compressed, but a lot of the files I'm transferring are PSDs, TIFFs, MS Word docs, and PDFs, each of which is capable of further compression, plus a zillion random itty bitty text files and such.

    Anyway, if compression doesn't make sense I'll skip the compression. Just need to make sure I can get the data transferred and the necessary metadata (whatever that may be) preserved.
     
  15. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #15
    Ok, I give up. Did you try it and it failed miserably? Or is this some kind of nerd riddle for you?

    You haven't given any reason why permissions on these files are so important. User data (TIFFs, PDFs, DOCs, etc.) usually doesn't have any special permissions on it, and it is OK to adopt the new user name and default permissions. Mac OS does this for you for a reason. A .txt file sitting on my desktop is owned by me and has 644 permissions. If I put it on a disk, gave it to my friend, and they copied it to their desktop off the disk and Mac OS DIDN'T change the permissions, then they wouldn't be able to edit the file. Why are you looking for a way around this?

    Just move the files to the other computer and be done with it. I'm surprised nobody has brought up hidden files or encryption yet.

    Why is this in <Mac Programming>? :confused::confused::confused:
     
  16. splitpea thread starter macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #16
    Don't know what I did to piss you off, but believe it or not, that wasn't my intent.

    I'm trying to get things right the first time instead of spending hours waiting for the data to copy only to find out that it's borked. Is that so unreasonable?

    Yes, for many of those files it's fine. I have others that need to be executable (scripts), or group-writable, or not world-readable (files served via Apache and/or configured to match a remote hosting setup), etc.

    Honestly, each time I think I have an answer, someone comes up with another caveat. I don't like playing with my data when there are variables I don't understand. And yes, I have hidden files that need migrating. They'll get moved.

    Because there's no dedicated forum for command-line/UNIX stuff, and I figured the programming gurus would be more likely to have an answer to the original question than the general "Mac OS X" forum crowd would.

    Sorry to waste your time.
     
  17. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #17
    The thread is a day old, you could have copied 100GB over wifi by now.

    The frustrating thing is that you keep bringing up contradictions.
    #1 contradiction, you are interested in a theoretical limit to how big a tar file can be. Which apparently has nothing to do with what you are asking.
    #2 contradiction, you say you are thinking about transferring it through SSH (vs. AFP, presumably?), then you talk about how ethernet and USB2 are too slow. Which is especially odd because A) USB cannot be used like FireWire to attach a computer as a hard drive, which means B) you have a hard drive suitable for copying your files but are seemingly unwilling to use it to copy the files.

    There are two commands that spring to mind.
    dd (which does a block-for-block copy and is presumably unnecessary for this)
    ditto -rsrc -V (which does a hierarchical copy preserving permissions, hidden files, etc.). It also fails gracefully: if you've ever had the pleasure of trying to copy data off a dying hard drive (Finder crashing, or the copy mysteriously failing), ditto is the best you can do, since when it gets an I/O error on a file it moves on to the next one.

    I always use sudo in combo with ditto, especially if you're moving things from system owned folders, or have weird permissions.


    So assuming you have a USB disk, or a disk image mounted named DISK
    sudo ditto -rsrc -V /Users /Volumes/DISK/Users

    No "playing around with your data"; everything in your Users folder would be copied. Repeat for each super important folder, like maybe Applications. And then just reverse the process on the new computer. In a way, putting it into a zip file, or something analogous to it, is more "playing" with your data, since an error in the archive has a much greater chance of corrupting all the data, not just a single file.
     
  18. splitpea thread starter macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #18
    Sorry, I started off this whole thing not even certain where to start. When I googled about copying files and preserving permissions, tarballs were by far the dominant result, but I couldn't find any good info on filesize limits.

    The piping-over-SSH option was a workaround in case the max file size was fairly small -- it appeared to essentially turn the whole thing into a stream that is decompressed almost as quickly as it is compressed, so the file never reaches the max size. However, I wasn't entirely certain how that worked.

    The bit about the slow transfer speeds was an explanation of why I was using compression. If there was a max file size, it didn't seem like transferring using a hard drive would be an option unless I had a way to transfer file-by-file, which wasn't certain at that point.

    Thank you, that is very helpful. I will probably use either that or a DMG, which (after some further investigation) it appears can be "restored" directly into the root of another drive using Disk Utility.
     
  19. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #19
    How can that be an option? I assumed you don't want to/can't overwrite the OS of the newer computer.

    Can you remove the HDD from either computer and put it in either the other computer or a drive enclosure? At that point you could use disk utility (preferably booted off an install DVD) to restore directly from one hdd to the other. Never making an "image".

    Next, in theory you could use the Leopard or SnowLeopard install dvd to "upgrade" the OS that you just duplicated.

    You see what I'm doing? Guessing what the hell is actually going on because you're not providing clear directives.
     
  20. splitpea thread starter macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #20
    You're right, I haven't been especially clear, and I apologize. I hate it too when people ask for help without being clear about what they're trying to accomplish. I was originally just trying to get the information needed to attempt one solution, and it morphed into an exploration -- without a clear direction -- of other possible solutions. Thank you for being helpful and patient.

    To start from the beginning: I'm trying to migrate my data from an old and not-100%-stable G5 running Tiger to a fresh install of Snow Leopard on a new MBP. Everything other than the data I'm planning to simply reinstall / reconfigure for two reasons: a) the platform change means much of it won't migrate at all, and b) if the instability has anything to do with accumulated corrupted plists, incompatible configurations, etc, I don't want to bring it with me.

    I work in web development, so I have a bunch of items (the scripts, Apache-hosted files, etc. mentioned above) that need to be migrated with permissions intact.

    And it needs to all get done in a single evening or weekend because if I migrate some of the data but not all, or migrate the files but don't have time to move over my databases and reconfigure Apache (possibly reinstalling it and PHP and MySQL and Python and a few other things from MacPorts), and a bunch of other annoying not-so-instantaneous tasks that will be necessary in order to have a usable working environment, I'll have to keep working on the old machine, which gets the two out of sync and basically means finding a way to re-migrate or somehow re-sync everything all over again.

    Here's what I have on hand to work with:
    - G5 tower running 10.4.11 with FW400, USB2, 100BaseT ethernet, 802.11g-compatible wifi card
    - MBP running latest 10.6.x with FW800, USB2, 1000BaseT ethernet, 802.11n compatible
    - two USB2 external hard drives
    - one USB2/FW400 external hard drive
    - various flash drives and flash cards (max capacity 4GB)
    - FW400 cable; Cat5e cables
    - 802.11g + 100BaseT router, 100BaseT switch

    Well, you can make a DMG of just a directory, and it turns out that you can "restore" a DMG to the root of another disk without erasing the destination disk. If all the contents of the DMG are contained in a single subdirectory, one could then just move that subdirectory to its correct place in the tree.

    Probably not going to work so well since the source system is PPC and the destination is Intel. Plus I REALLY just want to migrate the data in this case.
     
  21. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    Location:
    Brobdingnag
    #21
    Go to monoprice.com. Buy a fw400-800 cable for about $5 (plus shipping) for a 6 ft cable. Connect computers in Firewire target disk mode. Duplicate files.

    http://www.monoprice.com/ -- home page

    http://www.monoprice.com/products/subdepartment.asp?c_id=103&cp_id=10301&cs_id=1030105 -- 9-to-6 FW cables

    And yes, it's as simple as plugging the cable in. I have several FW800 and FW400 devices here, and I connect them to FW800 and FW400 computers with zero problems simply by plugging them in.


    For file-duplication, you may want to look at apps like Carbon Copy Cloner or SuperDuper. They are designed to make perfect replicas, including xattrs, ACLs, etc. If you aren't a command-line expert on what it takes to make perfect replicas, you should use tools that contain such expertise.

    And if ownership is important, then you WILL need to be running as root, either using sudo and some command like cpio, or by entering the admin password for Carbon Copy Cloner.

    Your earlier terminology is flawed, and conflates two distinct things (ownership and permissions) into one (permissions). Since only root can execute the chown() call, the only way to copy everything is as root. This is all bog standard Unix, from its inception.

    Finally, if you don't do a trial run first, and confirm that everything copies perfectly, you're being foolish. Give it files and dirs with every combination of permissions, owners, groups, flags, resource-forks, xattributes, etc. Since it seems to be more important that the transfer occur perfectly exactly once, any effort expended doing trial runs, preflighting, etc. will be worthwhile.

    And since almost nothing ever really works perfectly the first time, create a recovery plan so if something goes wrong you can postpone the transition and start over.
     
  22. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #22
    Have you not looked into why your G5 is not, as you say, 100% stable?

    Your task seems impossible because it can be rephrased as:
    I want to move everything.
    .. except the things that are making my computer crash.

    With no clear picture of the exact reasons your computer is crashing, you can't know EXACTLY what to move over while keeping permissions and everything.

    I don't know why you believe the architecture change makes it harder to upgrade and install. Apple hasn't released any documentation that says, hey don't do this if you're going from PPC to Intel. I used to be a Mac Genius at an Apple store, and this is exactly the stuff I used to do all the time.

    Three things that should be easy.
    #1 use Migration assistant over Ethernet. (not 100% sure that is possible, but there is no reason not to try)
    #2 using install disks, restore your PPC HDD onto any of your externals big enough to fit it, then do the opposite on your new MBP.
    #3 restore your PPC HDD onto an external and use Migration assistant from the external.

    There are ZERO reasons to assume that any of these choices will result in anything but a perfectly stable computer. And if they don't? Now you've done some troubleshooting and can dump your ~/Library/Preferences folder or something. And none of them threatens the integrity of your data.

    Just so you know, there are FW800-to-FW400 cables that would allow you to connect the two computers via Target Disk Mode.
     
  24. jared_kipe macrumors 68030

    jared_kipe

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #24
    That is correct: it's FW800 on one end of the cable and FW400 on the other. The connection will run at FW400 speed. You're not gaining anything; it's just backwards compatible.
     