
mattspace

macrumors 68040
Original poster
Jun 5, 2013
3,458
3,056
Australia
I'm still new to using APFS and snapshots for Time Machine. I was very much command-line comfortable with tmutil etc., but having just discovered that you can't delete parts of a single backup on an APFS TM disk the way you could on an HFS+ one, I figured I shouldn't trust any of my existing knowledge or assumptions...

If I change the Finder tags on a file, will that count as a new / different file for Time Machine's purposes, and cause a whole copy of that file to be added to the size of the next backup?
 
This is difficult to test with TM, but I have tested with Carbon Copy Cloner which allows the creation of many backup tasks. CCC uses APFS snapshots in the same way as does TM.

I created a new CCC backup task and a new destination APFS volume, and backed up 3 image files to it. The size of the volume was 260.8 MB. I added a tag to a 146 MB file and repeated the backup. The volume (which now has two snapshots) is 260.9 MB. I did a bit more removing and adding of tags, and the size has now increased to 261.1 MB. So adding and removing Finder tags adds only a very small amount to the backup volume.

So I am almost certain that TM does not significantly increase the size of the backup volume when a tag is added.

I have tried to be precise by using APFS language for "volume" and "snapshot". Each backup creates a new snapshot on the volume. I am running macOS 14.5.1.
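
As a rough check of the figures reported above, the growth of the backup volume can be compared with the size of the tagged file. This is only arithmetic on the numbers in the post; the variable names are illustrative.

```python
# Figures reported in the post above (decimal megabytes).
size_before_mb = 260.8   # volume size after the first backup
size_after_mb = 261.1    # volume size after several tag changes
tagged_file_mb = 146.0   # size of the file whose tags were edited

growth_mb = size_after_mb - size_before_mb
growth_vs_file = growth_mb / tagged_file_mb

# Prints: growth: 0.3 MB (0.21% of the tagged file)
print(f"growth: {growth_mb:.1f} MB ({growth_vs_file:.2%} of the tagged file)")
```

A re-copy of the 146 MB file on each backup would have shown up as growth of at least that size, not a fraction of a megabyte.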
 
Thanks for that. It would seem to suggest that it's safe to use tags as a changeable aspect of an unchanging file, without having to worry about blowing out the Time Machine backups.
 
Time Machine on APFS copies only the difference, so if only a few bytes change, it copies only the block that contains those few bytes.
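
To make "only the block that contains those few bytes" concrete, here is a minimal sketch of what a block-level difference looks like. It is not how Time Machine or APFS is implemented; the 4096-byte block size and the `changed_blocks` helper are assumptions for illustration only.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indices of fixed-size blocks whose contents differ."""
    n_blocks = (max(len(old), len(new)) + block_size - 1) // block_size
    changed = []
    for i in range(n_blocks):
        start = i * block_size
        # Slices past the end of the shorter input are empty, so appended
        # data shows up as changed blocks too.
        if old[start:start + block_size] != new[start:start + block_size]:
            changed.append(i)
    return changed


original = bytes(10 * BLOCK_SIZE)        # ten zero-filled blocks
modified = bytearray(original)
modified[5 * BLOCK_SIZE + 17] = 0xFF     # change one byte inside block 5

print(changed_blocks(original, bytes(modified)))  # → [5]
```

In this toy case only one block out of ten would need to be stored again, which is the behaviour the post describes.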
 
So if you make a small change within something like a VM's disk image, is that going to trigger a backup of the whole thing? Or is it just that the tag metadata is attached to the file, rather than being part of the file itself, and that spares it from a full-file backup?
 
Tags are stored as an extended attribute. Extended attributes are stored separately from the file data.
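
A small sketch of how little data a set of tags amounts to. It assumes, as commonly documented, that Finder keeps tags in the extended attribute `com.apple.metadata:_kMDItemUserTags` as a binary property list of "name, newline, colour index" strings; the tag names below are made up, and the sketch only builds such a payload rather than reading one from a real file.

```python
import plistlib

# Hypothetical tag list in the assumed "name\ncolour-index" form.
tags = ["Important\n6", "Project X\n0"]

# Encode the list the way a binary property list would store it.
payload = plistlib.dumps(tags, fmt=plistlib.FMT_BINARY)
print(len(payload), "bytes of tag metadata")

# Round-trip to show the payload decodes back to the same tag list.
decoded = plistlib.loads(payload)
print([t.split("\n")[0] for t in decoded])
```

On a Mac, the real attribute can be inspected with the `xattr` command-line tool. The payload is tens of bytes, which fits the tiny growth seen in the CCC test.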

A "small change" to a VM is likely to be many changes throughout the VM's file system. This will trigger a backup of the file. How much space will be consumed on the backup disk is somewhat uncertain, but likely to be large. As a general statement, virtual machines are not TM backup friendly and should be excluded. Also, your recovery requirements for VMs may well be different to those of the host computer, e.g. for my Linux VMs I only need/want to be able to recover from a backup in the past week, whilst I want to keep host Mac backups for 12 months or more.
 
I have tested with Carbon Copy Cloner
I have also tested Carbon Copy Cloner (Ver. 6) and concluded that it does not re-copy a large file's data when only an extended attribute has changed.

Time Machine on APFS ... copies only the block that contains those few bytes.
(Emphasis mine.) Talking now about file data (not extended attributes), I've heard this, also. Has anyone actually tested this for either TM or CCC?

Say you have a 10GB file, but only a single byte is changed. Does TM for APFS really only copy the single data block (which contains the changed byte) to the backup volume? Keep in mind that APFS clone operations (copy-on-write) only work within a single volume. So I don't see how that addresses this question.
 
(Emphasis mine.) Talking now about file data (not extended attributes), I've heard this, also. Has anyone actually tested this for either TM or CCC?
Yes: https://eclecticlight.co/2021/04/09/time-machine-to-apfs-how-efficient-are-backups/ and the other linked posts.
Say you have a 10GB file, but only a single byte is changed. Does TM for APFS really only copy the single data block (which contains the changed byte) to the backup volume? Keep in mind that APFS clone operations (copy-on-write) only work within a single volume. So I don't see how that addresses this question.
All of the snapshots are on a single volume, so they are eligible for clones.
 
Say you have a 10GB file, but only a single byte is changed. Does TM for APFS really only copy the single data block (which contains the changed byte) to the backup volume?
Thanks for that link -- it's a good read. But it doesn't answer my question, because he only tested with two sparse files and a clone of a sparse file. He concluded that TM to APFS can preserve the sparse and clone nature of the source file onto the destination volume, which is nice.

But I'm wondering about a large "regular" file (not sparse, not a clone of anything). How does TM know which block(s) have changed in the source file (on the source volume) relative to the version on the backup volume? How does TM write just those changes to the backup volume?

I'm sure it has something to do with snapshots and "extents" (I think is the right term) but I just can't work it out completely.
 
But it doesn't answer my question
I have done a test with CCC because it is easier to manage test backup sets and I don't want to mess up my TM backup. I created a couple of 10GB files - one with all zeros and the other with alternating hex 1A. I backed them up with CCC, which consumed about 20GB on the backup volume. I added a few bytes to the first and modified a few bytes of the second, then backed up again - the backup volume is still about 20GB.

There may be particular circumstances and it was CCC not TM, but it does seem that modifying a few bytes does not cause the destination volume to duplicate data. My understanding is that CCC (and TM) do "write" the modified files to the destination volume, but APFS manages the volume using clone files to conserve space. There are frequent new versions of APFS and I suspect that this is an area where the APFS file system is tweaked.

N.B. My "understanding" is no better than yours! We both have to peer into the darkness.
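
For anyone wanting to repeat the test above, this is a minimal sketch of how the two test files could be generated and then modified. It is scaled down to 10 MiB per file, uses a plain repeating 0x1A byte for the second file (the exact pattern in the original test is not stated), and the file names are made up. The backups themselves still have to be run by hand in CCC or TM.

```python
import os
import tempfile

SIZE = 10 * 1024 * 1024   # 10 MiB stand-in for the 10GB files in the post
CHUNK = 1024 * 1024       # write in 1 MiB pieces


def make_file(path: str, fill: bytes, size: int = SIZE) -> None:
    """Write `size` bytes of the repeating single byte `fill` to `path`."""
    with open(path, "wb") as f:
        for _ in range(size // CHUNK):
            f.write(fill * CHUNK)


workdir = tempfile.mkdtemp(prefix="backup-test-")
zeros = os.path.join(workdir, "zeros.bin")
pattern = os.path.join(workdir, "pattern.bin")
make_file(zeros, b"\x00")
make_file(pattern, b"\x1a")

# ... run the first backup here and note the destination's used space ...

with open(zeros, "ab") as f:       # append a few bytes to the first file
    f.write(b"extra")
with open(pattern, "r+b") as f:    # overwrite a few bytes in the second
    f.seek(SIZE // 2)
    f.write(b"edit")

# ... run the second backup and compare the used space again ...
print(os.path.getsize(zeros), os.path.getsize(pattern))
```

The temporary directory is left in place so it can be used as the backup source; delete it when the test is finished.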
 
It would definitely be interesting to see if a 1GB+ Photoshop file generates a second 1GB+ backup when adding an extra layer, for example.
 
I have done a test with CCC because it is easier to manage test backup sets and I don't want to mess up my TM backup. I created a couple of 10GB files - one with all zeros and other alternating hex1A. Backed them up with CCC - consumed about 20GB on the backup volume. Added a few bytes to the first and modified a few bytes of the second. Backup again - the backup volume is still about 20GB.
Thank you very much, @gilby101 !

That's pretty convincing that only the added/modified bytes are added to the backup volume, at least with CCC. Very interesting, and good news! I just wonder how it's done. If only Mr. Bombich had a technical class, video lecture, or document on the details! (Not that I would want him to disclose all his hard-earned secret knowledge to competitors, though. :) )

There may be particular circumstances and it was CCC not TM, but it does seem that modifying a few bytes does not cause the destination volume to duplicate data. My understanding is that CCC (and TM) do "write" the modified files to the destination volume, but APFS manages the volume using clone files to conserve space. There are frequent new versions of APFS and I suspect that this is an area where the APFS file system is tweaked.
It may be as you say that all the data is written, and the APFS filesystem does some magic on the receiving volume. But how could APFS "know" the vast majority of the data is duplicate, I wonder.

I have an alternative guess. In APFS, a file consists of one or more "extents" (I think), which contain the data. If a file is currently in a snapshot and some addition or change is made to the file, the new/changed bytes are written to a new "extent" (I think). (Thus the old version can still be recovered from the snapshot.) So, if that's right, maybe it's not hard for CCC (and probably TM) to detect that the source file and the already-backed-up version are nearly the same (all the other extents are the same), and just copy the new "extent" to the backup volume's version of the file. However, this guess requires that the previous backup's snapshot still exists on the source volume, and I don't think this is really the case.
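
The guess above can be sketched as a toy copy-on-write model. This is not APFS's actual on-disk structure or anything CCC or TM is known to do; the `ToyVolume` class and its methods are hypothetical and exist only to show why a snapshot plus one changed extent does not double the stored data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVolume:
    """Toy copy-on-write store: extents are immutable, files reference them."""
    extents: dict[int, bytes] = field(default_factory=dict)
    files: dict[str, list[int]] = field(default_factory=dict)
    snapshots: list[dict[str, list[int]]] = field(default_factory=list)
    _next_id: int = 0

    def _alloc(self, data: bytes) -> int:
        self._next_id += 1
        self.extents[self._next_id] = data
        return self._next_id

    def write_file(self, name: str, chunks: list[bytes]) -> None:
        self.files[name] = [self._alloc(c) for c in chunks]

    def snapshot(self) -> None:
        # A snapshot copies only the extent references, not the data.
        self.snapshots.append({n: list(ids) for n, ids in self.files.items()})

    def modify(self, name: str, index: int, data: bytes) -> None:
        # Copy-on-write: the old extent stays untouched for the snapshot.
        self.files[name][index] = self._alloc(data)

    def stored_bytes(self) -> int:
        return sum(len(d) for d in self.extents.values())


vol = ToyVolume()
vol.write_file("big.bin", [bytes(1000)] * 10)   # ten 1000-byte extents
vol.snapshot()
vol.modify("big.bin", 3, b"x" * 1000)           # change one extent
vol.snapshot()
print(vol.stored_bytes())  # → 11000, not 20000
```

Both snapshots remain readable: the first still points at the original extent, the second at the new one, and the other nine extents are shared.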

If we knew the count of bytes written during your second backup (for example), we could say which guess is closer to the actual behavior. I might get around to giving that a try...

We both have to peer into the darkness.
Indeed!
 
It would definitely be interesting to see if a 1GB+ Photoshop file generates a second 1GB+ backup when adding an extra layer, for example.
Yes, that sounds like a use case where the feature would make a big difference! I remember with the old TM to HFS+, having entire disk images backed up again after minor changes (using sparsebundles sort of works around that). And just renaming a file or directory would result in a huge backup.

BTW, I recently received an email notice about the release of CCC version 7. One of the new features says:
In the past, when a folder was renamed on the source, the folder would be removed from the destination and recopied. Now CCC's file copier can smartly detect renamed folders and simply rename those folders on the destination.
Nice! And I've long wanted to be able to add notes to specific backups in TM... now CCC allows that:
Tag snapshots as "permanent" so they won't be removed by CCC's thinning and pruning, and at the same time you can add a note to the snapshot to add some context (e.g. "Created prior to upgrading to Sonoma")
(Unfortunately I can't use version 7 because I'm still on Monterey :-( and can't do an Apple-approved upgrade, as Apple says my Late 2015 iMac is too old.)

It sure would be nice if TM got a few new features, too.
 