I heard that an Apple engineer stated they had done some firmware tricks which may or may not affect DIY (con)fusion drive setups. Most people seem to be saying this works at the block level but I can't help wondering if they wouldn't have had more reasons to go for a file level solution. If anyone can expand on these points or share their experiences with target disk mode it would certainly be great to know more.
A file level solution is actually not possible without major changes in the operating system.
Let's say you have a file containing 1000 blocks of data. The operating system obviously records where the file is, say "blocks 13,487,934 to 13,488,933". That would be the easiest case, where the file is stored in consecutive blocks. But most of the time that doesn't happen, so the operating system might know that the file is at "600 blocks from 13,487,934 to 13,488,533, plus 400 blocks from 9,123,492 to 9,123,891". A lower level then knows about the partitions of your hard drive. If that whole filesystem is on the second partition of your hard drive, for example, that lower level knows the partition starts at block 300,000,000 and adds 300,000,000 to any block number that it is told to read or write for that partition.
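To make the layering concrete, here is a minimal sketch of that translation step. The extent list and the partition start block are the illustrative numbers from above, not anything read from a real disk:

```python
# Hypothetical sketch: a partition layer translating filesystem-relative
# block numbers into absolute disk blocks. All values are illustrative.

PARTITION_START = 300_000_000  # the second partition begins at this disk block

# The filesystem records the file as a list of (first_block, block_count)
# extents, relative to the start of the partition.
file_extents = [(13_487_934, 600), (9_123_492, 400)]

def to_disk_block(fs_block):
    """Translate a filesystem-relative block number to an absolute disk block."""
    return PARTITION_START + fs_block

# The filesystem asks for its blocks by extent; the partition layer just
# adds the offset and passes the request down.
for first, count in file_extents:
    print(f"fs blocks {first}..{first + count - 1} -> "
          f"disk blocks {to_disk_block(first)}..{to_disk_block(first + count - 1)}")
```

The filesystem never sees the absolute numbers; that is exactly the kind of seam where another translation layer can be slipped in.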
Now comes Fusion: it combines two volumes into one partition at that lower level. So when the lower software level is told to read or write data by block number, it checks where that block lives and reads from or writes to either the SSD or the hard drive accordingly. To the upper levels of the file system, this is entirely invisible. The upper level of the file system doesn't know, and need not know, about any of this for everything to work correctly. Now if a file uses more than one range of blocks, some of those ranges can be on the SSD and some on the HD, and it doesn't matter at all. It is even possible for a single range of blocks to be split down the middle between SSD and HD, and that works just fine too.
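A rough sketch of such a block-level mapping layer, assuming an invented mapping table (the extents, sizes, and device names are made up for illustration; Apple's actual Core Storage structures are not public):

```python
# Hypothetical sketch: a tiering layer that maps logical block ranges to
# physical devices. The filesystem above only ever sees logical blocks.

# Each entry: (logical_start, length, device, device_start)
mapping = [
    (0,         1_000_000, "ssd", 0),   # first million logical blocks on the SSD
    (1_000_000, 9_000_000, "hd",  0),   # the rest on the hard drive
]

def resolve(logical_block):
    """Find which device, and which device-local block, holds a logical block."""
    for start, length, device, dev_start in mapping:
        if start <= logical_block < start + length:
            return device, dev_start + (logical_block - start)
    raise ValueError("block not mapped")

# A single file extent can straddle the boundary: the filesystem reads
# logical blocks 999_998..1_000_001 and never notices the device change.
for blk in range(999_998, 1_000_002):
    device, dev_block = resolve(blk)
    print(f"logical {blk} -> {device} block {dev_block}")
```

The point of the sketch is only that the split happens below the filesystem: the caller passes logical block numbers in and never learns which device answered.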
The other thing that Fusion does is decide which kind of storage is better for a file, or for part of a file. In effect it says "I've been reading the first ten blocks of this file quite a lot, so those blocks should be on the SSD", then moves them and records where they were moved.
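That "move the hot blocks" idea can be sketched as a simple read counter per block. Everything here (the threshold, the data structures, the starting tier) is an assumption for illustration; the real promotion policy is opaque:

```python
# Hypothetical sketch: heat-based promotion. Count reads per logical block
# and retag frequently read blocks as SSD-resident. The actual data copy
# between devices is omitted; only the bookkeeping is shown.

from collections import Counter

read_counts = Counter()
location = {}            # logical block -> "ssd" or "hd"
PROMOTE_AFTER = 3        # invented threshold: promote after this many reads

def read_block(block):
    location.setdefault(block, "hd")   # in this sketch, data starts on the HD
    read_counts[block] += 1
    if read_counts[block] >= PROMOTE_AFTER and location[block] == "hd":
        # copy the block to the SSD, then update the map so future
        # reads of this block go to the SSD
        location[block] = "ssd"
    return location[block]

# The first ten blocks of a file get read repeatedly and end up on the SSD.
for _ in range(3):
    for block in range(10):
        read_block(block)

print({b: location[b] for b in range(10)})
```

Because only the mapping table is updated, the filesystem above keeps using the same logical block numbers before and after the move.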
Doing this at the file level only would be difficult. Let's say I have a few big files on the SSD. Then I add more data to each of them until the SSD is absolutely full, and then I add some more. With Fusion, no problem: there is plenty of free space on the HD, so that data goes to the HD (Fusion may rearrange things later, but for the moment it is fine there). If Fusion worked exclusively at the file level, this wouldn't work. You might have a 10 GB file, add one block, and Fusion would have to copy the whole 10 GB over to the hard drive right then. Instead, it just puts a few blocks in a different place, which is something operating systems have been doing just fine for the last twenty-plus years.
----------
That is an awesome link. Thank you. The Xbench results of the Fusion Drive are very impressive, about the exact same as my Samsung 830 128 GB in my 2011 mini, so there is no drop off in performance. I'm definitely going to do the Fusion Drive mod on both my 2011 mini and my 2012 MBP.
Just saying: benchmarks for this kind of thing are really, really difficult to do properly. Benchmarks usually repeat the same test over and over, and then of course Fusion will look very good (but then users often do the same things over and over too, so Fusion works well for them). Or the benchmark writer notices this and says "well, the benchmark looked good, but that was because I only read and wrote 10 GB of data, so I'll do the same thing with 200 GB". The benchmark slows down, but perhaps only because it is now doing things that _you_ don't do.