Why isn't the M1 Ultra twice as fast as the M1 Max? I thought the Ultra was 2 Max chips stuck together.
This is an interesting question, and the answer depends on which aspect of performance you look at.
If you have INDEPENDENT pieces of code running on the CPUs, for example, the Ultra is essentially twice as fast as the Max/Pro (look at eg the Geekbench 5 (GB5) multicore score).
But when the code running on different cores (CPU or GPU) needs to interact (eg one core needs to wait until another core has finished its work) multiple issues kick in.
- The most obvious is that everyone has to wait for the slowest piece of work (you may have 16 tailors, but the suit isn't ready until the slowest tailor working on the most complicated part is finished).
- More technical (but probably more important) is that it's not easy for one core to communicate with another at high speed. There is a lot of protocol overhead (exactly what info needs to be communicated, in what order), and a lot of HW overhead (to get from a GPU core on one chiplet to a GPU core on the other, a transaction has to pass through multiple routers that decide where next to send the transaction, along with delays in buffers that match different voltages or different frequencies between different IP blocks).
The LOCAL connections between the GPU cores also start to get clogged once you have too many cores all trying to talk to each other, and you need to build a "second freeway" to prevent these sorts of traffic jams.
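To put a number on the "slowest tailor" point, here is a minimal Python sketch. It is a toy model, not a measurement of any Apple chip: the function name `parallel_speedup` and the task times are made up for illustration, and it assumes each piece of work runs on its own core with no communication cost at all.

```python
def parallel_speedup(task_times):
    """Speedup over a single core when every task runs on its own core
    and the job is only done once ALL tasks have finished (a barrier)."""
    serial_time = sum(task_times)    # one core does every piece in turn
    parallel_time = max(task_times)  # everyone waits for the slowest piece
    return serial_time / parallel_time


# Hypothetical workloads: 16 "tailors", times in arbitrary units.
balanced = [1.0] * 16            # every piece takes the same time
unbalanced = [1.0] * 15 + [2.0]  # one piece takes twice as long

print(parallel_speedup(balanced))    # 16.0
print(parallel_speedup(unbalanced))  # 8.5
```

Even in this idealized model, a single piece of work that takes twice as long as the rest cuts the speedup of 16 cores from 16x to 8.5x. The communication overheads described above only make things worse.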
Even Max starts to suffer from this. The M1 Pro and M2 Pro Metal scores are about 2x those of the base M1 and M2, but the Max score is only about 1.65x the Pro score. Then the Ultra score is about 1.45x the Max score.
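Multiplying those step-by-step ratios together shows how the shortfall accumulates. The sketch below uses only the approximate ratios quoted above. The "ideal" column is an assumption: it supposes the GPU core count doubles at each tier (base to Pro to Max to Ultra, in the fully enabled configurations) and that perfect scaling would double the score each time. The function name `cumulative_scaling` is just for illustration.

```python
def cumulative_scaling(step_ratios):
    """Turn tier-to-tier ratios into cumulative ratios versus the base chip."""
    total = 1.0
    cumulative = []
    for ratio in step_ratios:
        total *= ratio
        cumulative.append(total)
    return cumulative


tiers = ["Pro", "Max", "Ultra"]
observed_steps = [2.0, 1.65, 1.45]  # approximate Metal score ratios from the text
ideal = [2.0, 4.0, 8.0]             # ASSUMPTION: cores double per tier, perfect scaling

for tier, got, want in zip(tiers, cumulative_scaling(observed_steps), ideal):
    print(f"{tier}: {got:.2f}x the base chip, {got / want:.1%} of ideal")
```

On these rough figures the Max lands at about 3.3x the base chip rather than 4x, and the Ultra at about 4.8x rather than 8x, so the Ultra is getting only around 60% of what perfect scaling would give.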
Apple is well aware that scaling on Ultra (and even Max) was sub-optimal, and I think they shipped Ultra essentially as an experiment (even more so than the rest of the M1 line) to see what the most serious pain points were. If you look at the patent record, they have already patented both a new cache protocol and a new Network on Chip that are designed for scalability across multiple chiplets, and are informed by what they learned from Ultra.
Of course, who knows when patents will turn into products, but I suspect that we will see rather better scaling at the high end (both Max and Ultra) in the next generation.