Basically Dan Matt is a silly man.
- The CPU is probably running at 2.6GHz (based on the SGEMM number). This is about a 10% boost.
- The IPC improvements this time round are indeed about 15%.
- So total is very much like the A7 to A8 transition. And that was hardly the end of Apple's CPU improvements...
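As a rough sanity check on that total, here is a minimal sketch that assumes the frequency and IPC gains simply multiply. That is my simplification: it ignores memory-bound code, and the function name is just illustrative.

```python
def combined_speedup(freq_gain: float, ipc_gain: float) -> float:
    """Overall single-thread gain as a fraction, assuming the two gains multiply."""
    return (1.0 + freq_gain) * (1.0 + ipc_gain) - 1.0


# Figures from the bullets above: ~10% frequency boost, ~15% IPC improvement.
total = combined_speedup(0.10, 0.15)
print(f"combined gain: {total:.1%}")
```

Under that assumption the two gains compound to a bit more than a quarter.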
I think the main thing to be said about TSMC's 10nm (and probably 7nm) is that Apple and TSMC are barreling into them so fast that they don't have time to optimize for them.
Is this a catastrophe? Not really. Even though power and frequency are not being optimized, density has been dramatically improved (~2x). The extra transistors in turn are being used to give us that 15% IPC boost, higher throughput in the companion cores, Apple's (50% faster) GPU, and the NPU.
What are those companion cores for? Not, IMHO, for boasting to Android owners. Nor for some vague "OSs run lots of threads" BS, the sort of thing spouted by idiots who appear to be unaware of the concept of multitasking. The best hypothesis I have so far is that the companion cores (together with the Apple GPU) can be used as a ray tracing engine. We might see an API for this next year, but we probably won't see a UI based on this for another three years or so, since it takes time to get the hardware to the majority of the population...
The most interesting thing about this CPU is the total reworking of the uncore. The memory controller is SPECTACULAR, superior to anything Intel ships. The separate L2s suggest a future of multiple large cores (which primarily makes sense in the context of desktop [4 to 6 cores?] and server; even for iPad, 3 cores + companion is good enough for the near future). Basically to me this looks like the start of Apple's move from "best mobile cores in the world" to "best desktop cores in the world" very soon (and shortly after that, "best server cores in the world", though that step will require new work that Apple hasn't done yet, like a spectacular coherent fabric).
For more details you can read my analysis here et seq: (why I believe 2.6GHz, why I believe the companion cores are each worth about a quarter of a full core, and what contributors to IPC appear to have been improved)
https://www.realworldtech.com/forum/?threadid=171572&curpostid=171572
BTW it's worth noting that on every sub-benchmark now except SGEMM and SFFT, Apple has higher (sometimes substantially higher) IPC than Intel. Both exceptions are probably due to Intel having 2xAVX256 whereas Apple has 3xNEON128. (Apple could fix this by adding a 4th NEON unit, like they added a 3rd to the A9, but more likely
- that type of code is best run on the GPU, so they don't really care
- at some point they may implement SVE)
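To make the 2xAVX256 vs 3xNEON128 comparison concrete, here is a back-of-the-envelope sketch. It assumes each vector unit can issue one full-width operation per cycle, and it ignores FMA details and load/store bandwidth. Those assumptions and the helper name are mine, not measured figures.

```python
def simd_bits_per_cycle(units: int, width_bits: int) -> int:
    """Peak vector bits processed per cycle, assuming one full-width op per unit."""
    return units * width_bits


intel = simd_bits_per_cycle(2, 256)         # 2 x AVX256
apple = simd_bits_per_cycle(3, 128)         # 3 x NEON128
apple_4_units = simd_bits_per_cycle(4, 128)  # hypothetical 4th NEON unit

print(f"Intel: {intel} bits/cycle, Apple: {apple} bits/cycle")
print(f"Apple / Intel peak ratio: {apple / intel:.2f}")
print(f"With a 4th NEON unit: {apple_4_units} bits/cycle")
```

On this crude measure Apple has three quarters of Intel's peak vector width per cycle, and a 4th NEON unit would bring the two level.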