A8x CPU performance is almost double that of the A8
Edit: sorry, I meant to say that in the past iPad chips maintained largely the same CPU performance as their iPhone counterparts.
Not even close.
The A7 to A8 CPU (single threaded) performance jump is about 25% split about equally three ways between
- Compiler improvements (so not REALLY a CPU improvement, because the A7 gets the same 8% speed boost when things are recompiled)
- 100MHz speed increase
- very slightly improved/tweaked design. (The most obvious parts of this are
= a second integer multiplication unit
= FPU add/multiply latency reduced by one cycle
= substantially lower L3 cache latency).
A8X runs 100MHz faster than the A8, so it gets an additional 8% or so from that. An honest accounting (ignoring compiler speedups) would be that single-threaded A8X is about 25% faster than A7: roughly 8% from each of the two 100MHz bumps plus 8% from the design tweaks.
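To make that accounting concrete, here is a quick sketch of the arithmetic. It assumes each contribution is a flat 8% and that the factors multiply (my simplification, the real interaction is messier); the constant and function names are made up for illustration.

```python
# Back-of-the-envelope single-threaded speedup accounting.
# Assumption: every contribution is a flat ~8% and the factors compound.
COMPILER = 1.08    # compiler improvements (the A7 gets these too when recompiled)
CLOCK_A8 = 1.08    # 100MHz bump, A7 -> A8
DESIGN_A8 = 1.08   # tweaked core design
CLOCK_A8X = 1.08   # additional 100MHz bump, A8 -> A8X


def combined(*factors):
    """Multiply individual speedup factors into one overall factor."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result


# What a benchmark shows for A8 vs A7 (compiler gains included).
a8_vs_a7_reported = combined(COMPILER, CLOCK_A8, DESIGN_A8)

# Hardware-only comparison of A8X vs A7 (compiler gains excluded).
a8x_vs_a7_hardware = combined(CLOCK_A8, DESIGN_A8, CLOCK_A8X)

print(f"A8 vs A7, as benchmarked: {a8_vs_a7_reported:.2f}x")
print(f"A8X vs A7, hardware only: {a8x_vs_a7_hardware:.2f}x")
```

Both come out at about 1.26x, which is where the "about 25%" figure comes from and nowhere near "almost double".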
Of course A8X has that third core, which is very nice for maintaining a responsive UI in some especially demanding situations where many things are happening at once, but let's not go the Android route of pretending that more cores make everything faster. Today that's just not the case for most apps in most situations. Every year an additional core becomes a little more useful --- the OS does a little more offloading from the main thread, apps work a little harder to parallelize themselves --- but we're by no means at the point where most software that needs it is making aggressive use of parallelism.
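One rough way to put numbers on the core-count point is Amdahl's law, which I'm bringing in here as an illustration. The parallel fractions below are invented for the example, not measurements of any real app, and the function name is my own.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup when only `parallel_fraction` of the work scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workloads: mostly serial, half parallel, mostly parallel.
for fraction in (0.1, 0.5, 0.9):
    two_cores = amdahl_speedup(fraction, 2)
    three_cores = amdahl_speedup(fraction, 3)
    gain = three_cores / two_cores - 1.0
    print(
        f"parallel={fraction:.0%}: 2 cores {two_cores:.2f}x, "
        f"3 cores {three_cores:.2f}x, third core adds {gain:.1%}"
    )
```

Under these assumptions the third core only pays off noticeably once a large share of the work is parallel, which is exactly what most apps don't manage today.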
It's very tempting to enthuse about how super fast the A9 will be. I'm not sure that's the right mindset. Apple is not losing any sales from people saying "I really want an iPhone/iPad but they're just too darn slow", and that's not likely to change next year. The first crop of A57 devices will probably lag an A9 that's just a process-tweaked A8. nVidia's Denver may be a worthy competitor; but only a competitor --- better along some dimensions, worse along others.
What we MAY see in the A9 is the ARMv8.1 instruction set modifications, in particular the RMW (read-modify-write) atomics, which should allow for faster reference count updates of objects shared between threads. (It's not clear whether it was Apple or one of the ARM server vendors who wanted these ISA changes, but if Apple implements them, it should help them to be somewhat more aggressive in multithreading Cocoa.)
What we MAY see in the A9 is the mythical Apple-designed (perhaps based on an existing Imagination design) GPU. Alternatively Apple may feel that the current deal they have with Imagination (which appears to allow for taking existing Imagination designs and scaling them further than what Imagination sells to anyone else) is good enough.
What we MAY see in the A9 is the long-awaited HSA architecture, which has CPU and GPU sharing a coherent address space, along with the OS being allowed much more control over how individual GPU cores are scheduled.
All three of these are more subtle than just "it's 30% faster", but they are the natural next steps for Apple, if not now then over the next three years or so, and they will all contribute to Apple being once again a year or more ahead of the competition. As far as the user goes, they will all contribute to "snappiness" rather than higher benchmark numbers, which will make a few fanboys sad, but should make the customer base as a whole happier.
Which means I expect the FinFETs and smaller geometry of the new process to be used much like the A7 to A8 transition --- some more transistors to support the functionality I've described, a minor speed boost from tweaks and maybe an extra 100MHz, and another halving or so of energy used. I expect Apple to stick with the basics of the existing (quite satisfactory) design while they add the 8.1 instructions and HSA, and only work on a more radical design (at which point they drop 32-bit instruction support and go for more aggressive performance) once 8.1 and HSA are implemented and fully understood.
We'll see how right I am in a few months...