Good post.

I will be stunned if they can get a 50% general IPC improvement and still hit the battery life they claim. I'm expecting the 2x figure to come from highly specific scenarios where new instructions greatly accelerate certain operations. For the A6, they provided a variety of speed-up scenarios. This time, they just gave us the generic 2x.

It was such an odd thing for them to quote the register increase though. That's a lot more technical than they ever get.
 
Indeed, it's hard to imagine such an IPC boost in a mobile power envelope. But from an architectural point of view it's certainly not unattainable; with Cortex-A57 approaching 5 DMIPS/MHz, it's all within range. Could they do something radical with power management to make that possible on mobile? Like on-demand clock/power gating of some core resources, directed from software or the compiler?

It was oddly detailed for a keynote, but not really revealing at the same time. By the time they spell out 64-bit, you know there are going to be twice the registers. Perhaps they were trying to impress people by bringing up technical terms, to prove their expertise or make themselves look unique? Or they just needed more figures they could claim to have doubled :)

You're right that simply stating 2x performance, unlike the typical scenario breakdown, sounds a little fishy. It could be just a particular synthetic benchmark - perhaps the FPU performance that improves in ARMv8? Time will tell, but it's looking like at least a 50% speed bump in the worst case and 100% in the best case.
 
The extra registers might net you an extra 5 percent on many tasks. Maybe 15 percent on a few tasks. But they wouldn't get you anywhere close to twice the performance.
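The 5 percent figure above can be sanity-checked with an Amdahl-style sketch. The fractions here are hypothetical, purely to illustrate why register doubling alone can't get anywhere near 2x:

```python
# Back-of-the-envelope estimate (illustrative numbers, not measurements):
# if register spills/fills cost some fraction of total cycles, and doubling
# the architectural registers eliminates part of that, the overall
# Amdahl-style speedup stays modest.
def spill_speedup(spill_fraction, fraction_eliminated):
    """Overall speedup when part of the spill overhead disappears."""
    return 1.0 / (1.0 - spill_fraction * fraction_eliminated)

# e.g. 10% of cycles spent on spill traffic, half of it removed -> ~1.053x,
# i.e. roughly the +5% mentioned above
print(spill_speedup(0.10, 0.5))
```

Even an implausibly generous 30% spill overhead fully eliminated only yields about 1.43x, so the 2x claim has to come from somewhere else.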

Then why are developers who've been able to use an actual iPhone 5s claiming that real-world loading times of their games improved 5x?
 
Indeed, it's hard to imagine such an IPC boost in a mobile power envelope. But from an architectural point of view it's certainly not unattainable; with Cortex-A57 approaching 5 DMIPS/MHz, it's all within range. Could they do something radical with power management to make that possible on mobile? Like on-demand clock/power gating of some core resources, directed from software or the compiler?

I've seen it speculated that they don't do DVFS to the level that Qualcomm does for Krait. If that is the case, surely they do that by now with the A7.

I don't know that I've heard of power gating specific execution resources, though. I'm sure there's a point where granularity with regard to power gating starts adding too much complexity to the layout and it's no longer worth the trouble.

It was oddly detailed for a keynote, but not really revealing at the same time. By the time they spell out 64-bit, you know there are going to be twice the registers. Perhaps they were trying to impress people by bringing up technical terms, to prove their expertise or make themselves look unique? Or they just needed more figures they could claim to have doubled :)

I'm guessing it was more directed at the people who would doubt them or accuse their implementation of being rushed or half-assed. By quoting the register count, they prove it's a true 64-bit move. By quoting the transistor count and die size, they show that they've made a HUGE step in complexity over what was already an impressive design with Swift. It also shows off that they've achieved a very impressive transistor density - something normally seen only in GPUs or similar circuits.
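The density claim is easy to check from the quoted figures. Treat both numbers as approximations ("over 1 billion transistors" and a roughly 102 mm² die):

```python
# Rough density check using the publicly quoted figures (approximate):
transistors = 1.0e9      # "over 1 billion transistors"
die_area_mm2 = 102.0     # quoted die size, ~102 mm^2

density_millions_per_mm2 = transistors / die_area_mm2 / 1e6
print(f"~{density_millions_per_mm2:.1f}M transistors per mm^2")
```

That works out to roughly 9.8 million transistors per mm², which is indeed the sort of density you'd expect from GPU-heavy rather than CPU-heavy silicon.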

You're right that simply stating 2x performance, unlike the typical scenario breakdown, sounds a little fishy. It could be just a particular synthetic benchmark - perhaps the FPU performance that improves in ARMv8? Time will tell, but it's looking like at least a 50% speed bump in the worst case and 100% in the best case.

Yes, my guess would be some vector operations that got a huge boost. I originally guessed quad core because I gave them the benefit of the doubt on an across-the-board IPC improvement that gets them to 2x. With Ben Bajarin claiming to have a source that tells him it's dual core, they've either really cranked up the clock rate to un-Apple-like frequencies or made the execution and issue paths extremely wide for these cores. The real explanation is that it's probably a combination of all of these. Wider paths, faster clock, limited scenarios for 2x (enhanced with ISA extensions), and maybe some new compiler optimizations for good measure.
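The "combination of all of these" argument can be sketched arithmetically. If you simplify overall speedup as IPC ratio times clock ratio (ignoring memory bottlenecks; all ratios here are hypothetical), you can see how much clock each IPC assumption would demand:

```python
# Toy decomposition: overall speedup ~= IPC ratio x clock ratio.
# Ignores memory effects; the IPC ratios below are guesses, not data.
def clock_needed(target_speedup, ipc_ratio):
    """Clock ratio required to reach the target given an IPC gain."""
    return target_speedup / ipc_ratio

for ipc_ratio in (1.2, 1.3, 1.5):
    print(f"IPC x{ipc_ratio:.1f} would need clock x{clock_needed(2.0, ipc_ratio):.2f}")
```

A 30% IPC gain would still demand a ~1.54x clock bump to reach 2x everywhere, which is why limited scenarios and ISA extensions almost certainly carry part of the load.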

Then why are developers who've been able to use an actual iPhone 5s claiming that real-world loading times of their games improved 5x?

Probably a lot of things going on at once. With their purchase of Anobit, they gained NAND controller expertise. Speeding up their NAND read rates is the easiest way to drop load times. Widening/speeding up the DRAM interface is another way (and memory hierarchy improvements in general). If we're talking about loading environments, some back-end improvements like compression methods could come into play.
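If load time really is dominated by NAND reads, a big storage speedup alone can approach the 5x developers mention. A quick Amdahl's-law sketch, with made-up fractions:

```python
# Amdahl's-law sketch (the fractions below are hypothetical):
# if a fraction f of load time is NAND I/O and that part gets k times
# faster, the overall load-time speedup is 1 / ((1 - f) + f / k).
def load_speedup(io_fraction, io_speedup):
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)

# 90% of load time spent in I/O, storage path 10x faster -> ~5.3x overall
print(load_speedup(0.9, 10.0))
```

So a 5x load-time claim doesn't require a 5x CPU at all; it mostly requires loads to be I/O-bound, which game asset loading typically is.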

At some point in the next few years they'll surprise us with 3D-ICs that give them very wide memory interfaces and really send DRAM bandwidths through the roof.
 
It's a bit out of date now. The post title no longer matches the CPU section conclusions, which still talk about quad core. We now have an industry analyst citing a source as the reason for it being dual core. That changes the game because it forces us into much more complex cores and/or higher clock speeds.

Since you have made an account on The Verge, I will delete the original post. You can repost it and have full control of your thread.

Let me know.
 
Then why are developers who've been able to use an actual iPhone 5s claiming that real-world loading times of their games improved 5x?

There's no way the ARMv8 architecture by itself is responsible for the doubling of performance on the A7.
 
Since you have made an account on The Verge, I will delete the original post. You can repost it and have full control of your thread.

Let me know.

I already had the account. I just wanted to post to clarify. I don't want to spend the time to maintain multiple places of discussion, but I don't mind my content there.
 
Then why are developers who've been able to use an actual iPhone 5s claiming that real-world loading times of their games improved 5x?
Developers as in just Chair during keynote?

Apple worked very closely on that presentation. They would have massaged it to use the keywords they wanted highlighted.

It could have faster flash memory, RAM, interfaces, a number of things that contribute.
 
There's no way the ARMv8 architecture by itself is responsible for the doubling of performance on the A7.

You're right. That's why Apple has put so much engineering work into their new chip. Apple isn't just buying and slapping together reference designs; they're using a modified design done in-house.

Great article on the A7 roadmap: http://appleinsider.com/articles/13...sung-may-have-lost-apples-a7-contract-to-tsmc

----------

Developers as in just Chair during keynote?

Apple worked very closely on that presentation. They would have massaged it to use the keywords they wanted highlighted.

It could have faster flash memory, RAM, interfaces, a number of things that contribute.

These were interviews with the developers, not the keynote.

http://www.polygon.com/2013/9/11/4720214/infinity-blade-3-developer-interview-iphone-5s
 
Interesting thread, but just wait until you know exactly what it will ship with before giving detailed info.
 
Whoa, did they really quote you? I read it, but not at a granular level.

Yes. They quoted me on transistor density speculation and GPU speculation. The first part is outdated based on my cursory research.
 
I'm not crazy about this analysis. I think it ignores the various ways in which the mobile world is different from the desktop world, and the ways in which Apple cares about real-world behavior rather than specs.

Here's an alternative analysis:
(a) the extra transistors are used in companion cores. Rather than designing a 64-bit core that can also handle the 32-bit ISA (with all the hassle and pain that implies), Apple just designs a clean 64-bit core. So how do they handle 32-bit? By gluing a Swift core onto the design, and using big.LITTLE technology to swap between the two. They can even tailor the Swift to be as low power as possible (rather than high-performance) as a way of both reducing power and compelling developers to move to 64-bit faster --- as long as you stick with 32-bit code only, you'll be stuck on the slow core.

This gives them a way to feel out the strengths and weaknesses of companion core technology (which is a technology that DOES make sense for saving power) while actually using it to solve a more immediate problem (32-bit backward compatibility). They don't boast about it (either as "quad-core CPU" or as "OMG lower power") because it's not optimized for either task, so why make a big deal --- make the big deal in two years or so when they ditch 32-bit bwd compatibility and use a custom companion core that is very much optimized for saving power.

(b) For MOST purposes, the performance that matters in a phone is bursts of peak performance, not marathons of ongoing performance. Which means that the ideal phone core is designed like, say, Haswell, able to turbo up to higher speeds for short amounts of time. This particular design dimension has not been much exploited yet by ARM, but it makes perfect sense. It is possible that Apple is already exploiting it to some extent with the A7. In other words, it gets its 2x by, on occasion, yes being able to turbo up to 2.1GHz, just for a second or less to parse that web page or decode that PDF.

This transition will doubtless result in a massive amount of whining and claims of "cheating" --- we saw the same idiocy and complaints when Intel first got serious with their turbo support. Yes, it means that peak numbers are not representative if you plan to run a game or some other long-running app. C'est la vie. It's a change that improves things substantially for MOST users, and it's a substantial part of the future, so get used to it.
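The burst-versus-marathon idea in (b) can be sketched with a toy model. All numbers here are hypothetical (the function and its parameters are made up for illustration): the core may exceed its sustained frequency only while an accumulated "thermal budget" lasts, then must fall back, so short tasks see the boost clock and long tasks see something closer to the base clock.

```python
# Toy burst-clock model (all numbers hypothetical): a Haswell-style
# turbo idea in miniature. Work done above the base clock is allowed
# only until a fixed time budget is spent.
def run(workload_ms, base_mhz=1300, boost_mhz=2100, budget_ms=1000):
    """Return kilocycles completed, boosting until the budget runs out."""
    boosted = min(workload_ms, budget_ms)          # time allowed at boost
    sustained = workload_ms - boosted              # remainder at base clock
    return boosted * boost_mhz + sustained * base_mhz

short = run(500)     # a sub-second burst runs entirely at the boost clock
long = run(5000)     # a long task mostly runs at the sustained clock
print(short / 500, long / 5000)   # effective MHz: 2100 vs 1460
```

That asymmetry is exactly why peak benchmark numbers would flatter short, bursty workloads (web pages, PDFs) more than sustained ones (games).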
 
Splitting cores into 32-bit and 64-bit would be a waste of resources; ARMv8 is fully backwards compatible with the older 32-bit ISA.

http://www.arm.com/products/processors/armv8-architecture.php

frequency and voltage scaling is a common practice

The reason for companion cores is that they are manufactured on a low-power, low-performance process; that's why they can't be integrated onto the main SoC.

The reason Motorola has a few such companion cores is that they are actually just off-the-shelf microcontrollers and DSPs from Texas Instruments, nothing custom-designed by Motorola, whereas the M7 will likely be an Apple-designed chip, probably using something like a Cortex-A5 or even just a Cortex-M4 MCU/DSP.
 
(b) For MOST purposes, the performance that matters in a phone is bursts of peak performance, not marathons of ongoing performance. Which means that the ideal phone core is designed like, say, Haswell, able to turbo up to higher speeds for short amounts of time. This particular design dimension has not been much exploited yet by ARM, but it makes perfect sense. It is possible that Apple is already exploiting it to some extent with the A7. In other words, it gets its 2x by, on occasion, yes being able to turbo up to 2.1GHz, just for a second or less to parse that web page or decode that PDF.
ARM already has frequency scaling. So does the A6.
 
Great points made by others. I'd also add that big.LITTLE configurations require all cores to support the same ISA. That wouldn't work here, so it would have to be a completely custom solution. It would likely have an OS overhead too.
 
Great points made by others. I'd also add that big.LITTLE configurations require all cores to support the same ISA. That wouldn't work here, so it would have to be a completely custom solution. It would likely have an OS overhead too.

Which means it's an even better solution for Apple, no? That much harder for everyone else to copy, but a perfect match if you own the whole stack, from HW design to the developer tools to the OS. It solves APPLE's problem very well (get to 64-bit, maintain 32-bit compatibility for a while, but don't waste too many engineering resources looking backwards), without helping anyone else.

As for DVS, of course ARM has had DVS for a while. But they have not (as far as I know) pushed it the way Intel has. They have used it to scale frequency down from a "preferred" operating point, but I'm unaware of ARM CPUs that use DVS to push frequency ABOVE the "preferred" operating point for brief periods, either because running too long at that frequency overloads thermals, or because it would harm the battery with too extreme a current drain.
 
Which means it's an even better solution for Apple, no? That much harder for everyone else to copy, but a perfect match if you own the whole stack, from HW design to the developer tools to the OS. It solves APPLE's problem very well (get to 64-bit, maintain 32-bit compatibility for a while, but don't waste too many engineering resources looking backwards), without helping anyone else.

As for DVS, of course ARM has had DVS for a while. But they have not (as far as I know) pushed it the way Intel has. They have used it to scale frequency down from a "preferred" operating point, but I'm unaware of ARM CPUs that use DVS to push frequency ABOVE the "preferred" operating point for brief periods, either because running too long at that frequency overloads thermals, or because it would harm the battery with too extreme a current drain.

I think you're missing others' points. 32-bit backwards compatibility is built into ARMv8. They would have to purposely strip it out, with not much complexity saved most likely, only to add a whole other core to replicate its functionality. That doesn't make sense. Apple has offered band-aids to non-updated apps before (apps built for the iPhone 4S and earlier were centered on the iPhone 5's taller screen). They're not just going to drop the floor out.

And adding a second 32 bit compatible core for backwards compatibility isn't a competitive advantage. It's a band-aid.
 
I think you're missing others' points. 32-bit backwards compatibility is built into ARMv8. They would have to purposely strip it out, with not much complexity saved most likely, only to add a whole other core to replicate its functionality. That doesn't make sense. Apple has offered band-aids to non-updated apps before (apps built for the iPhone 4S and earlier were centered on the iPhone 5's taller screen). They're not just going to drop the floor out.

And adding a second 32 bit compatible core for backwards compatibility isn't a competitive advantage. It's a band-aid.

"32 bit backwards compatibility is built into ARMv8"
This is a legal fiction. It is not something inherent in the design of the ARMv8 ISA which, in fact, almost goes out of its way to be very different from the ARMv7 ISA.
There are other legal fictions in ARM world. Do we believe that Swift bothers to implement Jazelle, or PAE, or the hypervisor stuff?
Apple is not like any other ARM manufacturer. They don't have to adhere to the standard "ARM contract" between manufacturer and developer, they can define their own contract. And if that contract says "After July 2014 any app that does not include a 64-bit binary will be removed from the app store", so be it; just like if the contract says "we've now added these new 256-bit SIMD instructions", again so be it.

Everyone analyzing this is looking at it from some sort of weird 1995 perspective, where the resource in short supply is transistors, and the optimal design shares as much as possible between the 64-bit processing and the 32-bit processing. But it is NOT 1995 --- we have a billion transistors to play with. The resource in short supply is ENGINEERING TIME. It is far easier to design a pure 64-bit core (with some, possibly slow, transition logic to transfer control to a 32-bit companion core) than it is to design a 64-bit core that can ALSO execute the 32-bit ISA.

We're not talking here about creating the sort of weirdness that x86 seems to thrive upon: "what if I want to jump from 64-bit code to a plugin that is running THUMB16 code?, what if I want to share pointers between 64-bit and 32-bit code? etc". We're talking iOS --- an extremely controlled execution environment. The only transitions that are necessary are 64-bit user -> 64-bit OS->32-bit user and back again, and these are not going to happen frequently --- worst case really at around tick frequency.
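The "transitions are rare" claim is worth quantifying. With hypothetical numbers (both the tick rate and the per-transition cost below are assumptions, not measurements), even a slow cross-core hand-off stays in the noise:

```python
# Rough overhead estimate for 64<->32-bit mode hand-offs.
# Both figures below are hypothetical, chosen only to show the scale.
transitions_per_sec = 100        # "around tick frequency"
cost_us_per_transition = 10.0    # assumed cost of a slow cross-core hand-off

overhead = transitions_per_sec * cost_us_per_transition / 1e6
print(f"{overhead * 100:.2f}% of CPU time")   # 0.10%
```

Even a 10x worse per-transition cost would only burn about 1% of CPU time, which is why slow transition logic would be an acceptable trade in such a controlled environment.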
 