"32 bit backwards compatibility is built into ARMv8"
This is a legal fiction. It is not something inherent in the design of the ARMv8 ISA which, in fact, almost goes out of its way to be very different from the ARMv7 ISA.
There are other legal fictions in the ARM world. Do we believe that Swift bothers to implement Jazelle, or PAE, or the hypervisor stuff?
Apple is not like any other ARM manufacturer. They don't have to adhere to the standard "ARM contract" between manufacturer and developer; they can define their own contract. And if that contract says "After July 2014 any app that does not include a 64-bit binary will be removed from the app store", so be it; just like if the contract says "we've now added these new 256-bit SIMD instructions", again so be it.
Everyone analyzing this is looking at it from some sort of weird 1995 perspective, where the resource in short supply is transistors, and the optimal design shares as much as possible between the 64-bit processing and the 32-bit processing. But it is NOT 1995 --- we have a billion transistors to play with. The resource in short supply is ENGINEERING TIME. It is far easier to design a pure 64-bit core (with some, possibly slow, transition logic to transfer control to a 32-bit companion core) than it is to design a 64-bit core that can ALSO execute the 32-bit ISA.
We're not talking here about creating the sort of weirdness that x86 seems to thrive upon: "what if I want to jump from 64-bit code to a plugin that is running THUMB16 code? what if I want to share pointers between 64-bit and 32-bit code?" etc. We're talking iOS --- an extremely controlled execution environment. The only transitions that are necessary are 64-bit user -> 64-bit OS -> 32-bit user and back again, and these are not going to happen frequently --- worst case really at around tick frequency.
You can submit 64-bit apps for iOS 7 today that take advantage of the power of iPhone 5s. Xcode can build your app with both 32-bit and 64-bit binaries included so it works across all devices running iOS 7. If you wish to continue to support iOS 6 then you will need to build for 32-bit only. Next month we will be making changes that will allow you to create a single app binary that supports 32-bit on iOS 6, as well as 32-bit and 64-bit on iOS 7.
The Ivy Bridge HE-4 is made on the Intel (INTC) 22nm Trigate process and has 1.4 billion transistors with a chip size of 160 sq. mm, so a billion transistor logic chip on 22nm should be around 114 sq. mm. "Hand packing" the A7 design might give another 10% size improvement, which would make the A7 chip almost exactly 102 sq. mm. on the Intel 22 nm process.
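The arithmetic in that quote is easy to re-run. Here is a rough sketch in Python that uses only the numbers quoted above; the function name and the assumption that the A7 would match Ivy Bridge's average transistor density are mine, purely for illustration.

```python
# Back-of-the-envelope die-area scaling, using only the figures quoted above.
REF_TRANSISTORS = 1.4e9   # Ivy Bridge HE-4 transistor count (from the quote)
REF_AREA_MM2 = 160.0      # Ivy Bridge HE-4 die area in mm^2 (from the quote)


def scaled_area_mm2(transistors, packing_gain=0.0):
    """Estimate die area, ASSUMING the same average density as the reference chip.

    packing_gain is the fraction of area saved by denser layout (0.10 means 10%).
    """
    density = REF_TRANSISTORS / REF_AREA_MM2        # transistors per mm^2
    return transistors / density * (1.0 - packing_gain)


baseline = scaled_area_mm2(1.0e9)                        # about 114.3 mm^2
hand_packed = scaled_area_mm2(1.0e9, packing_gain=0.10)  # about 102.9 mm^2
print(f"{baseline:.1f} mm^2, {hand_packed:.1f} mm^2")
```

So the quoted 114 and "almost exactly 102" figures hold up as arithmetic. Whether the equal-density assumption holds is a separate question, since cache-heavy and logic-heavy chips pack very differently.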
Curious to see what you think of this analysis: http://seekingalpha.com/article/1694892-intel-and-apple-an-interesting-tidbit
Author makes a good case that the A7 is made by Intel, on a 22nm process.
Anandtech guesses Samsung 28nm HK+MG process, fairly certain of 1.3 GHz, sure of two cores.
anyone know what "HK+MG" is?
Time to update prediction from this:
A7
- Manufacturer - Samsung on HKMG 28nm process
- Die Size - 90-120 mm2
- Designer - Apple
- CPU Type - 1.3-1.6GHz Dual Core "Second Generation Swift Core"
- Instruction Set - ARMv7s
- Chip Designator - S5L8960X
- L1 Cache - 32/32KB
- L2 Cache - 1MB
- RAM - 1GB LPDDR3 @ 1333 MHz (64 bit interface, PoP)
- Max Theoretical Memory Bandwidth - 10.6 GB/s
- GPU Type - "Quad Cluster" PowerVR 6430 @ 400 MHz
- GPU Performance - 102.4 GFlops, 233 MTriangles/s
Bold is confirmed by Apple. Italics is changed from the above prediction.
A7
- Manufacturer - TSMC on HKMG HPM 28nm process
- Die Size - 102 mm2
- Transistors - approximately 1 billion
- Designer - Apple
- CPU Type - 1.5-1.8GHz Dual Core "Second Generation Swift 64-bit Core"
- Instruction Set - ARMv8 with custom extensions
- Chip Designator - S5L8960X
- L1 Cache - 48/32KB
- L2 Cache - 2MB
- RAM - 1GB LPDDR3 @ 1333 MHz (64 bit interface, PoP)
- Max Theoretical Memory Bandwidth - 10.6 GB/s
- GPU Type - "Quad Cluster" PowerVR 6430 @ 270 MHz
- GPU Performance - 69.2 GFlops, 157 MTriangles/s
See my explanation for these prediction changes over here: https://forums.macrumors.com/threads/1634100/
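For anyone wanting to check the derived numbers in those lists, here is a quick sketch. It assumes the "1333 MHz" figure is the effective transfer rate (1333 MT/s) and that GPU throughput scales linearly with clock; both are my assumptions, and the function names are made up for this example.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, bus_width_bits):
    """Peak theoretical bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9


def scale_with_clock(value_at_ref, ref_mhz, new_mhz):
    """Rescale a throughput figure, ASSUMING it is proportional to clock speed."""
    return value_at_ref * new_mhz / ref_mhz


bandwidth = peak_bandwidth_gb_s(1333, 64)       # about 10.66 GB/s
gflops_270 = scale_with_clock(102.4, 400, 270)  # about 69.1 GFlops
mtris_270 = scale_with_clock(233, 400, 270)     # about 157.3 MTriangles/s
print(f"{bandwidth:.2f} GB/s, {gflops_270:.1f} GFlops, {mtris_270:.1f} MTriangles/s")
```

The bandwidth matches the listed 10.6 GB/s, and the rescaled 270 MHz GPU figures land close to the ones in the revised list.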
edit: the first GPU benchmark from the 5S has been uploaded, supporting the 2x claim.
Brian Klug noticed someone uploaded a graphics bench from the 5S.
https://twitter.com/nerdtalker/status/377847764300099586
http://gfxbench.com/device.jsp?benchmark=gfx27&D=Apple+iPhone+5S&testgroup=overall
GFXBench 2.5 Egypt HD C24Z16 - Offscreen (1080p) : 56 FPS
GFXBench 2.5 Egypt HD C24Z16 - Onscreen : 53 FPS
Compared to iPhone 5:
GFXBench 2.5 Egypt HD C24Z16 - Offscreen (1080p) : 29.8 FPS
GFXBench 2.5 Egypt HD C24Z16 - Onscreen : 41.1 FPS
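To put a number on "supporting the 2x claim", here are the ratios worked out from the four results above. This is only a sketch; the dictionary layout is mine, the FPS values are the ones just listed.

```python
# FPS figures copied from the GFXBench 2.5 Egypt HD results listed above.
iphone_5s = {"offscreen": 56.0, "onscreen": 53.0}
iphone_5 = {"offscreen": 29.8, "onscreen": 41.1}

# Speedup of the 5S over the 5 for each test.
speedup = {test: iphone_5s[test] / iphone_5[test] for test in iphone_5s}

for test, ratio in speedup.items():
    print(f"{test}: {ratio:.2f}x")
```

That comes out to roughly 1.88x offscreen and 1.29x onscreen. The offscreen number is probably the fairer comparison, since onscreen runs are presumably capped by the display refresh rate.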
At its launch event Apple claimed the A7 offered desktop class CPU performance. If it really is performance competitive with Bay Trail, I think that statement is a fair one to make. We're not talking about Haswell or even Ivy Bridge levels of desktop performance, but rather something close to mobile Core 2 Duo class.
We are talking about a phone with the same sort of CPU power as a laptop from only a few years ago. That is truly amazing.
I thought we were at the point where specs don't really matter; it just needs to work!
Looks like I never posted my itemized prediction from the other thread in this thread. I'll post it below, quoted. It includes my pre-September 10 event prediction and my revised prediction after the event. After the quoted post, I'll update with the latest known from the AnandTech review and explain why I got some stuff wrong, and what's still left to be known.
Updated prediction based on AnandTech benches and review (bold now means confirmed by Apple or AnandTech):
- Manufacturer - Samsung on HKMG 28nm process
- Die Size - 102 mm2
- Transistors - approximately 1 billion
- Designer - Apple
- CPU Type - 1.3GHz Dual Core "Cyclone" 64-bit
- Instruction Set - ARMv8
- Chip Designator - S5L8960X
- L1 Cache - 64/64KB
- L2 Cache - 1MB
- RAM - 1GB LPDDR3 @ 1333 MHz (64 bit interface, PoP)
- Max Theoretical Memory Bandwidth - 10.6 GB/s
- GPU Type - "Quad Cluster" PowerVR 6430 @ 270 MHz
- GPU Performance - 69.2 GFlops, 157 MTriangles/s
The first thing I got wrong according to Anandtech is the foundry. Anand Shimpi believes it's still Samsung, whereas I predicted TSMC. I still think this one is up in the air given the mystery identifier on the leaked A7 PCB that doesn't match Samsung naming. We'll see.
The second thing I got wrong was the core speed and core count. My original prediction before the event was a dual core still based on the ARMv7-A ISA like Swift was. I predicted a modestly higher clock at about 1.5 GHz for moderately higher performance. After the event gave me the die size and number of transistors along with claimed performance improvements, I assumed that quad core was needed for their claimed transistor count and performance metric. Hitting 2x with only two cores would require a very high IPC bump, since there wasn't much room for a clock bump. I then revised back to dual core after analyst Ben Bajarin informed me that he had a source saying dual core for sure. For all I know, Anand was the source.
My updated cache prediction was based on what we know about the stock A57 core from ARM. It looks like Apple opted to go a little higher and do a full 64KB instruction and data cache. Anand notes that it's a little slower to access L1 cache now, but the hit rate (successful finds of desired data) has improved because of the larger cache size. L2 cache latency has greatly improved, as has latency to main memory. He also saw up to 2x memory bandwidth in some synthetic cases. This huge improvement to the memory front-end is not all that surprising because it's one of the main improvements Qualcomm implemented in between Krait revisions, which Swift was very similar to.
The design is also out-of-order as we've had since the Cortex A9 in the Apple A5. Branch prediction is up too. All these improvements help IPC.
He also notes that the RAM is indeed 1GB as predicted, and that it's also very likely LPDDR3 as predicted. Speed is unknown as of yet, but the silkscreen markings on the A7 package from the teardown will confirm that for us.
Getting to the benchmarks, it's pretty clear even 32 bit binaries benefit from the beefed up execution resources too. The A7 allows 32 and 64 bit binaries to run side-by-side transparently, which will make it easier on developers. Anand also notes that the increase in register count alone gave up to 10% performance boost on the x86 64 bit transition because of reduced data pressure.
Many of the benchmarks hit the claimed 2x performance metric set by Apple. It's clear that the improved memory subsystem, improved core execution units, increased register count and move to 64 bit binaries have indeed increased the IPC to a realm that hits near Apple's line in the sand. It's very impressive when you consider that it's still only dual core and it's still only 1.3 GHz.
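To make the IPC point concrete, here is a tiny sketch that treats per-core performance as simply IPC times clock. That model is a simplification, and the 1.3 GHz figure for the previous chip is my reading of "still only 1.3 GHz" above, so treat both as assumptions. The function name is made up for this example.

```python
def required_ipc_gain(target_speedup, old_clock_ghz, new_clock_ghz):
    """IPC multiplier needed to hit target_speedup, ASSUMING perf = IPC * clock."""
    return target_speedup * old_clock_ghz / new_clock_ghz


# Same clock as before (assumed 1.3 GHz): all of the 2x has to come from IPC.
same_clock = required_ipc_gain(2.0, old_clock_ghz=1.3, new_clock_ghz=1.3)

# My pre-event guess of about 1.5 GHz would have needed a smaller IPC gain.
guessed_clock = required_ipc_gain(2.0, old_clock_ghz=1.3, new_clock_ghz=1.5)

print(f"{same_clock:.2f}x IPC at 1.3 GHz, {guessed_clock:.2f}x IPC at 1.5 GHz")
```

Under that simple model, staying at 1.3 GHz means the whole 2x comes from per-clock improvements (including whatever the wider registers and 64-bit binaries contribute), versus about 1.73x if the clock had gone to 1.5 GHz.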
On clock speed, Anand notes that device manufacturers are clocking ARM cores higher than they intended, and they're paying for it in applied core voltages. They're compensating with much larger batteries, too.
It's interesting to note that Anand says no one in the mobile space has done variable clock frequency quite right yet, so it will be interesting to see where Apple goes in the future there. I'm also anxious to see if they've updated the power management IC (PMIC).
The GPU looks like a spot on hit. It was kind of easy to guess given it was likely to be a stock choice on Apple's part and there were realistically only 2 choices given reasonable clock speeds. I favored larger and slower given Apple's history and that turned out to be right.
For transistor density, Anand notes that there's clearly some design process overhaul leading to them packing transistors so tightly. The large G6430, which is inherently more dense because of its resource breakdown compared to 5XT GPUs, helps them achieve their density increase. The slightly larger caches help ever so slightly too.
For the 1 billion transistor claim, Brian Klug from AnandTech thinks they're counting the M7 too. I don't, because they announced this number before talking about the M7 at all. I think there's likely a significant amount of resources spent on the secure enclave for TouchID and also on a dedicated buffer/image signal processor to handle the advanced 5S camera functions.
All in all, A7 is a very impressive update and even iPhone 5 users will feel the speed increase in their everyday tasks.
Next we need the teardown to possibly confirm the manufacturer and to confirm the RAM type. After a Chipworks die shot, we'll know the foundry for sure and figure out how all of those transistors are being spent.