The History of Apple SoCs

To know where we're going, we need to know where we came from. Prior to the A4, Apple sourced Samsung SoCs for the iPhone, iPhone 3G and iPhone 3GS. Let's take a look at Apple's custom SoCs.

A4

A4:
Manufacturer - Samsung on 45nm process (as featured in iPhone 4)
Die Size - 53 mm²
Designer - Apple (Intrinsity, also featured in Samsung's 'Hummingbird' SoC)
CPU Type - 800MHz Single Core (as in iPhone 4)
Instruction Set - ARMv7
Chip Designator - S5L8930X
L1 Cache - 32/32KB
L2 Cache - 512KB
RAM - 512MB LPDDR @ 400 MHz (as in iPhone 4, 64 bit interface, PoP)
Max Theoretical Memory Bandwidth - 3.2 GB/s
GPU Type - Single Core PowerVR SGX 535 @ 200 MHz
GPU Performance - 1.6 GFlops, 14 MTriangles/s

A5

A5:
Manufacturer - Samsung on 45nm process (as featured in iPhone 4S)
Die Size - 122.2 mm²
Designer - Apple
CPU Type - 800MHz Dual Core
Instruction Set - ARMv7
Chip Designator - S5L8940X
L1 Cache - 32/32KB
L2 Cache - 1MB
RAM - 512MB LPDDR2 @ 800 MHz (64 bit interface, PoP)
Max Theoretical Memory Bandwidth - 6.4 GB/s
GPU Type - Dual Core PowerVR SGX 543 @ 200 MHz
GPU Performance - 14.4 GFlops, 70 MTriangles/s

A5X:
Manufacturer - Samsung on 45nm process (as featured in "The new iPad")
Die Size - 165 mm²
Designer - Apple
CPU Type - 1GHz Dual Core
Instruction Set - ARMv7
Chip Designator - S5L8945X
L1 Cache - 32/32KB
L2 Cache - 1MB
RAM - 1GB LPDDR2 @ 800 MHz (128 bit interface, off package)
Max Theoretical Memory Bandwidth - 12.8 GB/s
GPU Type - Quad Core PowerVR SGX 543 @ 250 MHz
GPU Performance - 36 GFlops, 175 MTriangles/s

A6

A6:
Manufacturer - Samsung on HKMG 32nm process
Die Size - 96.71 mm²
Designer - Apple
CPU Type - 1.3GHz Dual Core
Instruction Set - ARMv7s
Chip Designator - S5L8950X
L1 Cache - 32/32KB
L2 Cache - 1MB
RAM - 1GB LPDDR2 @ 1066 MHz (64 bit interface, PoP)
Max Theoretical Memory Bandwidth - 8.5 GB/s
GPU Type - Triple Core PowerVR SGX 543 @ 320 MHz
GPU Performance - 34.6 GFlops, 168 MTriangles/s

A6X:
Manufacturer - Samsung on HKMG 32nm process
Die Size - 123 mm²
Designer - Apple
CPU Type - 1.4GHz Dual Core
Instruction Set - ARMv7s
Chip Designator - S5L8955X
L1 Cache - 32/32KB
L2 Cache - 1MB
RAM - 1GB LPDDR2 @ 1066 MHz (128 bit interface, off package)
Max Theoretical Memory Bandwidth - 17 GB/s
GPU Type - Quad Core PowerVR SGX 554 @ 280 MHz
GPU Performance - 80.6 GFlops, 196 MTriangles/s

Apple A-series family tree:

Functional block size allocation on die:

A7 Prediction

A7:
Manufacturer - Samsung on HKMG 28nm process
Die Size - 90-120 mm²
Designer - Apple
CPU Type - 1.3-1.6GHz Dual Core "Second Generation Swift Core"
Instruction Set - ARMv7s
Chip Designator - S5L8960X
L1 Cache - 32/32KB
L2 Cache - 1MB
RAM - 1GB LPDDR3 @ 1333 MHz (64 bit interface, PoP)
Max Theoretical Memory Bandwidth - 10.6 GB/s
GPU Type - "Quad Cluster" PowerVR G6430 @ 400 MHz
GPU Performance - 102.4 GFlops, 233 MTriangles/s

A7X:
Manufacturer - Samsung on HKMG 28nm process
Die Size - 110-150 mm²
Designer - Apple
CPU Type - 1.4-1.7GHz Dual Core "Second Generation Swift Core"
Instruction Set - ARMv7s
Chip Designator - S5L8965X
L1 Cache - 32/32KB
L2 Cache - 1MB
RAM - 1GB LPDDR3 @ 1333 MHz (128 bit interface, off package)
Max Theoretical Memory Bandwidth - 21.3 GB/s
GPU Type - "Hex Cluster" PowerVR G6630 @ 400-600 MHz
GPU Performance - 153.6-230.4 GFlops, 350-525 MTriangles/s

A7 annotated (update):

SoC

These predictions are not totally made up. Anand Shimpi from AnandTech also thinks we'll see some sort of second generation Swift core from Apple, in addition to a "Rogue" series GPU. Brian Klug, also from AnandTech, thinks Samsung's 28nm process will be the process of the A7/A7X as well. Similarly, Andrew Cunningham from Ars Technica suspects we'll see a modified Swift core along with a Rogue family GPU.

CPU

Concerning Swift, there's likely more blood to squeeze from the stone when it comes to CPU architecture.
Qualcomm was able to improve their architecture in successive generations of Krait (their custom ARMv7 core), adding things like L2 pre-fetch and better branch prediction. Both of those boil down to more instructions executed at the same clock frequency (IPC, Instructions Per Clock).

The reason for comparing Krait and Swift goes beyond the fact that they are both custom designs. They share similar clock speeds and pipeline depths as well. A shorter pipeline tends to lead to slower clocked designs, but it also leads to smaller cores, as less memory is needed to store data between successive pipeline stages.

When talking about Apple's CPU efforts, it's important to remember how they reached their current level of capability. Acquiring Intrinsity and PA Semi, among others, and hiring top CPU architects from AMD shows how serious Apple is about top performing, custom CPU solutions. Apple also turned heads with their Macroscalar trademark last year, but it is unclear whether any technology related to it has come to market, or why Apple felt the need to trademark what would likely be considered a trade secret design process. Moreover, all speculation about the trademark seems to come back to established methods of branch prediction and speculative execution. It will be interesting to see if this topic comes up again at all.

For those interested in a much more technical dive into CPU speculation, I recommend visiting the beyond3d forums. Posts there go into great detail about how Apple may widen and strengthen the execution paths, arithmetic performance and branch prediction.

The 64-bit rumor

A week ago, a rumor popped up that Apple's A7 SoC would feature 64 bit CPU cores. The source, Clayton Morris, has a good enough track record for the rumor to be credible. However, I believe it to be highly unlikely that Apple's A7 will be 64 bit.

The first reason the A7 will not be 64 bit is rather simple: Apple's competition has not announced any such designs for that form factor.
Whatever technology Apple surprises us with in their SoC designs, there is usually some vendor who has already announced a part with similar features. As of yet, there are no announced parts using ARM's 64 bit cores (A57, A53) for the smartphone or tablet form factors. Designs have taped out, but they're not launching anytime soon. Nvidia's Tegra 6 "Parker" SoC will be 64 bit, but its predecessor "Logan" has yet to launch.

To put it simply, a lot of design time has to pass between ARM announcing a core (and ISA) and actual implementations reaching the market. Granted, full licensees of ARM cores such as Apple and Qualcomm have access to these cores and instruction sets before their official announcement, and the time from announcement to SoC release has been decreasing as competition stiffens in the mobile space. Even so, a design still takes time to complete.

The second reason 64 bit is unlikely is manpower. Given Swift's full custom design in the A6 and A6X, it is likely that all future Apple CPU designs will also be fully custom. For Apple to launch a fully custom 64 bit A7 this year, it would have to have been developed in parallel with the Swift cores in the A6 and A6X. This is further complicated by the fact that the A6X has many unique features over the A6 which require their own custom design. Intel, who also does full custom CPU cores, releases a new CPU architecture every Spring, and has alternating teams which handle each successive CPU design. It is not known if Apple has multiple such teams, but it is an indicator of the engineering manpower needed to achieve such a release schedule.

The last reason is that the need for 64 bit is ambiguous.
ARM's 64 bit A57 core is rated at 4.1 DMIPS/MHz (Dhrystone Millions of Instructions Per Second per Megahertz). By comparison, Qualcomm's best Krait core is rated at 3.4 DMIPS/MHz and the stock Cortex A15 at 3.5 DMIPS/MHz. I suspect the Apple A6 has a similar DMIPS/MHz score to Krait, given they have a similar number of CPU pipeline stages, so there's no question that a stock A57 implementation would be faster than the A6.

It's important to note that many have suggested that ARM's aim with the 64 bit architecture goes beyond mobile, to laptop and even server implementations. Indeed, AMD has announced a 64 bit ARM part intended for servers. There were similar claims about the A15 being intended for bigger devices, with Samsung's Exynos 5 win for the Google Chromebook as a good example. Meanwhile, Qualcomm's Krait core, which is based on the same instruction set but is simpler and smaller, has enjoyed many of the design wins in the mobile device space.

I should note that many of Apple's performance metrics are not publicly known, so it's difficult to determine the performance threshold Apple is looking for before moving to 64 bit. This similarly complicates predictions about the transition to quad core. Of course, there is also the lack of references to 64 bit in any of the iOS releases, but given that iOS is built on the same core as OS X, which is fully 64 bit, there's no doubt the foundation is there and 32 bit applications would be compatible.

Going to quad core

I have no doubt that Apple's A-series SoCs will eventually go to a quad core design. The problem is that it's difficult to tell when. Apple surely has many internal CPU profiling tools that tell them the core utilization of various bits of iOS. Indeed, references to quad core chips were showing up even in iOS 5. So Apple likely has quad core test chips whose performance they are comparing against dual core SoCs.
It is also likely that Apple has 64 bit test chips for much the same reason, which may be the source of the 64 bit rumor. The move is a complex trade-off between die space, power consumption and CPU performance. Since none of those tools or results are publicly known, it's difficult to know when performance will demand the switch. Whenever the switch comes, it has to be worth the increased die area and power consumption (the added cores will increase leakage even if they are never used). So, the best answer I can give for a prediction here is maybe.

big.LITTLE

big.LITTLE is a heterogeneous computing paradigm in which simpler, smaller cores are paired with compatible complex cores, so that the smaller cores can take over in low CPU demand scenarios to save power. This is made possible by power gating, which effectively turns off all power to a CPU core when it is not in use. Nvidia actually implemented this concept in their Tegra 3 SoC before big.LITTLE was introduced alongside the Cortex A15 core. Their low power "shadow core" was able to power up in low CPU demand scenarios and use less power than a full CPU core would to accomplish small tasks.

The issue with ARM's big.LITTLE is that it demands that the number of low power Cortex A7 cores match the number of full power A15 cores. This is why Samsung bills their Exynos 5 series as an "octa-core" SoC when in reality no more than 4 CPU cores can be in use at any one time. The fact that Samsung's flagship Galaxy S4 uses a Qualcomm SoC rather than their own Exynos SoC in North America is very telling. Adding 4 low power cores to meet the big.LITTLE architectural requirements is simply not practical from a die area point of view, and is likely one of the reasons why we've not seen such an implementation from Apple. If they were to do one, it would likely be a custom implementation that features only a single low power core, as in Nvidia's Tegra 3. Nvidia implemented the same concept in the Tegra 4 by featuring a single, low power Cortex A15 core.
However, we can be sure that Apple will do whatever optimizes their power usage, since they rarely fall prey to the "spec wars" of marketing numbers rather than user experience. Thus, it is possible that we will never see a heterogeneous CPU design from Apple.

The most damning thing with regards to this approach is that Qualcomm's similarly designed Krait doesn't employ the scheme. In a PDF whitepaper, Qualcomm details what they call asynchronous Symmetrical Multi-Processing (aSMP), which allows each CPU core to have its voltage and clock frequency scaled independently, even when another core is in use. They claim that this granularity in power usage per core obviates the need for big.LITTLE altogether. It is likely that Apple uses a similar scheme for their Swift cores. If they don't already, it is a very safe bet that it is on their near-term roadmap.

GPU

All versions of the iPhone have featured a GPU from Imagination Technologies. In fact, Apple owns around a 10% stake in them as well (an interesting side note: ImgTec acquired Caustic Graphics, a company focused on creating dedicated ray tracing hardware that was made up of former Apple engineers). It seems all but certain that Apple's A7 and A7X will feature an ImgTec graphics core as well.

All speculation surrounding the GPU assumes that the graphics cores will be from the Series 6 "Rogue" line of cores. MediaTek has announced a line of SoCs that feature Rogue cores and are launching in Q4 2013, so the timing seems feasible for Apple. ImgTec has made some wild claims about their new GPU architecture, such as that it is 20 times faster than previous cores and 5 times more efficient. The points of comparison and metrics are unknown, but when you look at the theoretical FLOPs of the new cores, it's easy to see they're a lot faster. Looking at the list of cores linked above, I tried to make a prediction that had a moderate to substantial increase in FLOPs and MTriangles/s for both the A7 and A7X.
Without knowing the size or typical power usage of any of these cores, it's difficult to guess what Apple may use. It's also important to note that all of these cores are available in multi-core variants, so future versions of A-series SoCs may have MP2-4 versions of these cores.

Each "core" is made up of a variable number of clusters, from 1 to 6, hence the G61xx to G66xx names. It is somewhat useful to compare the different sizes to the SGX 535, 543, 544 and 554 from the Series 5, as they are scaled versions of one another with respect to some aspects of the execution paths. The variants with "30" on the end as opposed to "00" have added frame buffer compression logic. This allows more data to occupy the same amount of memory, or effectively increases bandwidth by moving the same amount of data in compressed form. It seems highly likely Apple would use these variants, as they often choose larger dies when the performance or efficiency gains justify it.

Given that no G6600 exists, I've taken this as a potential clue that the next iPad will use the G6630, as it has the appearance of being tailor-made for Apple. Because of that, I've assumed that the next iPhone will use the G6430, as it keeps the iPhone within a 2x factor of iPad GPU performance (as is historically customary) and also pegs its performance above that of the iPhone 5. Given that this architecture is more efficient, we'll likely be getting more performance for a given amount of FLOPs, so don't compare these numbers directly to the iPhone 5 and iPad 4's GPUs.

It is difficult to know the target frequency Apple is aiming for on their GPUs. There is likely some performance curve that weighs more clusters against more clock speed for a given level of performance. ImgTec's reference speed for these cores is 600 MHz, so I've assumed that will be the next iPad's GPU frequency. I've also assumed that the next iPhone will be lower, somewhat closer to 400 MHz.
That said, I would not be surprised if the actual numbers were half of my guesses. Still, I expect Apple's GPU claims to land somewhere between 2x and 5x faster, depending upon how much they leverage ImgTec's big claim that the new cores can be up to 20 times faster.

Rogue cores will also bring compliance with some of the later graphics standards, such as OpenGL 4 and OpenGL ES 3. This means more advanced effects will be available to game and media developers should they choose to use them.
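
For anyone who wants to check the arithmetic behind the memory bandwidth and GPU GFlops figures in the spec lists, here is a rough sketch in Python. The function names are hypothetical, the "@ MHz" memory figure is treated as the effective transfer rate, and the 64 FLOPs per clock per cluster value is simply back-solved from the predicted 102.4 GFlops for a four cluster part at 400 MHz. It is not taken from any ImgTec documentation, so treat it as an assumption.

```python
# Rough sketch of the arithmetic behind the spec list figures.
# All names here are hypothetical helpers, not part of any real API.

def memory_bandwidth_gbs(bus_width_bits: int, transfer_rate_mhz: float) -> float:
    """Peak theoretical memory bandwidth in GB/s (1 GB = 10**9 bytes).

    Assumes the '@ MHz' figure is the effective transfer rate, so each
    'MHz' moves one bus-width of data.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mhz * 1e6 / 1e9


# Assumption: back-solved from 102.4 GFlops / (4 clusters * 400 MHz).
# This is inferred from the predictions, not a vendor-published number.
ASSUMED_FLOPS_PER_CLOCK_PER_CLUSTER = 64


def rogue_gflops(clusters: int, clock_mhz: float,
                 flops_per_clock: int = ASSUMED_FLOPS_PER_CLOCK_PER_CLUSTER) -> float:
    """Theoretical GFlops for a Rogue-style GPU with the given cluster count."""
    return clusters * flops_per_clock * clock_mhz * 1e6 / 1e9


if __name__ == "__main__":
    # A4: 64 bit interface at 400 MHz
    print(f"A4 bandwidth:  {memory_bandwidth_gbs(64, 400):.1f} GB/s")
    # A5X: 128 bit interface at 800 MHz
    print(f"A5X bandwidth: {memory_bandwidth_gbs(128, 800):.1f} GB/s")
    # Predicted A7 GPU: four clusters at 400 MHz
    print(f"G6430 @ 400 MHz: {rogue_gflops(4, 400):.1f} GFlops")
    # Predicted A7X GPU: six clusters at 400 and 600 MHz
    print(f"G6630 @ 400 MHz: {rogue_gflops(6, 400):.1f} GFlops")
    print(f"G6630 @ 600 MHz: {rogue_gflops(6, 600):.1f} GFlops")
```

Doubling the bus width or the transfer rate doubles the bandwidth, which is why the X parts with a 128 bit interface get twice the figure of their phone counterparts at the same memory speed. Likewise, GFlops scale linearly with both cluster count and clock, which is the "more clusters versus more clock speed" trade mentioned above.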