As for photo and video processing, Photoshop does have a lot of legacy cruft that doesn’t exist in more modern applications. And certainly video is unlikely to be bound by integer CPU ops in any modern software.
It's not just Photoshop. Final Cut Pro also makes heavy use of CPUs. Indeed, it's precisely for this reason that Apple offers the Afterburner card, to offload some of that computational workload from the CPUs:
https://support.apple.com/en-us/HT210748
Likewise, CPU computation is also important for Avid Media Composer (a standard within the industry):
http://www.avidblogs.com/how-avid-media-composer-uses-a-computer/
"Playing codecs smoothly in a timeline requires processing, which benefits from more CPU cores. Codecs with large raster sizes also benefit from more cores. Playing a timeline that contains Linked (AMA) clips plus many effects results in a higher stream count, especially with more complex codecs. With more cores there are more opportunities to distribute the processing of those streams effectively."
Synthetic benchmarks are actually quite useful for understanding real-world performance, so long as you understand what the synthetic benchmarks measure and how the real world works. That's why I am curious about the “real” workloads you refer to.
We know the clock rate and the IPC for Apple’s chips, so we know quite a bit. Believe it or not, when we sit down to figure out the architecture for a new x86-64 chip, we rely on synthetic benchmarks to model performance as well. And real-world performance seldom differs much from our predictions.
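To make that concrete: to first order, per-core performance scales as IPC × clock rate. Here is a minimal sketch of that model in Wolfram Language (chosen since this thread is about Mathematica; the IPC and clock figures below are made up purely for illustration, not actual chip data):

    (* first-order model: per-core throughput ~ IPC x clock rate (GHz) *)
    relativePerformance[ipc1_, clock1_, ipc2_, clock2_] :=
      (ipc1*clock1)/(ipc2*clock2)

    (* hypothetical chips: 3.0 IPC at 2.5 GHz vs 1.0 IPC at 1.5 GHz *)
    relativePerformance[3.0, 2.5, 1.0, 1.5]  (* -> 5., i.e. ~5x per core *)

This ignores memory hierarchy, branch prediction, and workload mix, which is exactly why the synthetic benchmarks have to be chosen to resemble the target workload.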
That may be true but, again, you're making claims here without providing the data to support them. In my work I often encounter technical experts who make bold claims that don't hold up when I investigate them critically. It's not that they're unskilled; it's just that even experts get things wrong. For instance, you yourself earlier made a bold claim (that the CPU isn't important for photo and video processing) which, when I investigated it, turned out to be problematic. So it's possible that what you believe about ARM performance might also not fully hold. That's why I keep asking for data!
Further, from my work with databases, I know that benchmarks can be notoriously misleading unless they model what you are actually doing. So, since you seem to have internal access to such things, might you be able to, say, tell me what the WolframMark score would be for Mathematica if it were running on a modern ARM processor?
Here's all I have on ARM performance for Mathematica, which only allows for a crude estimate of how it might run on an ARM chip as fast as the iPad Pro's:
Wolfram has spent over 5 years optimizing Mathematica for the Raspberry Pi. The Raspberry Pi 4, which uses a "Broadcom BCM2711 SoC with a 1.5 GHz 64-bit quad-core ARM Cortex-A72 processor, with 1MB shared L2 cache" (https://en.wikipedia.org/wiki/Raspberry_Pi#Processor), running the latest version of Mathematica, takes 77.8 seconds to complete the 15 tests in the WolframMark benchmark (https://blog.wolfram.com/2019/07/11/mathematica-12-available-on-the-new-raspberry-pi-4/).
My 2014 MacBook Pro (also quad-core) takes 6.68 seconds, i.e. 11.6× faster. So if other synthetic benchmarks indicate that the per-core speed of the ARM chip in the iPad Pro is ~11–12 times that of the ARM chip in the Raspberry Pi, then one could perhaps guess that its per-core speed for Mathematica is comparable to that of my 2014 laptop. Of course, a direct benchmark would be far preferable.
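For anyone who wants to check the arithmetic or collect their own data point, here is a short Wolfram Language sketch; I believe the Benchmarking` package ships with Mathematica, though the exact output format may vary by version:

    (* speedup of the 2014 MacBook Pro over the Raspberry Pi 4 on WolframMark *)
    77.8/6.68          (* -> 11.6467, i.e. ~11.6x *)

    (* run WolframMark on your own machine *)
    Needs["Benchmarking`"]
    Benchmark[]        (* runs the 15 WolframMark tests and returns the score *)
    BenchmarkReport[]  (* compares your machine against Wolfram's reference systems *)

A WolframMark score posted from a recent high-end ARM machine would settle the question directly, instead of this extrapolation.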