What if the NPU were removed? With the M5's addition of neural acceleration to the GPU cores, we're seeing a bit of redundancy here, aren't we? That's a bit expensive at the chip level.

I feel like it was always a bit of a band-aid solution anyway - a way to look like they had an active hardware roadmap for AI, especially on the A-series chips. But in my (limited) experience, you either used the GPU or the NPU at the app level for acceleration, and as GPU core counts increase with the Pro, Max and Ultra, the NPU's benefit falls behind to the point of near irrelevance.

It's possible, if any of the above holds water, that the M5 is a transition to a full-fledged tilt at building a scalable, GPU-centric response to PyTorch/CUDA.

The NPU fulfills a different purpose. In fact, Apple Silicon (since M1) contains at least three different ML accelerators, all optimized for different use cases:

- NPU for energy-efficient ML inference accelerating common applications
- AMX/SME for programmable low-latency scientific and ML workloads on the CPU
- GPU for scalable, programmable ML (think research, development, and large models)

The GPU matrix units do not replace the NPU. The latter is designed to support common application use cases and to do so with very low power consumption. Removing the NPU would actually be detrimental to the user experience.

By the way, this is also the reason why every user-facing platform contains an NPU nowadays. The GPU is a large truck - you use it to haul containers full of stuff. It’s not a practical tool for your daily commute.
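
To make the split concrete, here's a minimal Core ML sketch of how an app steers inference toward the Neural Engine or the GPU. The compute-unit options are real Core ML settings; the model name at the end is just a hypothetical placeholder.

```swift
import CoreML

// Prefer the Neural Engine (with CPU fallback) for everyday,
// energy-efficient inference inside an app.
let aneConfig = MLModelConfiguration()
aneConfig.computeUnits = .cpuAndNeuralEngine

// Prefer the GPU for large, throughput-oriented models where
// power draw matters less than raw compute.
let gpuConfig = MLModelConfiguration()
gpuConfig.computeUnits = .cpuAndGPU

// "MyModel" is a hypothetical compiled Core ML model class:
// let model = try MyModel(configuration: aneConfig)
```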



... and Apple was widely regarded as being flat-footed when the "AI craze" rolled in. Which only further makes the point that it was a band-aid solution for something that it was never designed for. If its machine learning functions can be readily accommodated by a more robust and capable solution, why hang on to it?

Because they did not have AI models ready, and their hardware was not suitable for scalable ML. But for on-device ML inference Apple has always been the state of the art - and still is.

My guess is that they initially thought scalable ML would remain a specialized niche, so they focused on the application side of things and decided that a generic GPU hardware solution was “good enough” for researchers and developers. They obviously underestimated the interest. Adding matrix acceleration to the GPU merely serves to address this weakness.
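
As a rough illustration of what "programmable ML on the GPU" means in practice, here is a minimal MPSGraph sketch of a matrix multiply, the kind of primitive those GPU matrix units accelerate. The shapes are arbitrary numbers chosen for the example.

```swift
import MetalPerformanceShadersGraph

// Build a tiny compute graph: C = A x B.
// Shapes are arbitrary, purely for illustration.
let graph = MPSGraph()
let a = graph.placeholder(shape: [128, 256], dataType: .float32, name: "A")
let b = graph.placeholder(shape: [256, 64], dataType: .float32, name: "B")
let c = graph.matrixMultiplication(primary: a, secondary: b, name: "C")

// Executing the graph (feeding MPSGraphTensorData and reading back "c")
// is omitted here; see MPSGraph.run(feeds:targetTensors:targetOperations:).
```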
 
... and Apple was widely regarded as being flat-footed when the "AI craze" rolled in. Which only further makes the point that it was a band-aid solution for something that it was never designed for. If its machine learning functions can be readily accommodated by a more robust and capable solution, why hang on to it?
Common narratives get set in concrete, but often have little to do with reality. The common narrative says that Apple missed on AI, and ignores anything and everything Apple does in order to not disturb that narrative.

[Edit to add] Apple missed the early opening on LLMs, and they have dropped the ball on Siri, but they have a plan. I happen to think it's a good plan, one that will be great if the AI bubble bursts as I expect it will, and still a good one if it doesn't. The narrative has to ignore that in order to keep insisting Apple isn't doing anything.
 
So the M6 rumors have me pretty hyped. From what I'm reading, we're getting three major changes all at once:

1. TSMC's 2nm node (vs current 3nm)
2. WMCM packaging (replaces InFO - allows side-by-side or stacked chip components)
3. New architecture (rumors of modular design)

TSMC claims 2nm gives around a 15% performance boost, but when you stack all three of these changes together... could we be looking at a >50% performance improvement over the M5? That seems almost too good to be true, but the packaging and architecture changes could add a lot on top of the node shrink.
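
For what it's worth, here's a back-of-the-envelope sketch of where a >50% figure could come from, assuming (purely for illustration) that each of the three changes contributed a similar gain and that the gains compound:

```swift
// Illustrative only: if the node shrink, packaging change, and new
// architecture each contributed ~15% (made-up, equal numbers), the
// gains would compound multiplicatively rather than add.
let node = 1.15, packaging = 1.15, architecture = 1.15
let combined = node * packaging * architecture
print(combined)  // ≈ 1.52, i.e. roughly a 52% uplift over the baseline
```
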
But #2 and #3 will happen in M5 Pro/Max with the (rumored) introduction of SoIC chiplets. The (rumored) transition to WMCM is for A20/M6, not M6 Pro/Max.

So M6 Pro/Max will be a nice improvement over M5 Pro/Max, thanks to #1, but the leaps you are speculating about are from M4 Pro/Max, not M5 Pro/Max. You can’t use A19/M5 to make assumptions about M5 Pro/Max/Ultra if the rumors are accurate and it will have SoIC architecture.

The big question though:

Could WMCM packaging enable Ultra-style configurations in the MacBook Pro?

Right now Ultra is two Max dies connected together, which generates a ton of heat - fine for a desktop but impossible in a laptop. But if WMCM lets Apple arrange components more efficiently with better thermals, could we actually see something like an M6 "Ultra" variant in a 16" MBP? Or at least some kind of beefed-up configuration that wasn't thermally feasible before?

I'm probably being too optimistic here, but the new packaging seems like it opens doors that didn't exist with the current approach.

And what kind of performance boost can we realistically expect from three major updates to the M series?

A dedicated GPU chiplet could mean 60-80 GPU cores in an M6 Max!

What do you think - realistic or just wishful thinking?
M5 Ultra will already have two of the three major updates you are talking about built into the M5 Max. TSMC has made it very clear that SoIC is designed to work with UltraFusion.

My guess is that the M5 Max itself will incorporate the kinds of improvements you are talking about. It will be like the M3 Max, with two tiers, but the tiers will be about different SoIC integrations, not just binning.

In short, we won’t need to wait for A20/M6 to get a sense of what Apple might do with chiplets. It’s coming in M5 Pro/Max/Ultra.
 
... and Apple was widely regarded as being flat-footed when the "AI craze" rolled in. Which only further makes the point that it was a band-aid solution for something that it was never designed for. If its machine learning functions can be readily accommodated by a more robust and capable solution, why hang on to it?
Apple was caught flat-footed because they didn't have any answers to frontier models, which need to be trained and run in the cloud. It had nothing to do with the ANE or GPU. Apple is still way, way behind, by the way.

I don't think Apple will get rid of the ANE, since it's very useful in the low-latency, lower-power scenarios that the iPhone and iPad require. When Apple uses the ANE in its mobile applications, the desktop versions will use it too when those apps are ported over. Applications just assume that there is always an ANE, whether they're running on iPhone, Mac, or Apple TV.
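
As a side note, recent Core ML releases do let an app check that assumption explicitly rather than just relying on it. A minimal sketch, assuming MLModel.availableComputeDevices (the compute-device query added in recent OS versions) is the right API for this:

```swift
import CoreML

// Ask Core ML which compute devices this machine exposes, and check
// whether one of them is a Neural Engine. Treat this as a sketch of
// the idea rather than production availability handling.
let hasANE = MLModel.availableComputeDevices.contains { device in
    if case .neuralEngine = device { return true }
    return false
}
print(hasANE ? "Neural Engine present" : "No Neural Engine on this device")
```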

However, it remains to be seen how Apple will approach the ANE and GPU in the LLM era. GPUs are better at running LLMs, but Apple needs to make local LLMs work well on an iPhone first and foremost. Which one will they spend more silicon on?

On desktops, it makes sense that it is the GPU since power is less of a concern. On mobile, I'm not convinced that GPUs are the way to go. We could see a divergence where the desktops get bigger and bigger GPUs and the mobile chips get bigger and bigger ANE.
 

I think power is less of a concern overall with both A- and M-series silicon. Given that the base M4 MacBook Air outperforms the last (and top-specced) Intel i9 16" MBP without fans, while using significantly less power, power consumption is not nearly as big a concern as it would be with x86 systems.

 