What if the NPU were removed? With the M5's addition of neural acceleration to the GPU cores, we're seeing a bit of redundancy here, aren't we? That's a bit expensive at the chip level.

I feel like it was always a bit of a band-aid solution anyway - a way to look like they had an active hardware roadmap for AI, especially on the A-series chips. But in my (limited) experience, you used either the GPU or the NPU at the app level for acceleration, and as GPU core counts increase with the Pro, Max, and Ultra, the NPU's performance benefit falls to near irrelevance.

It's possible, if any of the above holds water, that the M5 is a transition toward a full-fledged tilt at building a scalable, GPU-centric response to PyTorch/CUDA.

The NPU fulfills a different purpose. In fact, Apple Silicon (since M1) contains at least three different ML accelerators, all optimized for different use cases:

- NPU for energy-efficient ML inference accelerating common applications
- AMX/SME for programmable low-latency scientific and ML workloads on the CPU
- GPU for scalable, programmable ML (think research, development, and large models)

The GPU matrix units do not replace the NPU. The latter is designed to support common application use cases and do so with very low power consumption. Removing the NPU would actually be detrimental to the user experience.

By the way, this is also the reason why every user-facing platform contains an NPU nowadays. The GPU is a large truck - you use it to haul containers full of stuff. It’s not a practical tool for your daily commute.
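
To make the division of labor concrete, here is a minimal Core ML sketch (in Swift) of how an app expresses a compute-unit preference and lets the framework schedule work onto the ANE or GPU; the model file name is a placeholder for illustration:

```swift
import CoreML
import Foundation

// Minimal sketch: the app states a preference; Core ML decides where the work actually runs.
let config = MLModelConfiguration()

// Prefer the Neural Engine (with CPU fallback) for the low-power,
// low-latency inference typical of everyday app features.
config.computeUnits = .cpuAndNeuralEngine

// Alternatives:
// config.computeUnits = .cpuAndGPU   // throughput-oriented, larger models
// config.computeUnits = .all         // let Core ML pick CPU, GPU, or ANE per layer

do {
    // "ImageClassifier.mlmodelc" is a hypothetical compiled model path.
    let url = URL(fileURLWithPath: "ImageClassifier.mlmodelc")
    let model = try MLModel(contentsOf: url, configuration: config)
    print(model.modelDescription)
} catch {
    print("Failed to load model: \(error)")
}
```

The NPU path is something apps opt into through the framework rather than program directly, which is part of why it can quietly power everyday features at very low power.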



... and Apple was widely regarded as being flat-footed when the "AI craze" rolled in. Which only further makes the point that it was a band-aid solution for something that it was never designed for. If its machine learning functions can be readily accommodated by a more robust and capable solution, why hang on to it?

Because they did not have AI models ready, and their hardware was not suitable for scalable ML. But for on-device ML inference Apple has always been the state of the art - and still is.

My guess is that they initially thought scalable ML would remain a specialized niche, so they focused on the application side of things and decided that a generic GPU hardware solution was "good enough" for researchers and developers. They obviously underestimated the interest. Adding matrix acceleration to the GPU merely serves to address this weakness.
 
... and Apple was widely regarded as being flat-footed when the "AI craze" rolled in. Which only further makes the point that it was a band-aid solution for something that it was never designed for. If its machine learning functions can be readily accommodated by a more robust and capable solution, why hang on to it?
Common narratives get set in concrete, but often have little to do with reality. The common narrative says that Apple missed on AI, and ignores anything and everything Apple does in order to not disturb that narrative.

[Edit to add] Apple missed the early opening on LLMs, and have dropped the ball on Siri, but they have a plan. I happen to think it's a good one - one that will be great if the AI bubble bursts, as I think it will, but still good if it doesn't. The narrative has to ignore that in order to keep insisting Apple isn't doing anything.
 
So the M6 rumors have me pretty hyped. From what I'm reading, we're getting three major changes all at once:

1. TSMC's 2nm node (vs current 3nm)
2. WMCM packaging (replaces InFO - allows side-by-side or stacked chip components)
3. New architecture (rumors of modular design)

TSMC claims 2nm gives around a 15% performance boost, but when you stack all three of these changes together... could we be looking at a >50% performance improvement over the M5? That seems almost too good to be true, but the packaging + architecture changes could add a lot on top of the node shrink.
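
For a rough sense of how a >50% figure would have to come together, here is a back-of-envelope sketch that treats each change as an independent multiplicative gain - a big assumption, and the packaging and architecture percentages below are pure guesses:

```swift
import Foundation

// Back-of-envelope only: real gains rarely stack this cleanly.
let nodeGain      = 1.15  // TSMC's ~15% claim for 2nm vs 3nm
let packagingGain = 1.10  // guessed benefit from WMCM packaging
let archGain      = 1.15  // guessed benefit from the new architecture

let combined = nodeGain * packagingGain * archGain
print(String(format: "Combined uplift: ~%.0f%%", (combined - 1) * 100))
// ~45% with these guesses; all three changes would need to average
// about 15% each (1.15^3 ≈ 1.52) to clear 50%.
```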
But #2 and #3 will happen in M5 Pro/Max with the (rumored) introduction of SoIC chiplets. The (rumored) transition to WMCM is for A20/M6, not M6 Pro/Max.

So M6 Pro/Max will be a nice improvement over M5 Pro/Max, thanks to #1, but the leaps you are speculating about are from M4 Pro/Max, not M5 Pro/Max. You can’t use A19/M5 to make assumptions about M5 Pro/Max/Ultra if the rumors are accurate and it will have SoIC architecture.

The big question though:

Could WMCM packaging enable Ultra-style configurations in the MacBook Pro?

Right now Ultra is two Max dies connected together, which generates a ton of heat - fine for a desktop but impossible in a laptop. But if WMCM lets Apple arrange components more efficiently with better thermals, could we actually see something like an M6 "Ultra" variant in a 16" MBP? Or at least some kind of beefed-up configuration that wasn't thermally feasible before?

I'm probably being too optimistic here, but the new packaging seems like it opens doors that didn't exist with the current approach.

And, what kind of performance boost can we realistically be targeting with 3 major updates to the M series?

A dedicated GPU chiplet could mean 60-80 GPU cores in an M6 Max!

What do you think - realistic or just wishful thinking?
M5 Ultra will already have two of the three major updates you are talking about built into the M5 Max. TSMC has made it very clear that SoIC is designed to work with UltraFusion.

I’ll guess M5 Max itself will likely incorporate the kinds of improvements you are talking about. It will be like the M3 Max, with two tiers, but the tiers will be about different SoIC integrations, not just binning.

In short, we won’t need to wait for A20/M6 to get a sense of what Apple might do with chiplets. It’s coming in M5 Pro/Max/Ultra.
 
... and Apple was widely regarded as being flat-footed when the "AI craze" rolled in. Which only further makes the point that it was a band-aid solution for something that it was never designed for. If its machine learning functions can be readily accommodated by a more robust and capable solution, why hang on to it?
Apple was caught flat-footed because they didn't have any answers to frontier models, which need to be trained and run in the cloud. It had nothing to do with the ANE or GPU. Apple is still way, way behind, by the way.

I don't think Apple will get rid of the ANE, since it's very useful in the low-latency, lower-power scenarios that the iPhone and iPad require. When Apple uses the ANE in their mobile applications, it will also be used in the desktop versions when they port them over. Applications just assume that there is always an ANE, whether it's on iPhone, Mac, or Apple TV.

However, it remains to be seen how Apple will approach the ANE and GPU in the LLM era. GPUs are better at running LLMs, but Apple needs to make local LLMs work well on an iPhone first and foremost. Which one will they spend more silicon on?

On desktops, it makes sense that it is the GPU since power is less of a concern. On mobile, I'm not convinced that GPUs are the way to go. We could see a divergence where the desktops get bigger and bigger GPUs and the mobile chips get bigger and bigger ANE.
 
Apple was caught flat-footed because they didn't have any answers to frontier models, which need to be trained and run in the cloud. It had nothing to do with the ANE or GPU. Apple is still way, way behind, by the way.

I don't think Apple will get rid of the ANE, since it's very useful in the low-latency, lower-power scenarios that the iPhone and iPad require. When Apple uses the ANE in their mobile applications, it will also be used in the desktop versions when they port them over. Applications just assume that there is always an ANE, whether it's on iPhone, Mac, or Apple TV.

However, it remains to be seen how Apple will approach the ANE and GPU in the LLM era. GPUs are better at running LLMs, but Apple needs to make local LLMs work well on an iPhone first and foremost. Which one will they spend more silicon on?

On desktops, it makes sense that it is the GPU since power is less of a concern. On mobile, I'm not convinced that GPUs are the way to go. We could see a divergence where the desktops get bigger and bigger GPUs and the mobile chips get bigger and bigger ANE.

I think power is less of a concern overall with both A- and M-series silicon. Given that the base M4 MacBook Air outperforms the last (and top-specced) Intel i9 16" MBP without fans and while using significantly less power, power consumption is not nearly as big a concern as it would be with x86 systems.

 
The application of these improvements will be done in service of a longer term roadmap. There are relatively clear trendlines in Apple Silicon performance. They'll seek to maintain those trends. That's not to say there won't be some discontinuities.
 
I mean, if you have the M1 Pro, why go to the M6 and not the M6 Pro?
Maybe they/you are talking about the M6 family in general; if that's the case, then ignore my post.
 
I mean, if you have the M1 Pro, why go to the M6 and not the M6 Pro?
Maybe they/you are talking about the M6 family in general; if that's the case, then ignore my post.

Oh, yeah, I'm going for a Max from the family (unless I change my mind and go Pro again).
 
But #2 and #3 will happen in M5 Pro/Max with the (rumored) introduction of SoIC chiplets. The (rumored) transition to WMCM is for A20/M6, not M6 Pro/Max.

So M6 Pro/Max will be a nice improvement over M5 Pro/Max, thanks to #1, but the leaps you are speculating about are from M4 Pro/Max, not M5 Pro/Max. You can’t use A19/M5 to make assumptions about M5 Pro/Max/Ultra if the rumors are accurate and it will have SoIC architecture.


M5 Ultra will already have two of the three major updates you are talking about built into the M5 Max. TSMC has made it very clear that SoIC is designed to work with UltraFusion.

I’ll guess M5 Max itself will likely incorporate the kinds of improvements you are talking about. It will be like the M3 Max, with two tiers, but the tiers will be about different SoIC integrations, not just binning.

In short, we won’t need to wait for A20/M6 to get a sense of what Apple might do with chiplets. It’s coming in M5 Pro/Max/Ultra.
Hi, I think you're correct on the SoIC thing, but WMCM and GAAFET are 2nm/M6 features.

I'm seeing now that we could get a chiplet architecture for the M5 Ultra, because that interconnect would be kicking it old school at this point.
 
Hi, I think you're correct on the SoIC thing, but WMCM and GAAFET are 2nm/M6 features.

I'm seeing now that we could get a chiplet architecture for the M5 Ultra, because that interconnect would be kicking it old school at this point.

What the rumors are pointing towards for M5 Pro/Max are designs that share some commonalities with AMD's CPUs, especially the X3D series of chips. With X3D, additional cache memory (3D V-Cache) is stacked on the die. In fact, recent rumors point towards a new X3D part that would have 3D V-Cache on both CCDs, instead of just one as all current and previous X3D parts have done. Any Ryzen CPU with 12 or more cores runs dual CCDs, which is the closest analog on the x86 side to Apple Silicon (especially the Ultra variants) in terms of modularity in design.

Since TSMC builds these Ryzen CPUs as well as Apple Silicon, they already have experience in stacking components on the die, which would benefit Apple greatly if the rumors are accurate. Regardless of the underlying instruction sets, the differences between fabricating ARM and X86 chips are relatively minor and based on physical layouts and how the chips connect to the system board rather than fundamental differences in the materials themselves.
 
why not go from M1 Pro to M6 Pro?!

I'm fine using the M1 Pro right now. I think that the M4 Air has better compute than my M1 Pro, so there's a case to be made for the base M6. What I really want, though, is an OLED display, preferably 4K, but Apple doesn't do 4K. Face ID would be really nice, as Touch ID doesn't work very well for me in the winter. So non-CPU-related stuff is a bigger deal for me than the CPU. I mainly want to upgrade to get a lighter and smaller laptop than my 16, and even an M6 Air 13 would be on the table.

I have a medical issue right now and have to be really careful about how much weight I lift on the right side of my body, so I'm grabbing my 3-pound Lenovo Yoga most of the time instead of my M1 Pro 16.
 
What do they mean by a modular design?
In the context of silicon architecture, the OP means modular elements that are combined using various packaging technologies.

The OP was referring to WMCM (wafer-level multi-chip module), which is the next generation of TSMC's InFO-PoP packaging. It is more flexible, and the improvements should apply both to SoCs like A20/M6 and to SoICs like M6 Pro/Max (assuming the M5 Pro/Max SoIC rumor is accurate).
 
In the context of silicon architecture, the OP means modular elements that are combined using various packaging technologies.

The OP was referring to WMCM (wafer-level multi-chip module), which is the next generation of TSMC's InFO-PoP packaging. It is more flexible, and the improvements should apply both to SoCs like A20/M6 and to SoICs like M6 Pro/Max (assuming the M5 Pro/Max SoIC rumor is accurate).
Do you think Apple would use WMCM to jigsaw a phone-chip 'module' together with an M-class display engine 'module' for the cheap MacBook?
 
Do you think Apple would use WMCM to jigsaw a phone-chip 'module' together with an M-class display engine 'module' for the cheap MacBook?
I don't think there is a whole lot known about the specifics of WMCM — it's all speculation based on the name; all that is really known is that it will replace InFO-PoP.
 
Do you think Apple would use WMCM to jigsaw a phone-chip 'module' together with an M-class display engine 'module' for the cheap MacBook?

Published Apple patents describe combining silicon dies manufactured at different processes to save costs and increase the effective transistor budget while keeping overall area small. I’d guess that auxiliary circuitry like the display engine, along with other I/O and maybe caches, would be a prime candidate for offloading to a second “cheap” die, while high-performance logic can remain on the cutting edge process.

As to what they actually intend to do, who knows, really.
 
Rather than just speculate on CPU/GPU figures as a progression from M5 to M6, I'd be more interested in knowing, or finding out eventually, how much LLMs and AI have influenced the M-series silicon roadmap.

A couple of tech YouTubers were given four loaded Mac Studios by Apple to let them play with the RDMA-over-Thunderbolt support just released in macOS 26.2.
The Mac Studio, with its unified memory and the M3 Ultra's cluster of GPU cores, makes for a very compelling hardware platform for running the largest (700-billion-parameter) LLMs available if your requirement is a locally hosted LLM.
The update supporting RDMA would indicate that Apple is paying attention to this, which is unusual for a consumer-focused company.

They aren't taking on Nvidia; that's impossible given the massive head start Nvidia has on the software for building LLMs. But once an LLM is created, hosting it is where it gets interesting for Apple.

It's all about bandwidth rather than raw compute speed. So that's probably already a focus on the M roadmap.

Apple's move to unified memory gave them a huge leg up, intended or accidental, that makes Mac Studios probably the most cost-effective, and certainly the most power-efficient, hardware platform for locally hosted inference machines.
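
To put rough numbers on the bandwidth point: for a dense model, generating each token streams approximately the whole quantized weight set from memory, so the ceiling on single-stream decode speed is roughly bandwidth divided by model footprint. The figures below are illustrative assumptions, not benchmarks:

```swift
import Foundation

// Rough single-stream decode ceiling for a dense LLM on one machine:
// each generated token reads (approximately) every weight once.
let params        = 700e9   // ~700B parameters
let bytesPerParam = 0.5     // assuming 4-bit quantization
let modelBytes    = params * bytesPerParam   // ≈ 350 GB of weights

let bandwidth     = 800e9   // assumed ~800 GB/s, roughly M3 Ultra-class memory bandwidth

let tokensPerSec  = bandwidth / modelBytes   // ≈ 2.3 tokens/s, ignoring all overheads
print(String(format: "~%.0f GB of weights, ~%.1f tok/s ceiling", modelBytes / 1e9, tokensPerSec))
```

Clustering over RDMA helps with fitting the model in memory and with batch throughput, but that per-token bandwidth ceiling is why the memory system, rather than raw compute, is the headline spec for this use case (mixture-of-experts models activate only a fraction of their weights per token, so real-world numbers can be considerably better).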
 
A dedicated GPU of any sort would mean Apple turning its back on years of existing practice.
They really need to if they want to compete with modern GPUs on any level. And considering how important Apple thinks AI is gonna be (we'll see, haha), they might end up doing it. Which could be good news for users, since we could do things like play the latest games.
 
A couple of tech YouTubers were given four loaded Mac Studios by Apple to let them play with the RDMA-over-Thunderbolt support just released in macOS 26.2.
The Mac Studio, with its unified memory and the M3 Ultra's cluster of GPU cores, makes for a very compelling hardware platform for running the largest (700-billion-parameter) LLMs available if your requirement is a locally hosted LLM.
The update supporting RDMA would indicate that Apple is paying attention to this, which is unusual for a consumer-focused company.
How many Mac Studios would you need to connect that?
 
They really need to if they want to compete with modern GPUs on any level.

Apple just gets their reviewer/influencer/forum friends to talk about how wasteful high-performance GPUs are, then make the following points:

- efficient performance
- "actual" pros don't need that power, they are just cashed-up enthusiasts (what they said of Mac Pro users)

etc...
 
They really need to if they want to compete with modern GPUs on any level.

No they don’t. Why do you people keep up this nonsense?

And considering how important Apple thinks AI is gonna be (we'll see, haha), they might end up doing it.

Hardly. We’ve already seen their response with M5. dGPUs didn’t factor into this at all.

Which could be good news for users, since we could do things like play the latest games.

Playing the latest games depends on a lot more than dGPUs being available.
 