2cents:
About M5 being no big deal because it's still N3P (which is still an improvement over N3E): there have been several whisperings about Apple going the vertical stacking route - much like AMD did with their X3D CPU dies. I have good reason to believe "M5" will basically be a refined M4 with a huge amount of system-level cache bolted on top of it (maybe under it - but I doubt that, since Apple still runs very, very modest frequencies) - and decidedly not with "a lot more" GPU/NPU logic on top of the base die, as many have suggested. Reasons: a) that would fragment logic groups, with all the disadvantages we saw in the M1/M2 Ultra; b) it would make the TSVs really, really complex and hence the chip even more expensive; and c) it would quickly lead to bandwidth starvation unless they dramatically expanded the memory subsystem - even IF they used LPDDR6, which isn't a given - and hence make the package really expensive.
It maybe doesn't sound as exciting as doubling the GPU cores - but consider this: AMD fits 64 MB of V-Cache onto roughly half the ~70 mm² a Zen 4 CCD measures. Since the M4 Max is around 500 mm², half a gig of "additional" cache isn't far-fetched. Heck, even a full gig wouldn't be. And I wouldn't put it past Apple to expose this memory either as an "L4" cache or as dedicated memory, depending on what they need. They have full control over the OS and every system-level API that would need to implement this and be aware of it.
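Quick napkin math on that, if you want it spelled out (the Zen 4 numbers are ballpark and the 500 mm² M4 Max figure is an estimate, so treat this as an order-of-magnitude sketch, not a spec):

```python
# Back-of-the-envelope SRAM scaling, assuming the figures above are roughly right.
# AMD's 3D V-Cache: 64 MB sitting on roughly half of a ~70 mm^2 Zen 4 CCD.
vcache_mb = 64
vcache_area_mm2 = 70 / 2                            # ~35 mm^2 of stacked SRAM footprint

density_mb_per_mm2 = vcache_mb / vcache_area_mm2    # ~1.8 MB per mm^2

# Hypothetical: cover an M4-Max-sized (~500 mm^2, estimated) base die with the same SRAM.
m4_max_area_mm2 = 500
potential_cache_mb = density_mb_per_mm2 * m4_max_area_mm2

print(f"~{potential_cache_mb:.0f} MB")              # ~914 MB - so 512 MB is the conservative case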
So let's just assume the M5 Max has 512 MB of whatever fancy marketing mumbo-jumbo Apple is going to name it. That's enough to run entire system-level processes in it, drastically reducing power draw - especially if those are tied to the E-cores - or to run entire compute threads right on the silicon at minimal latency. That's enough to keep a large portion of recurring textures, or basically the entire scene geometry, right on the chip, freeing up crucial bandwidth for the GPU and/or doing really well with ray tracing. But most importantly: that's enough to run entire layers of quantized AI models right on the silicon, without them ever touching system memory - again alleviating bandwidth limitations while offering extremely low latency. I can't overstate what that would mean for AI performance. Which would absolutely fit what Apple presumably intends to use this hypothetical thing for, aside from putting it in stupidly fast small-form-factor workstations: AI servers.
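To put rough numbers on "entire layers" (the model sizes and quantization levels here are purely illustrative assumptions - nothing Apple has announced):

```python
# Rough per-layer weight footprint of a quantized transformer, to sanity-check
# whether whole layers could live in a hypothetical 512 MB on-package cache.
# All model figures below are illustrative assumptions.

def layer_mb(total_params_b: float, num_layers: int, bits: int) -> float:
    """Approximate weight footprint of one layer, in MB."""
    params_per_layer = total_params_b * 1e9 / num_layers
    return params_per_layer * bits / 8 / 1e6

# e.g. a 7B-parameter model with 32 layers at 4-bit quantization:
print(f"{layer_mb(7, 32, 4):.0f} MB per layer")   # ~109 MB -> 4-5 layers fit in 512 MB
# or a 70B model with 80 layers at 4-bit:
print(f"{layer_mb(70, 80, 4):.0f} MB per layer")  # ~438 MB -> still one full layer
```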
If all this were true ... it would be a rather expensive chip, but also much more than a mere node shrink. It would be the biggest thing since the M1. However: I don't think there will be a base M5. Vertical stacking is still pretty damn expensive packaging, and I don't see a way to improve the M4 meaningfully while staying on what is technically the same node. Frankly, I'm not even sure there will be an M5 Pro. I wouldn't be surprised if "M5" just comes as M5 Max, M5 Ultra and ... yeah, who knows. If they've now figured out how to wire four M5 Max dies together over something like Infinity Fabric in a 1-away configuration (basically like the original EPYC Naples, aka the 7001 series) at low latency AND solve the NUMA problem (which they can since, again, they control the OS) ... the Mac Pro could actually deserve the name again, and maybe even take the fight to Nvidia's AI workstations.
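For anyone unfamiliar with "1-away": with four dies you give every pair its own dedicated link, so everything is exactly one hop from everything else - that's what Naples did with its four Zeppelin dies. A toy sketch (die names and link model are made up for illustration):

```python
from itertools import combinations

# Toy model of a fully connected 4-die package ("1-away", Naples-style):
# one dedicated link per pair of dies, so the worst case is a single hop.
dies = ["Max0", "Max1", "Max2", "Max3"]
links = set(combinations(dies, 2))

def hops(a: str, b: str) -> int:
    if a == b:
        return 0
    return 1 if (a, b) in links or (b, a) in links else 2  # 2 = would need routing

worst = max(hops(a, b) for a in dies for b in dies)
print(len(links), "links, worst-case", worst, "hop(s)")    # 6 links, worst-case 1 hop(s)
```

Six links for four dies is still manageable; the hard part is the NUMA side - each die keeps its own local memory, so the scheduler has to place threads next to their data - and that's exactly the kind of thing controlling the entire OS stack lets you do.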