Surely sticking to 8P cores and just adding 2E cores each generation isn't sustainable. Especially since many pro apps, like Logic Pro, don't even use the E cores.

If the M3 Pro can get to 10P cores, at least it gives some of us M1 Pro users a reason to upgrade.

Otherwise it's likely to be another year-on-year decrease in sales, isn't it?
Absolutely true, which is why that 6:6:18 (p:e:gpu) data point was surprising.

The other unknown, of course, is "how much faster are the M3 P-cores, E-cores, and GPUs vs the M1's?" Without knowing that, it is hard to judge.

I would assume Apple's goal is performance around 30-50% faster than the M1 Pro's, and that, beyond any other cool new functionality, this will be sufficient to inspire upgrades?
 
Even if we are talking about shrinking the M2 cores to N3, I very much doubt that the density improvements alone would allow you to double the clusters at a comparable die size. But surely the new chips will come with new features that need more space as well. Not to mention that more GPU cores would need a larger on-chip network, more cache, and more RAM bandwidth to support them.
If Apple can afford to stick with the M2 die size rather than the M1's, it is possible to pack 6P/6E + 18 GPU cores and another 64-bit memory controller into a 140-150 mm2 die.
Surely sticking to 8P cores and just adding 2E cores each generation isn't sustainable. Especially since many pro apps, like Logic Pro, don't even use the E cores.

If the M3 Pro can get to 10P cores, at least it gives some of us M1 Pro users a reason to upgrade.

Otherwise it's likely to be another year-on-year decrease in sales, isn't it?
Unless the leaked specs are not for a Pro chip, but for the basic M3.
 
If Apple can afford to stick with the M2 die size rather than the M1's, it is possible to pack 6P/6E + 18 GPU cores and another 64-bit memory controller into a 140-150 mm2 die.

Can’t say I share your optimism. But who knows, never say never. They could also redesign their GPU to improve logic density (e.g. packing GPU cores into clusters that share fixed function and coprocessors), which could allow them to better compete with Nvidia.
 
This has been happening to me off & on since I got this M1 MBP in late 2020. It's happened on Big Sur, Monterey (mostly), and Ventura (not as much).



No, not M1-specific. Memory leaks have been a thing for quite some time. Here's a good article that explains them in great detail:



That's the big question! In @TheYayAreaLiving 🎗️'s case, I don't think Mozilla can "fix" that. And in my case, it's mostly been Apple's own Music app with the issue. If a first-party app STILL has this issue after all these years, I would expect a third-party app to have it at some point, too.


Computers are hard, I guess?

¯\_(ツ)_/¯

Much appreciated.

Computers are hard.

I had thought macOS, ever since Puma, would resolve issues where RAM usage ran amok in applications; guess I was wrong.
 
I do believe there were a few ARM-specific memory leaks in macOS Big Sur. Most of those seem resolved by now.

As for Firefox, it's hard to say who's at fault without more context.
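
If anyone wants data before pointing fingers, one low-tech approach is to log the suspect app's resident memory over time and see whether it only ever grows while the app sits idle. A minimal sketch in Python, assuming the third-party psutil package is installed and that matching on the process name is good enough (both are assumptions, adjust to taste):

Code:
import time
import psutil  # third-party: pip install psutil

PROCESS_NAME = "firefox"  # substring to match; change this to whatever app you suspect

def rss_mb(name):
    """Total resident memory (MB) of all processes whose name contains `name`."""
    total = 0
    for proc in psutil.process_iter(["name", "memory_info"]):
        try:
            if name.lower() in (proc.info["name"] or "").lower():
                total += proc.info["memory_info"].rss
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue
    return total / (1024 ** 2)

# Poll once a minute; steady growth while the app is sitting idle is the classic leak signature.
while True:
    print(f"{time.strftime('%H:%M:%S')}  {PROCESS_NAME}: {rss_mb(PROCESS_NAME):.0f} MB")
    time.sleep(60)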
 
The M1 Pro, with 10 CPU cores (8P + 2E) and 16 GPU cores, used 33.7 bln transistors with a 256-bit bus.

For an M3 to get 6P/6E, a 192-bit bus, and 18 GPU cores, you need around 30 bln transistors.

The maximum transistor density possible on N3 is 270 mln/mm2.

Apple usually achieves an average of 75-80% of the maximum density on any node.

If the M3 follows the M2 in die size - 150 mm2 - Apple needs to achieve a density of 200 mln transistors/mm2.

If Apple, however, was able to achieve even higher density, then shipping the leaked SoC specs as the plain M3 is very simple and beneficial.
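
For anyone who wants to rerun that napkin math, here is a tiny sketch. The inputs are just the rough figures quoted above (a guessed transistor count, an assumed die size, the quoted N3 ceiling), not measured values:

Code:
# What density does a ~30 bln transistor M3 need on a ~150 mm^2 die?
target_transistors = 30e9   # rough guess for 6P/6E, 18 GPU cores, 192-bit bus
die_area_mm2 = 150          # assume the M3 stays around the M2's die size
n3_max_density = 270e6      # quoted theoretical N3 ceiling, transistors per mm^2

required_density = target_transistors / die_area_mm2
print(f"required density: {required_density / 1e6:.0f} mln transistors/mm^2")  # ~200
print(f"share of the N3 ceiling: {required_density / n3_max_density:.0%}")     # ~74%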


Here is a very good article pointing out not only the density of the M2, but also how much space (and how many transistors) each CPU core uses, on both the P-core and E-core front.

So yes, all things considered from a technical point of view, it is very much possible that the chip leaked by Gurman is the plain M3, not the M3 Pro.

The only two things we don't know are the M3's die size and its transistor density, but based on what is possible on N3E, and what Apple has historically achieved, we can calculate what would be required for it to be possible.

Is it really the M3? We will see. Considering the timing of the leak, and considering what Apple is working on in the first place - an M3-based iMac, MacBook Air, and new entry-level MacBook Pro (coincidentally, the leak comes from an engineering sample of a MacBook Pro) - I think the highest chance is that it is indeed the plain M3.

How would the base model look?

I think 18 GB of RAM on a 192-bit bus, 6P/4E, and 16 GPU cores. Looks familiar?

That's almost 1:1 the specs of the M2 Pro chip. Why would Apple do this? To give even recent buyers a reason to upgrade, and to push for bigger recognition of their hardware, especially in ... gaming.
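
On the memory side, the 192-bit guess is easy to sanity-check: 192 bits is three 64-bit LPDDR channels, so three 6 GB packages give exactly 18 GB (and three 12 GB packages give the 36 GB in the leak). A quick bandwidth sketch, assuming the same LPDDR5-6400 Apple uses across the M2 family (an assumption, not something from the leak):

Code:
# Peak bandwidth for a few bus widths, assuming LPDDR5-6400 (6400 MT/s per pin).
MT_PER_S = 6400e6

for label, bus_bits in (("M2, 128-bit", 128), ("rumoured 192-bit", 192), ("M2 Pro, 256-bit", 256)):
    gb_per_s = bus_bits * MT_PER_S / 8 / 1e9
    print(f"{label}: {gb_per_s:.1f} GB/s")  # ~102 / ~154 / ~205 GB/s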
 
Why are you using Firefox in 2023?
And what on earth are you using 80GB RAM for?
I'm surprised Brave isn't more popular than it is. It blocks ads and trackers and supports Chrome extensions. No way I'll use bloated Chrome after Brave. And Safari still feels clunky.
 
I certainly hope Apple won't lower themselves to engaging in meaningless benchmarks and spec chasing. As to AMX, I think increasing the size of the AMX coprocessor per cluster is a better investment of silicon simply because it's easier to exploit from the software side.

Meaningless benchmark chasing? No. Some benchmark chasing ... well, they already do that. But adding cores to pump up the core count isn't necessarily meaningless benchmark chasing. Apple charges money for cores, so they can't fall into the "meaningless" class (if it's a value-add feature, then it must have some meaning to command more money).

Lots of buyers aren't super tech-savvy. If they see $1K-2K Windows laptops with 16 cores and a $1K-2K Mac laptop with 12 cores, then there's a pretty good chance someone (including themselves) might drift to the more-cores option. You can say it's apples to oranges on cores, but not without going into technical stuff lots of folks don't want to hear. Apple is charging more for more cores, so doesn't Apple think "more is better"? Probably not the sole contributing reason, but it is a factor.

But in the E-core cluster, I suspect that the AMX coprocessor is also a single point of failure. If the AMX co-processor has a defect, then the whole E-cluster is out of compliance. I would not be shocked at all if, given two full-size E-core clusters, there is also an entry option with just 6P+4E on the M3 Pro (i.e., 10 cores is still the entry point).

It is a better investment, but it is also a bit of a double-edged sword (squeezing out lots more space efficiency, but with more coupling as well). [ E-clusters are small, so they have a lower profile for defects to 'hit'. But it is Apple... squeezing extra money out of customers' pockets... they spend substantive effort at that. Apple is going to want a return on investment... not just investment. ]
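
The 'lower profile for defects to hit' point is easy to put rough numbers on with a toy Poisson yield model. Everything below (the defect density and the block areas) is an illustrative assumption rather than foundry or die-shot data; the only takeaway is that the chance of a fatal defect scales with a block's area:

Code:
import math

# Toy Poisson yield model: P(block is defect-free) = exp(-D0 * area).
D0_PER_CM2 = 0.1  # assumed defect density, defects per cm^2 (illustrative only)

def defect_free(area_mm2):
    return math.exp(-D0_PER_CM2 * area_mm2 / 100.0)  # mm^2 -> cm^2

for label, area_mm2 in (("single E-core", 0.8),
                        ("E-core cluster incl. L2/AMX", 6.0),
                        ("P-core cluster incl. L2/AMX", 25.0)):
    print(f"{label} (~{area_mm2} mm^2): {defect_free(area_mm2):.2%} chance of no defect")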








I think this is a mischaracterisation on Ars Technica's side. Intel's E-cores and Apple's E-cores have completely different performance and power consumption characteristics and serve different purposes. Intel introduced small cores as an area-efficient way to boost multi-core throughput (at least for code with trivial thread dependencies). Apple's E-cores are there to conserve system energy and resources when executing low-priority or background tasks. They are obviously also used to improve throughput, but the contribution is rather minimal. Adding another E-core cluster to the M2 would barely give you a 10% improvement in trivial multicore benchmarks — is it really worth pursuing?

But on AMX-anchored benchmarks, is it really just 10%?



Yeah, it's a huge problem. Looking at the die shots of the M2 Pro and Max, one can really see how Apple is struggling for space. I suppose this is the drawback of the SoC approach — their die has reached a size where improving performance comes at a very dear economic cost.

And hence why Ars Technica isn't all that far off. A major driver for Intel's switch to adding E-cores is that they were trailing on fab process density. They needed more cores and didn't have the die space budget to add them using 'big footprint' P-cores.

Apple, in the M1 Pro/Max generation, tossed a half E-core cluster in there.
M2 Pro/Max: a full E-core cluster, but the die is more bloated now.

M3: looks like a pretty good chance there is no additional P-core cluster, just an added E-core cluster ... probably because they are trying to wrangle back the die bloat. The fab process Apple is using is not 'behind the curve', but the tens and tens of billions of transistors and gobs of die space thrown at non-CPU-core stuff stifle the budget for CPU cores. That 'way better Perf/Watt than those guys' is not completely 'free' (as in 'free beer'). There are tradeoffs.

The fab processes are different but the "running into the wall on die space budget" problem is largely the same. Intel and Apple are getting to the same place along somewhat parallel roads.


[ And Intel 4 and Intel 3 aren't really going to solve the 'E-cores to save space' issue. Intel is still playing catch-up on iGPU performance. So much so that they are on TSMC N6 (and N5 (or N4?)) and wish they could be on N3 sooner. ]

AMD looks to be entering the same boat on monolithic dies. The Zen 4c/5c cores are probably a bit less 'downsized' than the Intel E-cores. Descriptions suggest they mainly toss L3 cache to limbo to get down to a smaller size, and perhaps use a 'slower' AVX-512 to save space (and since SRAM stops shrinking on N3, it makes sense to start practicing earlier on N5). Go from an 8-core to a 16-core chiplet, but the L3/core ratio goes down to minimize the chiplet footprint bloat. I'm a bit dubious, though, of how effective those will be in laptop dies. They should work OK for cloud server workloads (and stay closer to Ampere One); energy-wise, dumping the cache is likely a very incremental power saving.
(But Intel has that laptop-SoC-via-chiplets overhead, so it probably doesn't hurt competitiveness much. Unless Qualcomm does theirs extremely well. They too have a 'limited die space' problem if they try to slap the cellular modem onto the same die. Their whole 'one big die' thing seems dubious for the Windows PC space.)




It also doesn't help that a significant portion of an Apple Silicon SoC is occupied by SRAM, which — from what I understand — doesn't scale that well with the latest improvements in lithography. Maybe moving the cache to a separate stacked die would be a solution?

But the Holy Grail of Perf/Watt would be lower if most of that SRAM moved off the main die. I think Apple will cling to the Perf/Watt priority even if that costs them more P cores. And if they can't make more E-core clusters (because of the additional SRAM/cache overhead for the addition), they can make 'more powerful', bigger-logic E cores.

I don't think Apple is going to keep chasing CPU core count on mainstream laptops/desktops much past 16. There is another whole performance war Apple is fighting with AMD/Intel/Nvidia that will also put pressure on GPU SRAM (and core count) demands. The AMD and Intel SoCs don't completely chuck a reasonable dGPU PCIe interface, so they have a 'fall back' to more board space, thermals, and dies for better performance (at far less battery life; it is a 'win', but it isn't completely 'free').
 
Surely sticking to 8P cores and just adding 2E cores each generation isn't sustainable. Especially since many pro apps, like Logic Pro, don't even use the E cores.

I don't think it is two cores per generation.

M1 Pro/Max: a gimped 2-core E-cluster (half of the normal building-block E-core cluster).

M2 Pro/Max: undo the 'gimp' ... a normal 4-core E-core cluster.

M3: pretty likely it is just two E-core clusters, so max out at 8 E cores (basically bringing them into parity with the two P-core clusters; 2*4).
[ And it is one of those hyper-symmetry things that often seems to attract Apple's 'eye'.
The same symmetry that is in the 'plain' M1/M2. ]

M4 ... most likely ... better cores rather than more cores. Far more likely we are going to see NPU, AMX, and GPU boosts in core counts, or more specialized fixed-function blocks like image/video accelerators, than more extremely general-purpose cores. Some subset of workloads that were "high CPU core count" oriented 4-10 years ago will be pulled back into less generally useful logic that does that workload faster and at far lower power consumption levels. That is where the transistor budget will be skewed.

The major footprint of both the P-core cluster and the E-core cluster is the L2 cache. Toss in the AMX allocation for each and it is likely more than half the footprint. The TSMC N3 family through the early N2 family probably isn't going to shrink L2 cache size at all, so you can't increase core count without making the die bigger. However, N3-N2 wafers are going to cost substantially more... so bigger dies mean higher costs.
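
To put rough numbers on 'bigger dies mean higher costs', here is the standard dies-per-wafer approximation. The wafer price is a placeholder assumption and yield is ignored entirely, so treat the dollar figures as purely illustrative:

Code:
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    # Common approximation: gross wafer area over die area, minus an edge-loss term.
    r = wafer_diameter_mm / 2
    return int(math.pi * r * r / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

WAFER_PRICE = 20_000  # assumed N3-class wafer price in USD, purely illustrative

for label, area in (("~M2-sized die", 155), ("~M2 Pro-sized die", 290), ("~M2 Max-sized die", 510)):
    n = dies_per_wafer(area)
    print(f"{label} ({area} mm^2): ~{n} dies/wafer, ~${WAFER_PRICE / n:,.0f} per perfect-yield die")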

Throw on top that Apple is fighting a major two-front 'war'. There are CPU core competitors in the PC space (Intel, AMD). There are up-and-comers Qualcomm/Nuvia. And there are new GPU core competitors in the PC space (AMD/Intel/Nvidia). [ Intel stumbled hard out of the gate, but if Adamantine works well


they may not lag far behind in the iGPU zone for long. Probably not jump to winner, but just not way, way back.

Both AMD and Intel are trying to kill off dGPUs in most laptops. Not quite as radical a stance as Apple's killing off of dGPUs in all laptops, but pretty close. ]


If Apple needs more than 8P/8E, then they should be looking for a far, far, far better chiplet design strategy rather than trying to pack them into a monolithic die.


The backside power delivery and other gyrations that are going to be applied in 2025-2026 to start incremental progress on SRAM/cache shrinkage (density) are not likely to magically catch SRAM back up to logic's density. It will just be less stuck and immobilized.


If the M3 Pro can get to 10P cores, at least it gives some of us M1 Pro users a reason to upgrade.

Doubtful Apple added another P-core cluster (either half-sized or full-sized). 12 = 8 + 4.
If Apple adds a bigger AMX, a bigger NPU, and bigger GPU cores, lots of M1 Pro users will still have a reason to upgrade.

The way Apple organizes their cores in a 4-core cluster around a shared L2 cache is a double-edged sword. That means their clusters are coupled to the L2 cache's construction constraints just about as much as to the individual cores. It is a shared resource: less copying (between cores), but also more coupling in dependencies.

Similar with the 'abnormally large' L3 that all the cores share: less copying, but if the L3 runs into construction constraints, they all run into the same constraints.


Otherwise it's likely to be another year-on-year decrease in sales, isn't it?

Extremely likely not. The year-on-year drop is because the pandemic "buy everyone a laptop/PC with emergency money" bubble popped. If you look at sales from 2019 (or the first half of 2020) to 2023, there is no drop (just skip the bubble and look). The whole PC industry is in the midst of a bubble popping. This has exceedingly little to do with M1 vs M2 specifically.

If most of these 'boom' cycle systems get retired on a fixed schedule, we will likely see a mild boom/bust cycle down the road. Again, this has not much to do with the silicon and far more to do with customers' standard refresh behavior.
 
Meaningless benchmark chasing? No. Some benchmark chasing ... well, they already do that. But adding cores to pump up the core count isn't necessarily meaningless benchmark chasing. Apple charges money for cores, so they can't fall into the "meaningless" class (if it's a value-add feature, then it must have some meaning to command more money).

What I mean is that Intel's recent proliferation of cores yields some impressive Cinebench scores, but the improvements in real-world workloads have been more sobering and overshadowed by the increased power consumption. Still, for some workloads more parallel throughput could be a good thing, but that needs cores that are actually good at throughput (see below for some more thoughts).

Lots of buyers aren't super tech-savvy. If they see $1K-2K Windows laptops with 16 cores and a $1K-2K Mac laptop with 12 cores, then there's a pretty good chance someone (including themselves) might drift to the more-cores option.

That's a very good point. Customer psychology is a force to be reckoned with. Apple so far has been able to avoid spec chasing to some degree, but I hear what you are saying.

But on AMX-anchored benchmarks, is it really just 10%?

Would it be worth it to go for more clusters just for AMX? That's a fairly niche thing. There are not that many customers who use BLAS. I don't even think it's used by the big ML frameworks (which use MPS instead). Sure, having more AMX throughput will accelerate some scientific and maybe ML workloads, but that alone won't get Apple significant PR. The M1 can already do matmul as well as a much larger Intel desktop CPU, but I haven't seen that used as a decisive reason to buy it.

Which brings us to the next point.

And if they can't make more E-core clusters (because of the additional SRAM/cache overhead for the addition), they can make 'more powerful', bigger-logic E cores.

That would indeed be a way to use E-cores as area-efficient auxiliary compute, and in fact, Apple has been steadily improving E-core performance for several generations now. And yet, as of the A16, the E-core is still only about 1/3 the performance of the P-core. You'd have to stack quite a lot of those to get a meaningful improvement in multicore throughput. If Apple can get to 50-60% of P-core performance, yes, it would make a difference, but is it even possible without sacrificing the incredible power efficiency which makes E-cores so attractive on mobile?
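
To see why that E-to-P ratio matters so much, a few lines of napkin math. The ratios and the 25% target are arbitrary assumptions for illustration:

Code:
import math

P_CORES = 8
TARGET_UPLIFT = 0.25  # want +25% aggregate multicore throughput over the P-cores alone

for e_to_p in (1/3, 0.5, 0.6):
    needed = math.ceil(P_CORES * TARGET_UPLIFT / e_to_p)
    print(f"E-core at {e_to_p:.2f}x of a P-core: ~{needed} E-cores for +{TARGET_UPLIFT:.0%}")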
 
It was a fun idea only using the light "spill" inside the case coming from the sides/backside of the monitor.
 


Apple is testing an unreleased chip with a 12-core CPU, 18-core GPU, and 36GB of memory, according to an App Store developer log obtained by Bloomberg's Mark Gurman. He said the chip is being tested inside a future high-end MacBook Pro running the upcoming macOS 14 update, which is expected to be announced at WWDC next month.


In his Power On newsletter today, Gurman said this chip could be the base-level M3 Pro for the next-generation 14-inch and 16-inch MacBook Pro models launching next year. The chip is expected to be manufactured based on TSMC's 3nm process for significant performance and power efficiency improvements.

The current base-level M2 Pro chip in the 14-inch MacBook Pro has a 10-core CPU and 16-core GPU, and starts with 16GB of memory, so the M3 Pro chip would have at least two extra cores for both the CPU and GPU. Apple last updated the 14-inch and 16-inch MacBook Pro in January, so the laptops are unlikely to be updated again until at least 2024.

Apple still has to release the standard M3 chip before moving on to the M3 Pro and M3 Max chips. Gurman said Apple is working on new iMac, MacBook Air, and low-end MacBook Pro models with the M3 chip, and he continues to believe the first Macs with the M3 chip will be released towards the end of this year or early next year.

In the meantime, Gurman said the long-rumored 15-inch MacBook Air will be released this summer with the M2 chip. He previously said the laptop would be announced at WWDC, which begins with Apple's keynote on June 5.

Article Link: Gurman: Apple Testing 'M3 Pro' Chip for MacBook Pro With 12-Core CPU and 18-Core GPU


Ugh. 8 performance cores with 8 efficiency cores. ...the original M1 Pro had 8 performance cores and 2 efficiency cores.

Feels like Apple is stripping performance potential with these M2 and M3 chips by adding more efficiency cores vs. performance cores.
 
Ugh. 8 performance cores with 8 efficiency cores. ...the original M1 Pro had 8 performance cores and 2 efficiency cores.

Feels like Apple is stripping performance potential with these M2 and M3 chips by adding more efficiency cores vs. performance cores.
It depends on the transistor budget.

If you have 60-70 bln transistors to play with for the M3 Pro, adding more efficiency cores to the chip is going to be more beneficial in terms of the Power/Performance/Area equation than adding more performance cores.

If you have a budget of 40 bln, that difference will only be larger.
 
Ugh. 8 performance cores with 8 efficiency cores. ...the original M1 Pro had 8 performance cores and 2 efficiency cores.

Feels like Apple is stripping performance potential with these M2 and M3 chips by adding more efficiency cores vs. performance cores.

What @koyoot wrote. An 8+8 CPU that you can actually manufacture in sufficient quantities is better than a 12+4 chip that costs much more to make and trashes the production line. So far, 4x E-cores take approximately the same area as 1 P-core, and each is roughly 1/3 as fast (and they will likely get faster). From the pure compute perspective, 8+8 will deliver 10-20% more throughput compared to 10+2, with the same area. Of course, this is just napkin math.
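
For anyone who wants to poke at that napkin math, here is the same comparison with the ratios as explicit parameters. The 1/4-area figure is the rough one from above; the loop varies the E-to-P performance ratio because the outcome shifts a lot with it:

Code:
E_AREA = 0.25  # assumed: one E-core occupies ~1/4 of a P-core's area

def perf_and_area(p, e, e_perf):
    """Aggregate throughput and core area, both in 'P-core units'."""
    return p + e * e_perf, p + e * E_AREA

for e_perf in (1/3, 0.4, 0.5):
    t88, a88 = perf_and_area(8, 8, e_perf)
    t102, a102 = perf_and_area(10, 2, e_perf)
    print(f"E-core at {e_perf:.2f}x P: 8P+8E -> {t88:.2f} perf / {a88:.1f} area, "
          f"10P+2E -> {t102:.2f} perf / {a102:.1f} area")

With the 1/3 figure the two layouts land in roughly the same place on raw throughput; the 8+8 layout pulls ahead as the E-cores get faster, or once you normalize by area.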
 