Could it be that the compiler is doing better for Intel systems in this release than in past ones (I noticed no one is comparing AMD scores, or noting a change in performance with them)? Are the GPU scores still scaling the same? They claim that the OS scheduler matters, but maybe the older chips are doing better because of how crappy the Windows scheduler is for big.LITTLE systems?

WRT the 14900K vs 14700K I'd be curious to know if they ran the 2024 test then the 2026 test on the same system, back to back keeping everything equal.
Between AMD systems, older to newer chips, the difference isn't as drastic, 5% at most in favor of older AMD CPUs, with the same core count.
 
That's just standard boilerplate for a new version of a benchmark. What we're doing, comparing score ratios, is fine. What they're asking people not to do is the thing that should be obvious not to do (but you know people ...), which is directly compare the 14700K score in CB26 with its score in CB24 - like wondering why it scored 10,000 in one and 3,000 in the other. Obviously, with a different baseline and possibly engine changes, the scores are indeed not directly comparable. Comparing the rank or ratio of the 14700K's score to another chip's, like the 14900K, in one benchmark against the same ratio in the other benchmark, regardless of the relationship between the two benchmarks, is, again, totally fine.
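To make the ratio idea concrete, here's a minimal Python sketch. The chip names and scores are placeholders I made up purely to show the arithmetic (they are not real CB24 or CB26 results), and the function names are my own.

```python
# Placeholder scores, NOT real results: they only illustrate the arithmetic.
# The two benchmarks use different baselines, so raw scores differ a lot.
scores = {
    "CB24": {"chip_a": 2000.0, "chip_b": 2500.0},
    "CB26": {"chip_a": 6000.0, "chip_b": 7500.0},
}


def ratio(table, bench, chip, reference):
    """Score of `chip` as a fraction of `reference` within one benchmark."""
    return table[bench][chip] / table[bench][reference]


def ratio_shift(table, chip, reference, old="CB24", new="CB26"):
    """Relative change of the chip/reference ratio going from old to new.

    0.0 means the two chips kept the same relative standing; a positive
    value means `chip` gained on `reference` in the newer benchmark.
    """
    return ratio(table, new, chip, reference) / ratio(table, old, chip, reference) - 1.0


# Not meaningful: raw scores across benchmarks (different baselines).
raw_jump = scores["CB26"]["chip_a"] / scores["CB24"]["chip_a"]

# Meaningful: how chip_a stands relative to chip_b in each benchmark.
shift = ratio_shift(scores, "chip_a", "chip_b")

print(f"raw CB26/CB24 score for chip_a: {raw_jump:.1f}x (tells us nothing)")
print(f"change in chip_a/chip_b ratio:  {shift:+.1%}")
```

With these placeholder numbers the raw score triples while the ratio between the two chips doesn't move at all, which is exactly why only the ratio is worth comparing across versions.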



I did above - at first glance AMD looks similar to Intel (the 9950X3D got basically the same score as the 285K in both benchmarks, so both newer x86 chips gained relative to M4/M5 chips and stayed static relative to M3s). But I haven't delved deeply into AMD's scores to confirm whether older AMD chips catch up to newer ones. It's possible it's compiler differences, it's possible that the way the scene is being rendered has changed (more rays), etc ... The difference between new x86 chips and new Apple Silicon chips isn't huge (<10%), so it's nothing to be terribly bothered by if that were the only difference.



EDIT: while newer AMD chips appear to maintain their relative scores to newer Intel chips between the two benchmarks, at first glance older AMD chips do not appear to catch up to newer ones in CB26 relative to CB24:




As far as I can tell, no, these are all just user generated benchmarks (well, I'm assuming so for CPU Monkey, but I don't actually know where they get their data) with all the variance that entails. Given how extreme the 14700K differences are, something appears to be off. I would be cautious about assigning fault to the benchmark itself without more data.


The 14700K results are the only ones I've seen that are that egregiously different and, as @diamond.g said, we'd need to make sure the original results and the new ones are being run on the exact same systems - that people weren't quoting results for an overclocked model vs a stock/underclocked model, etc ... That's the problem with user generated benchmarks, unless, like 3DMark's website, they let you filter by the full range of specs (and even then it can be dicey, simply because sometimes people benchmark because their system isn't working right). I mean, on the flip side, the 14900K in CPU Monkey gains like 10% on the 285K, but looking at other user generated data from computerbase, the possible range of 14900K results is ginormous.

That's why we would need a reviewer with the exact same system running both sets to make sure the 14700K results we're seeing aren't aberrant.

Overall though, I'd be less bothered by changes of ~10% individually, as heck, even running the same benchmark on the exact same system can yield a few percent difference. So a change in the way the benchmark runs resulting in about 10% differences in relative scores would not in and of itself be that unusual. It's the fact that there are at least 2 such changes that push older x86 CPUs to be more like 20% more performant relative to newer Apple Silicon chips that makes whatever changed a little more ... interesting. But that's just my threshold for curiosity, others may have a different one! However, again, wrt the x86 chips, only your test on the older Intel chip is, as far as I know, being run on the exact same system. So the variation there has to be taken with a big grain of salt. We need more people to upload that kind of data where we know the results from the two benchmarks came from the same system.
For the 14700K I was taking several different scores into account, where the difference ranges from about 25% to as high as 35%, which is still way too much for a CPU that has just 4 fewer E cores than the 14900K. By that logic, the 2024 scores make a lot more sense.

Likewise, the M2 Pro outperforming the M4 and M3 Pro doesn't make sense either.

I'm just crudely saying they fd up the M4 and M5, what they call "optimization", in that case it's rather a "deterioration".
 
You're looking at cpu-monkey scores, right? You probably shouldn't get so hung up on weirdnesses in benchmark scores submitted by random members of the general public. It doesn't take much to screw up running benchmarks. The usual culprit is the user failing to quit everything that might be taking CPU time (or other system resources) away from the benchmark program.

Perhaps more importantly, I think you're operating on some incorrect assumptions about what to expect in some of these comparisons. Consider the CPU core counts in cpu-monkey's CB2026 results, paying close attention to the performance and efficiency core counts:

M3 Pro: 11C (5P + 6E), 3678 points
M2 Pro: 12C (8P + 4E), 4120 points
M4: 10C (4P + 6E), 3908 points
M5: 10C (4P + 6E), 4367 points

For Apple, P cores are much faster than E cores. M2 Pro has by far the highest P core count in this comparison, so it isn't shocking to see it do well.

If anything, these results suggest the opposite of Maxon "deteriorizing" M4 and M5 - they're doing very well given how few P cores they have. The M3 Pro's low score is a reflection of the M3 Pro being kind of an odd duck; so far M3 is the only generation where the "Pro" chip was slanted towards E core count.
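As a rough illustration of the core-count point, here's a small Python sketch using the four cpu-monkey CB2026 results listed above. "Points per P core" is a crude metric of my own choosing, not anything Maxon or cpu-monkey reports: it ignores the E cores' contribution entirely, so it flatters chips with a high E-to-P ratio and shouldn't be read as a real per-core measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    p_cores: int
    e_cores: int
    score: int  # cpu-monkey CB2026 multi-core points, as quoted in the post


chips = [
    Chip("M3 Pro", 5, 6, 3678),
    Chip("M2 Pro", 8, 4, 4120),
    Chip("M4", 4, 6, 3908),
    Chip("M5", 4, 6, 4367),
]


def points_per_p_core(chip: Chip) -> float:
    """Crude, assumed metric: total score divided by P core count only."""
    return chip.score / chip.p_cores


# Rank from most to fewest points per P core.
ranked = sorted(chips, key=points_per_p_core, reverse=True)

for chip in ranked:
    print(
        f"{chip.name:7s} {chip.p_cores}P+{chip.e_cores}E  "
        f"{chip.score} pts  {points_per_p_core(chip):7.1f} pts per P core"
    )
```

Even with that caveat, the ordering puts M5 and M4 on top and M2 Pro at the bottom, which fits the reading that M2 Pro's total comes from having many P cores rather than from M4 and M5 being held back.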
 
It's less about how they do relative to each other in CB26 in a vacuum. The issue is that the older CPUs gain on the newer CPUs in the newer benchmark relative to CB24, e.g.:


Same number of E-cores and P-cores, but (for MT) the M2 Pro is only 61% of the M4 Pro's score in CB24 but 74% of the M4 Pro's score in CB26. That's a relative increase of roughly 20%. In ST, there's little relative change. That's a little odd, even for user submitted benchmarks. And it's fairly consistent across Apple Silicon results: the older the chip, the better it does relative to the newer chip. Here's the M3 Max versus M4 Max:
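For anyone checking the arithmetic on the M2 Pro figures, here's a minimal Python sketch. The only inputs are the two rounded percentages quoted above (61% and 74%); the function name is my own, and because the inputs are rounded the result is approximate.

```python
def relative_gain(old_ratio: float, new_ratio: float) -> float:
    """How much the older chip's standing improved, as a fraction.

    Both arguments are the older chip's score divided by the newer chip's
    score, measured within a single benchmark version.
    """
    return new_ratio / old_ratio - 1.0


# M2 Pro as a fraction of M4 Pro (multi-threaded), as quoted in the post.
cb24_ratio = 0.61
cb26_ratio = 0.74

gain = relative_gain(cb24_ratio, cb26_ratio)
point_difference = (cb26_ratio - cb24_ratio) * 100

print(f"relative gain:    {gain:.1%}")
print(f"point difference: {point_difference:.0f} percentage points")
```

Note the two different ways of stating the shift: the ratio moves by 13 percentage points, but relative to where it started that is a gain of about 21%, which is the "roughly 20%" figure.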


Here the M3 gains about 10% on the M4 while the M2 gained on the M4 by 20% in CB26 versus CB24.
 
And it's fairly consistent across Apple Silicon results, the older the chip, the better it does relative to the newer chip.
It's not, though - here's a counter data point (M2 4P+4E vs M4 4P+6E):


In CB2026, the M2 score is 59% of M4, in CB2024 it's 58%. Basically the same ratio.
 
I did say fairly consistent. :) That said, that's the first data point I've seen that bucks the trend for Apple Silicon where they match. So I went and looked at the base M3 (which wasn't available when I first started checking) and similarly it also does not gain on the base M4.

That said, so far, in every other data point the pre-M4 CPU gains on the newer one, and I've not seen it go the other way around, where the M4 or M5 extends its lead, which would suggest that the effect is random noise. The CPU Monkey data set is also incomplete and flawed (the M3 Ultra CB24 results are clearly flawed and contradicted by other sources). So I absolutely concede that the pattern may fade. But some of it is backed up by computerbase's data as well and, again, the data we have so far is unidirectional. If there is a difference between CPUs between the two benchmarks, the older generation gains on the newer CPU in the newer test, and the newer CPU never pulls significantly further away. If that pattern continues to hold, then the noise would be in how much the older one gains, from nothing to quite substantial, rather than there being no shift at all.

It's even possible it's the base M4 that's the outlier as it gains about 7% on the base M5 in the new test:

 