
Kronsteen

macrumors member
Original poster
Nov 18, 2019
** only of interest to those who like geeking out with performance measurements and tuning, as I do 😬 **

It seems to me that, for processors that have more than one variant with the same name, GB numbers can be misleading.

For example, the ‘Metal Benchmarks’ table, which lists processors, currently shows a figure of 148,806 for the M3 Max. However, if you search for Mac15,9 (which is the 16/40 core M3 Max), all but one of the 56 Metal scores are over 150,000, with a couple over 160,000. The one exception is 148,520.

So the average is well over 150,000, which makes me question the validity of the 148,806 figure in the Metal table. Similarly, the Metal table shows 208,503 for the M2 Ultra, but if you look at the Metal section of the ‘Mac Benchmarks’ table, the top two entries, for the 24/76 core versions of the Studio and Pro, are both around 220,000.

My suspicion is that the figures in the Metal table are blending results for both versions of the processors. Does that seem plausible? If correct, it may be that the M3 figures in the Metal table will drift down, as more instances of the binned processors’ results are submitted.

Andrew
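
To put rough numbers on the blending suspicion, here is a minimal sketch. It assumes the table figure is a plain mean over all submitted results, which GB does not confirm. Only the count of 56 comes from the search above; the two per-variant means are hypothetical placeholders, not measured values.

```python
def blended_mean(mean_full, n_full, mean_binned, n_binned):
    """Mean of two groups of results pooled together (weighted by count)."""
    total = n_full + n_binned
    return (mean_full * n_full + mean_binned * n_binned) / total


# Hypothetical per-variant means, for illustration only.
FULL_MEAN = 155_000    # assumed mean for the 16/40 core variant
BINNED_MEAN = 125_000  # assumed mean for the binned variant
N_FULL = 56            # number of Mac15,9 results mentioned above

# The pooled figure falls as more binned results are added to the pool.
for n_binned in (0, 10, 30, 60):
    pooled = blended_mean(FULL_MEAN, N_FULL, BINNED_MEAN, n_binned)
    print(f"{n_binned:>3} binned results -> pooled mean {pooled:,.0f}")
```

If the table really works like this, its figure depends on the mix of submissions as much as on the hardware.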
 
Apple only recently started using separate model identifiers for binned and unbinned chips, I think since the M2 Pro MBPs. So you are correct: for the earlier models, the results mix the cut-down and full core counts.

What also adds to the confusion is the amount of unified memory, which is now also the ceiling for VRAM. A Metal GB score shouldn't change that much if you max out the RAM, but there is a slight difference. The different RAM configs are mixed together as well.

I myself think it is fine. GB is not supposed to be a very accurate benchmark; we just need a common ground to start with, and one that many people use so the data pool is sufficiently large.
 
The aggregate scores provided by the GB dashboard are useless anyway IMO and should never be consulted. You already mentioned one big issue, which is aggregating across different models. The second big issue is that it aggregates all results together regardless of power mode, system contention, and, for x86 machines, whether the computer is plugged in or overclocked.

What they should have done instead is display the score distribution. Different groups would be immediately visible there.
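
A minimal sketch of what a distribution view would show that a single average hides. The scores below are made up for illustration (two hypothetical clusters), not pulled from the GB database, and the 10,000-point bin width is an arbitrary choice.

```python
from collections import Counter
from statistics import mean


def histogram(scores, bin_width=10_000):
    """Count scores per bin; each key is the lower edge of its bin."""
    counts = Counter((s // bin_width) * bin_width for s in scores)
    return dict(sorted(counts.items()))


# Hypothetical Metal scores: one cluster per chip variant.
scores = [124_000, 126_500, 127_200, 131_000,
          152_300, 155_800, 157_100, 161_400]

hist = histogram(scores)
for low, count in hist.items():
    print(f"{low:>7,} {'#' * count}")

# The single aggregate lands in a bin that contains no results at all.
print(f"mean: {mean(scores):,.1f}")
```

With these made-up numbers the mean falls between the two clusters, so the headline figure describes neither variant.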
 
No, you’re missing the point. Common ground means nothing unless it’s defined. GB has many hidden variables, and beyond that they naively aggregate data which hides even more.
 
I got the point. I concur that the GB6 database is misleading if the reader does not take into account what was noted in this thread. To that I add my opinion that GB6 has other weaknesses of its own, because it is just a general-purpose collection of tests. Its usefulness wasn't that great to begin with, but I am glad we have at least one test that even a tech tuber with virtually no tech understanding uses, and by extension the viewers. It is only because it is so imprecise that the public is willing to use it.

I can make an even bigger case with the BlackMagic disk speed test. That thing is so useless, yet its two pretty-looking dashboard meters are so easily understood, even by morons on the internet, that it earned its spot.
 