Shadow of the Tomb Raider is a 2018 game and at least it is very well optimized for Metal. I suppose this is the reason it is heavily used for benchmarking on macOS (even if it runs in Rosetta 2).
I had a huge piece but deleted it, because we need to compare Apples to apples, and that subject is substantially more complex than even the OP hinted at.
Just be aware that a 2022 AAA video game supporting DLSS and ray tracing on a desktop RTX 3080 with a 4K monitor will know no equal from Apple (nor AMD). But if you're playing 2015's Rise of the Tomb Raider at 1080p versus a WinPC laptop with DLSS turned off (and no ray tracing available), you will indeed be content with Apple's gaming specs...
The Verge is completely inept when it comes to computers. If you want to have a good laugh about how terrible they are when it comes to computer hardware advice, look up “The Verge how to build a PC” on YouTube. They’ve taken most of the reuploads down but there’s still some there.
Great analysis and great job on letting the reviewers know.
The review on The Verge was particularly dismissive of the M1 Ultra GPU, for no good reason other than to bash on Apple in my opinion.
Of course Apple's charts are not to be taken as indisputable truth, but the gap between what was reported by The Verge and Apple claims is too large.
So large, in fact, that if the reviewer knew anything about computers at all, he would have run more tests and more benchmarks to confirm his findings.
Instead they just ran Geekbench compute (which by now we know is flawed) and Tomb Raider (which is not even native) to demonstrate their point.
I'm not gonna visit The Verge anymore. Their reviews are a rushed job at best and utter garbage at worst.
 
tl;dr - They are not fair comparisons.
I'm not going to go very deep, just deep enough to help you guys understand things.
1. Cinebench R23
CR23's render engine uses Intel Embree, Intel's library for accelerating ray tracing on the CPU. It supports several SIMD instruction sets for the x86 architecture, among them SSE and AVX2. AVX2 is the newer and wider of the two (256-bit vectors versus 128-bit for SSE), and CR23 is AVX heavy, so you know where this is going. Now, ARM's SIMD instruction set is NEON, but Intel Embree doesn't ship a native NEON implementation. So, for CR23 to even run on Apple silicon, Intel Embree had to be ported to ARM64, which became possible thanks to Syoyo Fujita. The SSE or AVX2 intrinsics have to be translated to NEON intrinsics for every application, which is a huge pain in the ass. There's a library for that (it's a header, actually), but it's only SSE2NEON and not AVX2NEON. Going by the GitHub comments on Apple's pull request to Intel Embree, Apple is working on bringing AVX2NEON support for Apple silicon. Even after that, I'm not sure CR23 will be a fair comparison. Intel might introduce a superior SIMD instruction set, and then Apple has to do yet another pull request on Intel Embree for NEON translation? Man, that's PAIN.
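To make the intrinsics translation a bit more concrete, here is a minimal sketch in C of the general idea behind a header like SSE2NEON: the same 4-wide float add is written once and mapped either to the x86 SSE intrinsic or to its NEON counterpart at compile time. The wrapper names (vec4f, vec4f_load, vec4f_add, vec4f_store, add_arrays) are made up for illustration and are not taken from SSE2NEON or from Embree. Only the underlying intrinsics (_mm_add_ps, vaddq_f32 and friends) are real.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
/* ARM path: NEON 128-bit registers holding four floats. */
#include <arm_neon.h>
typedef float32x4_t vec4f;
static inline vec4f vec4f_load(const float *p)     { return vld1q_f32(p); }
static inline vec4f vec4f_add(vec4f a, vec4f b)    { return vaddq_f32(a, b); }
static inline void  vec4f_store(float *p, vec4f v) { vst1q_f32(p, v); }
#elif defined(__SSE__) || defined(_M_X64)
/* x86 path: SSE 128-bit registers holding four floats. */
#include <xmmintrin.h>
typedef __m128 vec4f;
static inline vec4f vec4f_load(const float *p)     { return _mm_loadu_ps(p); }
static inline vec4f vec4f_add(vec4f a, vec4f b)    { return _mm_add_ps(a, b); }
static inline void  vec4f_store(float *p, vec4f v) { _mm_storeu_ps(p, v); }
#else
/* Scalar fallback so the sketch still builds on other targets. */
typedef struct { float lane[4]; } vec4f;
static inline vec4f vec4f_load(const float *p) {
    vec4f v;
    for (int i = 0; i < 4; ++i) v.lane[i] = p[i];
    return v;
}
static inline vec4f vec4f_add(vec4f a, vec4f b) {
    vec4f v;
    for (int i = 0; i < 4; ++i) v.lane[i] = a.lane[i] + b.lane[i];
    return v;
}
static inline void vec4f_store(float *p, vec4f v) {
    for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
}
#endif

/* out[i] = a[i] + b[i], four floats at a time, with a scalar loop for the tail. */
void add_arrays(float *out, const float *a, const float *b, size_t n) {
    assert(out != NULL && a != NULL && b != NULL);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        vec4f_store(out + i, vec4f_add(vec4f_load(a + i), vec4f_load(b + i)));
    }
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

A one-to-one mapping like this is the easy case, because SSE and NEON registers are both 128 bits wide. A 256-bit AVX2 operation has no single NEON register to land in, so a translation layer would presumably have to split it across two NEON registers, which is one reason an AVX2-to-NEON layer is more work than the SSE one.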
2. Geekbench GPU Compute
First of all, I've seen a few comments here saying that you can't compare Metal vs CUDA. Not true. Geekbench is a cross-platform benchmark and it's perfectly fine to compare Metal vs CUDA. What is not a fair comparison is OpenCL, since it's deprecated in macOS. But the real issue is that, for some reason, the GPU compute benchmark doesn't ramp up GPU frequencies on Apple silicon, or even draw close to the maximum power the GPU would consume under full load. How is this a fair comparison when the GPU is not even fully utilized on Apple silicon? This was first noted in a comment on the M1 Max/M1 Pro review by Andrei Frumusanu, who is ex-Anandtech and currently works at Nuvia.
3. Questions you might have
A. If Geekbench GPU compute doesn't work as expected for Apple silicon, how can we compare GPU performance against Nvidia or AMD?
I would highly recommend GFXBench 5.0 Aztec Ruins High 1440p Offscreen and 3DMark Wild Life Extreme Unlimited. Both are native to Apple silicon and support Metal, and more importantly, they really stress the GPU and give you a clear picture of the performance since they are offscreen tests. But keep in mind that 3DMark is still an iOS app. Not sure if there would be any penalty 'cause of that vs the native Windows implementation. And no, SPECviewperf v2.0 doesn't support Metal, if you are wondering.
Below are the screencaps from Dave2D's and Ars Technica's Mac Studio reviews:
View attachment 1975660
View attachment 1975663
B. If Apple Silicon GPUs are so powerful, then why are Blender benchmarks underwhelming compared to Nvidia's?
Two reasons:
-> Blender 3.1 is just the first stable release supporting Metal in Cycles, and even the Blender team, in a video going over all the updates, said that more performance optimizations for Metal are yet to come. I would definitely expect Apple silicon GPUs to match the CUDA scores of the latest Nvidia GPUs in Blender benchmarks in the future.
-> But that's only in CUDA. Nvidia would still smoke Apple Silicon in OptiX 'cause Apple doesn't have anything close to OptiX, since there are no ray tracing cores in Apple GPUs for Metal to take advantage of. I'd love to see Apple package RT cores in their GPU designs and optimize Metal to take advantage of those cores, or even write a separate API for accelerated ray tracing like OptiX.
C. How can we compare the CPU performance of Apple Silicon against an x86 chip if CR23 is not fair?
As a consumer, I really don't know. Maybe Blender benchmarks using the CPU? If you're a professional, you already know about industry-standard benchmarks like SPEC, SPECint, SPECfp, etc. But I don't think anyone except Anandtech uses these benchmarks, and the real problem is these YouTubers, man. It's just painful to watch, and even more painful to read the comments of viewers who take these benchmark results as if they're all that matters when buying a machine.
D. Are there any games out there that would be a fair comparison to measure GPU performance?
World of Warcraft. It's one of the very few games that's native to Apple Silicon and also supports Metal.
4. Final Note
I have reached out to Verge (Becca, Monica, Nilay, and Chaim) and Ars Technica (Andrew Cunningham) to correct them on their recent Mac Studio video/article. I didn't get any reply. I even reached out to Linux and MKBHD guys (Andrew, Adam, and Vinh) with these points for their upcoming reviews. But again, no reply. I don't blame them though. Maybe they didn't see my messages yet. I reached out via Twitter DM after all. Hence I wrote this post to bring a little awareness to people who might not know about these details. Finally, it is very important to understand that Apple doesn't sell you SoCs. They sell you computers, so choose wisely w/o falling for these YouTubers or tech publications like Verge, who run these benchmarks w/o doing any research on the tools they use or on the inaccurate information that might come out of these results.
Cheers!
Past experience tells us that it is uncommon for Apple to engage in benchmark manipulation. But yeah, comparing Ultra to a 3090 is a bit… too much. It’s definitely on par with desktop 3080 though.
In what application? In OctaneX it's more like a 2080 Ti. And I'm talking about the $1000 upgrade option, not the base GPU.
It should show up towards the bottom of the benchmarks page.
Quick question,
How are you running 3DMark Wild Life Extreme (Unlimited) on the RTX 3070? My copy of 3DMark does not have an option to run the Unlimited variant. I have only seen this option available on mobile devices. FWIW, I do have the "Advanced" version of 3DMark on my PC.
Thanks!
Rich S.
It would be understandable that Apple would have had to develop its own benchmark for its M1-based computers if none of the current benchmarks accurately reflect their performance.
If so, why did Apple compare its GPU to Nvidia's?
It's probably the only thing PC people would understand?
I doubt it, PC people measure hardware performance in frames per second in their favorite game.
Apple is courting switchers (y’all remember that ad campaign?), so it makes sense in that light to compare performance against a PC.
I think the tech press has to understand an important fact: now that Apple is running their very own architecture, which is not only completely not x86, completely not CUDA and definitely not AVX, but also very different from other ARM chips, you really can't make meaningful comparisons unless you have the same benchmark properly targeted at each architecture you are comparing. And even if you have that: how meaningful is it, actually? Can I go and run Cyberpunk 2077 on my M1 Ultra? No. Can I go and run Final Cut on my AMD 5950X? No(t really). Apple, for better or for worse, has purposefully (again) cut ties with the rest of the personal computing industry.
The one meaningful comparison you can make: does this Mac run this software faster than the other, and if so, by how much? That's the only meaningful metric here. It doesn't matter if an RTX 3090 is twice or half as fast as the M1 Ultra. You won't be using a 3090 with your Mac, period, and even if your application exists on both platforms and works exactly the same, along with everything else you need... you most probably will not switch from or to Windows or Mac. For this to happen, one platform has to be substantially faster (meaning like twice as fast), and consistently so, at comparable prices.
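The "how much faster, and is it consistent" test can be sketched in a few lines of C. This is only an illustration of the arithmetic, assuming you have timed the same projects in the same software on both machines. The function names (speedup, consistently_faster) and the idea of passing the threshold as a parameter are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Speedup of the other machine over the current one for a single task,
   given wall-clock times in seconds for the same software and project. */
double speedup(double current_seconds, double other_seconds) {
    assert(current_seconds > 0.0 && other_seconds > 0.0);
    return current_seconds / other_seconds;
}

/* True only if every task reaches the threshold (e.g. 2.0 for "twice as
   fast"), so one outlier result is not enough to count as consistent. */
bool consistently_faster(const double *current_seconds,
                         const double *other_seconds,
                         size_t n_tasks, double threshold) {
    if (n_tasks == 0) {
        return false; /* no measurements, no conclusion */
    }
    for (size_t i = 0; i < n_tasks; ++i) {
        if (speedup(current_seconds[i], other_seconds[i]) < threshold) {
            return false;
        }
    }
    return true;
}
```

Feed it your own render or export times per project and a threshold of 2.0, and it tells you whether the other machine clears the "twice as fast, consistently" bar on your actual workload rather than on a synthetic score.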
Thanks for the reply. I am unable to make the picture bigger, but if I am reading it correctly, that shows "Wild Life Extreme", which I did not think was the same as "Wild Life Unlimited". The Unlimited options are for comparing mobile devices that have many different screen sizes and resolutions, and are meant to even the playing field. The 'Extreme' benchmarks are intended for testing higher resolutions and have a fixed resolution. I would think that this would not be directly comparable to an 'Unlimited' score? Thanks in advance for your help.
View attachment 1981416
Under fire strike.
It shows up as a "custom" option when you click the benchmark.
if you use Cinema4D. That’s what it was designed for — to gauge how fast a particular machine will render C4D projects.
Wow, they didn't even run Cinebench. I can't remember the last time a YouTuber didn't lazily use Cinebench as the end-all benchmark.