The real pros don't care about Geekbench; gamers do. That's what GB is for. The pros look at specialized benchmarks relevant to their tasks. Just a simple example: what good is an M4 GB score if your job requires, say, 2TB of RAM?
So if a chip can't address 2TB of RAM, it's a disposable game console CPU? I'm sorry, but that's the vast majority of the market. Yes, it would be nice if Apple had a chip that could address gobs of RAM, but implying those are the only CPUs that matter and everything else is a useless toy not worth bothering with is mind-boggling.
 
So if a chip can't address 2TB of RAM, it's a disposable game console CPU? I'm sorry, but that's the vast majority of the market. Yes, it would be nice if Apple had a chip that could address gobs of RAM, but implying those are the only CPUs that matter and everything else is a useless toy not worth bothering with is mind-boggling.
I did not imply that at all. That was just an example. There are many other computer/system characteristics that GB does not measure. As another example, take a look at the TPC benchmarks. You won't see Apple computers there at all (because they are not suitable for that type of job).
 
This is anecdotal, but I got an M1 Ultra a couple of years back to test out training some large LLMs that wouldn't fit on NVIDIA cards like the RTX 6000 Ada. The latter has 48GB of VRAM, while the M1 Ultra has ~98GB of effective VRAM (unified memory), albeit at much slower training speeds.

It did fine for a few models, but then something bizarre happened: anything with an odd-numbered batch size returned only infinite gradients and NaN losses. Even-numbered batch sizes could kind of train, but gradients exploded frequently. I couldn't replicate this on my NVIDIA hardware or even on an M3 Max.

Only a hypothesis on my end, but I worry that the die-to-die interconnect between the two chips is fragile and prone to error, particularly under demanding, high-throughput workloads like these. That could leave the Mac behaving normally otherwise, yet render it useless for AI applications. My M1 Ultra is still under AppleCare, but I don't know how to even begin describing the problem to Apple support.
You should post your code somewhere like GitHub, and invite others to replicate your findings on both the M1 and M2 Ultra. It's possible the issue is fixed on the M2.
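For instance, a minimal repro along these lines would be enough for others to run on their own hardware. The model shape, sizes, and batch sizes below are placeholders, not the original poster's setup; it just trains one step per batch size on the MPS backend and flags non-finite gradients.

```python
# Minimal sketch: compare gradient health across odd/even batch sizes on MPS.
# Model dimensions and batch sizes are arbitrary placeholders.
import torch
import torch.nn as nn

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.GELU(),
    nn.Linear(4096, 1024),
).to(device)
loss_fn = nn.MSELoss()

for batch in (7, 8, 15, 16):  # odd vs. even batch sizes
    model.zero_grad()
    x = torch.randn(batch, 1024, device=device)
    y = torch.randn(batch, 1024, device=device)
    loss = loss_fn(model(x), y)
    loss.backward()
    finite = all(torch.isfinite(p.grad).all() for p in model.parameters())
    print(f"batch={batch:2d}  loss={loss.item():.4f}  finite_grads={finite}")
```

On healthy hardware every line should report finite gradients; a systematic odd/even split would support the interconnect theory.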
 
So you are saying you don't think Mac OS already has the capability to do task assigning for multiple cores?
No, I know it has that; I'm doubting whether Apple can actually scale it up to support that many cores.
The point has been made that an Mx Ultra is not twice as fast as a Max, and that has everything to do with task-assignment scalability. Now imagine Apple releasing a processor with four times the cores of a Max that still isn't even twice as fast. I think that's where the problem is…
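One way to put rough numbers to that worry is Amdahl's law: if a fixed fraction of the work can't be spread across cores, extra cores hit diminishing returns quickly. The parallel fraction below is an assumed figure for illustration only, not a measurement of any Apple chip.

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of
# the workload that parallelizes and n is the core-count multiplier.
def speedup(p: float, n: float) -> float:
    return 1.0 / ((1.0 - p) + p / n)

p = 0.6  # assumed parallel fraction, purely illustrative
print(f"2x cores -> {speedup(p, 2):.2f}x")  # ~1.43x
print(f"4x cores -> {speedup(p, 4):.2f}x")  # ~1.82x, not even 2x
```

With that assumed fraction, quadrupling the cores of a Max indeed fails to double throughput, which is the shape of the problem described above.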
 
Apple used to make systems with two CPU sockets. But back then the RAM was separate, out on a bus. It could be that on-package RAM conflicts with multiple sockets, because a core accessing RAM attached to the other chip would get results much more slowly than a core accessing RAM on its own package.

So essentially, programmers would have to deal with two RAM pools with different latencies (a NUMA setup). Possible, but complex, and only relevant for a small minority of Mac models.
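For a concrete feel of that, here is a Linux-only sketch of the classic first-touch NUMA experiment on an existing two-socket box. The CPU ids are placeholders; pick one core per socket from `lscpu` or `numactl --hardware` on your own machine.

```python
# First-touch NUMA demo: pages land on the node of the CPU that first
# writes them, so reading them from the other socket pays remote latency.
# Linux-only (sched_setaffinity); CPU ids 0 and 16 are placeholders.
import os
import time
import numpy as np

def touch_then_read(first_touch_cpu: int, read_cpu: int) -> float:
    os.sched_setaffinity(0, {first_touch_cpu})
    a = np.ones(25_000_000)          # ~200MB; writing places the pages here
    os.sched_setaffinity(0, {read_cpu})
    t0 = time.perf_counter()
    a.sum()                          # reads stream from wherever pages live
    return time.perf_counter() - t0

local = touch_then_read(0, 0)    # reader on the same node as the pages
remote = touch_then_read(0, 16)  # reader on the other socket (placeholder id)
print(f"local: {local:.3f}s  remote: {remote:.3f}s")
```

On a single-socket machine the two timings should match; on a dual-socket one, the gap is exactly the extra latency programmers would have to manage.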
 
No, I know it has that; I'm doubting whether Apple can actually scale it up to support that many cores.
The point has been made that an Mx Ultra is not twice as fast as a Max, and that has everything to do with task-assignment scalability. Now imagine Apple releasing a processor with four times the cores of a Max that still isn't even twice as fast. I think that's where the problem is…
No, the M2 Ultra isn't twice as fast as a Max, but depending on the task, it's not far off. Comparing M2 Max vs. M2 Ultra, the overall GB6 score is only 41% higher. What a lousy chip, right?

For PDF Rendering it's 66%. Asset Compression is 70%. Photo Library is 73.4%. Ray Tracing is 75%. Clang (i.e., compiling code) is 81.8%.

Sounds like it's still a worthwhile boost for developers, photographers, and film makers. If an Extreme kept up the same rate of improvement over the Ultra that the Ultra shows over the Max, its performance vs. the Max would be PDF: 2.75x, Asset: 2.89x, Photo: 3.01x, Ray Tracing: 3.06x, and Clang: 3.31x. These are all well over twice as fast.
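(Those "same rate" figures are just the Max-to-Ultra gain applied a second time; a quick sketch to check the arithmetic, with the hypothetical Extreme assumed to repeat the Ultra's per-step gain. Any tiny differences from the numbers above come from rounding the quoted percentages.)

```python
# Extrapolate Max -> Ultra -> hypothetical Extreme, assuming the Extreme
# repeats the same per-step gain the Ultra showed over the Max.
gains = {  # Max -> Ultra improvement, from the GB6 subtests quoted above
    "PDF rendering":     0.66,
    "Asset compression": 0.70,
    "Photo library":     0.734,
    "Ray tracing":       0.75,
    "Clang":             0.818,
}
for name, g in gains.items():
    print(f"{name:18s} Ultra: {1 + g:.2f}x   Extreme: {(1 + g) ** 2:.2f}x")
```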

What brings the average down? Text Processing only got a 2% boost, but that's hardly a use case you'd get an Ultra for. Background Blur is just 1.7%; perhaps that's useful to someone without a proper lens, but if you've got Ultra money, you have a fast lens. Photo Filter is just 4.5%. Google says that's social-media-style photo filters. Again, not a task aimed at the Ultra buyer.

Seems the Ultra actually does scale quite well on actual work, and just scores lower on simpler tasks and non-professional workloads. If the Extreme is more of the same, bring it on.


Now, before you go saying "who cares about asset compression and ray tracing", those are the ONLY two tests where a 13900K or a Threadripper 3970X beats an M4 Max. If you downplay those tests, you're flat out admitting Apple is killing it, and that's before the M4 Ultra, never mind an Extreme.

 