The Ultra SoC is not fatally flawed in any way, shape, or form. It's a question of use case. Just because he doesn't see the value doesn't mean others don't. I have the maxed-out M3 Ultra and for me it's an absolute bargain. 512GB of unified memory beats anything I could build out of Nvidia GPUs alone for 5x the cost. Nvidia would be faster with the same RAM, but therein lies the problem: you have to cluster together many Nvidia systems to get the same amount of RAM. So for LLMs, advanced GPU needs, or video editing at scale, the M3 Ultra is very fairly priced. The only thing that isn't fair is the storage price. But for $10k you can get the M3 Ultra with 512GB of unified memory and 4TB of storage. Can't beat that. Can't even come close. Run LLMs locally. Nearly 2x the GPU performance. The video encoders and decoders are doubled. Forget the whole M3 vs M4 argument; it's just not relevant. If you don't need an Ultra, you know it. If you do need an Ultra, you know it.
Local LLMs are nonsense. Quality low, speed low.
 
Local LLMs are nonsense. Quality low, speed low.
Not true at all. People are completely missing the boat if they believe this statement. First, running models locally means securing sensitive data, like financial or healthcare records. Second, running agents locally, and even basic searches, can get far more than 70 tokens per second on a 70b-parameter model with thinking. The only thing that's low is the mindset of not understanding what is possible and how much money one can make right now running AI infrastructure locally. It's an absolutely amazing business to be in. Someone with $10k in their pocket can easily make $150k per month right now with just a $10k Mac Studio and a few days of learning.
 
Not true at all. People are completely missing the boat if they believe this statement. First, running models locally means securing sensitive data, like financial or healthcare records. Second, running agents locally, and even basic searches, can get far more than 70 tokens per second on a 70b-parameter model with thinking. The only thing that's low is the mindset of not understanding what is possible and how much money one can make right now running AI infrastructure locally. It's an absolutely amazing business to be in. Someone with $10k in their pocket can easily make $150k per month right now with just a $10k Mac Studio and a few days of learning.
70b active-parameter models run at about 6-8 tokens per second on Apple hardware. I am not sure what you could do with that that would yield $150k a month.

Using this logic: I can run a 70b parameter model on Fireworks at 100 t/s for about 90 cents per million tokens.
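A rough way to frame the disagreement above is break-even arithmetic: how many tokens would you need to generate locally before the hardware outlay beats the hosted per-token price? All figures below are assumptions taken loosely from this thread (the $10k machine, the ~8 t/s local speed, the $0.90/M hosted price), not measurements:

```python
# Back-of-envelope: local hardware cost vs. hosted per-token pricing.
# All numbers are illustrative assumptions from the discussion above.

HW_COST = 10_000            # assumed one-time Mac Studio cost, USD
LOCAL_TPS = 8               # assumed local 70b decode speed, tokens/s
HOSTED_PER_M = 0.90         # assumed hosted price, USD per million tokens

# Tokens you'd have to generate before buying beats renting.
breakeven_tokens = HW_COST / HOSTED_PER_M * 1_000_000

# Wall-clock time to actually produce that many tokens locally.
years = breakeven_tokens / LOCAL_TPS / (3600 * 24 * 365)

print(f"break-even: {breakeven_tokens / 1e9:.1f}B tokens, "
      f"~{years:.0f} years of nonstop decoding at {LOCAL_TPS} t/s")
```

Privacy, latency, and offline use can still justify local inference; the sketch only prices the raw tokens.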
 
The Ultra SoC is not fatally flawed in any way, shape, or form. It's a question of use case. Just because he doesn't see the value doesn't mean others don't.

I don't believe that is his position.

I have the maxed-out M3 Ultra and for me it's an absolute bargain. 512GB of unified memory beats anything I could build out of Nvidia GPUs alone for 5x the cost. Nvidia would be faster with the same RAM, but therein lies the problem: you have to cluster together many Nvidia systems to get the same amount of RAM. So for LLMs, advanced GPU needs, or video editing at scale, the M3 Ultra is very fairly priced. The only thing that isn't fair is the storage price. But for $10k you can get the M3 Ultra with 512GB of unified memory and 4TB of storage. Can't beat that. Can't even come close. Run LLMs locally. Nearly 2x the GPU performance. The video encoders and decoders are doubled. Forget the whole M3 vs M4 argument; it's just not relevant. If you don't need an Ultra, you know it. If you do need an Ultra, you know it.
Actually that is exactly the topic of this thread. I agree with just about all of the benefits you ascribe to the M3 Ultra. The fundamental question is: Why didn't Apple release an M4 Ultra? Let me ask you this: All else being equal what would you choose: An M3 Ultra? Or an M4 Ultra?
 
I don't believe that is his position.


Actually that is exactly the topic of this thread. I agree with just about all of the benefits you ascribe to the M3 Ultra. The fundamental question is: Why didn't Apple release an M4 Ultra? Let me ask you this: All else being equal what would you choose: An M3 Ultra? Or an M4 Ultra?
There is no M4 Ultra. Will I buy whatever comes next with more RAM? Yes. Until then, we can't make Apple produce anything. I'd prefer Apple create a single-package M5 Extreme SoC with 1TB of RAM or more. I'd be happy to buy it. I'd rather it be in a Mac Pro, just to be able to put a bunch of storage in it.
 
The fundamental question is: Why didn't Apple release an M4 Ultra? Let me ask you this: All else being equal what would you choose: An M3 Ultra? Or an M4 Ultra?

I prefer to buy what is available at the time I need to buy. I buy on features, not promises. There is no M4 Ultra today. Your question is 100% irrelevant.

People who say they would have bought an M4 Ultra, but not an M3... I don't believe they would have bought either. They are just venting and looking for excuses not to buy. We see that all the time here.

The topic was worded as a question. As such it's fair game to discuss the other side. I bought the M3 Ultra based on what it could do. But I bought for my needs, not the OP's.

As for why Apple didn't produce an M4 Ultra: it's all idle speculation. Only Apple knows. The answer doesn't change the reality that what we have is the M3 Ultra. Does it do what you need it to do or not? Ultimately that is the question, no matter how much some might want to avoid it.
 
People who say they would have bought an M4 Ultra, but not an M3... I don't believe they would have bought either. They are just venting and looking for excuses not to buy. We see that all the time here.
That is a bold statement, because I would have bought the M4U, but in no way am I gonna buy an older-gen M3U.
I need both singlecore AND multicore power. Working with 250+ Logic Pro tracks here.

My current M2U is just 21 months old; I can wait a bit longer for the M5 or M4U. I ain't buying just for the sake of buying. It must be a REAL upgrade in both SC and MC power. The M4M can deliver that (both 20% more). I am at 50-75% load all of the time working at 48 kHz; with enough of a power boost I can finally get the rig to 96 kHz.

 
I have a question for people who know this better than me.

Here we discuss the relative performance of the M4M, M3U, M2U, etc., based on various existing benchmark tests. Is it true that the multi-core test results somewhat depend on how the testing tasks can take advantage of the multiple cores and multiple threads? For example, it's known that Photoshop does a poor job of utilizing multiple cores. So if the benchmark tests include tasks like this, the M chips that have more cores will be penalized, at least partially, because of this.

On the other hand, on a workstation it is possible that we throw multiple tasks at it, regardless of whether each task is optimized for multiple cores. Even if the individual tasks are poorly optimized for multiple cores, can the sheer number of cores of the Ultra make it outperform the Max by more than what the benchmark tests would imply? For example, if the multi-core score of the M3U is only 1.2 times higher than the M4M's (which sounds like very poor value) because some of the test tasks are not well multithreaded, could the M3U nevertheless be more than 1.3 times faster than the M4M when we simply throw many different tasks at it at the same time?
 
I was never in the market for an Ultra; I was just after an M4 Max, so the update worked out well for me.
If I were looking for an Ultra I would have been disappointed, though. Sure, the M3 Ultra is a new chip, but it's just that number that makes it 'feel' older than it is!
 
That is a bold statement, because I would have bought the M4U, but in no way am I gonna buy an older-gen M3U.
I need both singlecore AND multicore power. Working with 250+ Logic Pro tracks here.

My current M2U is just 21 months old; I can wait a bit longer for the M5 or M4U. I ain't buying just for the sake of buying. It must be a REAL upgrade in both SC and MC power. The M4M can deliver that (both 20% more). I am at 50-75% load all of the time working at 48 kHz; with enough of a power boost I can finally get the rig to 96 kHz.

It's not older in any way. It just came out. You're also talking about wanting an M4 Ultra, but Apple said there's no interconnect for the M4 Max SoCs, and has told media that it doesn't plan to release an Ultra variant of every generation. I want an Extreme that isn't 4x Max but one SoC designed with 1TB or more of RAM built into a single package instead of interconnected. But anyone who compares the M3 Ultra or M4 Max can see that it destroys anything comparable money can buy. The M3 Ultra with 512GB of unified memory beats out anything short of $100k worth of Nvidia hardware due to limitations in VRAM.

You must buy what's available to you and what you can afford. Wait all you want, but an M4 Ultra probably won't happen. When the M5 Ultra or better happens, I will upgrade again as long as there are larger unified-memory options. You know if you need more unified memory. And you know if you would be happy with a Max. It's the individual buyer that decides. If you want to wait, more power to you. But for people who want 512GB of unified memory for LLMs and the like, it's available right now at a truly bargain price compared to any competition.
 
I have a question for people who know this better than me.

Here we discuss the relative performance of the M4M, M3U, M2U, etc., based on various existing benchmark tests. Is it true that the multi-core test results somewhat depend on how the testing tasks can take advantage of the multiple cores and multiple threads? For example, it's known that Photoshop does a poor job of utilizing multiple cores. So if the benchmark tests include tasks like this, the M chips that have more cores will be penalized, at least partially, because of this.

On the other hand, on a workstation it is possible that we throw multiple tasks at it, regardless of whether each task is optimized for multiple cores. Even if the individual tasks are poorly optimized for multiple cores, can the sheer number of cores of the Ultra make it outperform the Max by more than what the benchmark tests would imply? For example, if the multi-core score of the M3U is only 1.2 times higher than the M4M's (which sounds like very poor value) because some of the test tasks are not well multithreaded, could the M3U nevertheless be more than 1.3 times faster than the M4M when we simply throw many different tasks at it at the same time?
Short answer is you are correct.

If your workflow is designed to take advantage of every CPU (or GPU) core available, then the M3 Ultra will be faster than the M4 Max every time, even though individual cores on the M4 Max are a bit faster. The slight loss in single-threaded speed of the M3 Ultra is made up for by the presence of twice as many cores... if your workflow makes use of it.

A grocery store with 32 slightly slower cashiers can service more customers than a grocery store with 16 slightly faster cashiers, even if each individual customer takes a little longer per transaction.

If your workflow is not designed to take advantage of every CPU core available (for example, a Windows 11 virtual machine on Parallels limited to using 4 cores on Apple Silicon, regardless of how many your CPU has; or if you need the fastest HTML rendering, which, I believe, is mostly a single-threaded task), then you will likely find the M4 Max to be faster.

However, if you're running multiple simultaneous instances of this unoptimized workflow, as a multiuser server might (as in the above example, running multiple virtual machines all at the same time, with each one busy doing something, and each one individually only using 4 cores), then the M3 Ultra will likely be faster with them all going full-out, since each VM doesn't need to share cores with the other VMs.
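That many-VMs scenario can be sketched with a toy throughput model (the per-core speed numbers are made up purely for illustration):

```python
# Toy throughput model: many independent jobs, each capped at 4 cores,
# run on (a) 16 faster cores vs. (b) 32 slower cores.
# The per-core speed numbers are made up purely for illustration.

def throughput(total_cores: int, core_speed: float,
               cores_per_job: int = 4) -> float:
    """Aggregate work per second when enough independent jobs are
    queued to keep every core busy (each job capped at cores_per_job)."""
    concurrent_jobs = total_cores // cores_per_job
    return concurrent_jobs * cores_per_job * core_speed

fast_16 = throughput(16, core_speed=1.10)  # fewer, ~10% faster cores
slow_32 = throughput(32, core_speed=1.00)  # twice the cores, slightly slower

print(f"16 fast cores: {fast_16:.1f} work units/s")
print(f"32 slow cores: {slow_32:.1f} work units/s")
```

A single 4-core-capped job still finishes sooner on the faster cores (4 × 1.10 vs. 4 × 1.00 work units/s); the higher-core-count chip only pulls ahead once the queue is deep enough to keep all of its cores busy.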

You have to pick the machine that's right for your work. If your work is an even mix of single-threaded and multi-threaded things, then compromises will have to be made. You have to pick faster single-threaded performance or faster multi-threaded performance. And that increase in multi-threaded performance needs to be able to pay for the extra cost.

It's not uncommon for high core-count CPUs to run at lower clock speed than CPUs with a lower core-count, even when they're both the same generation, due to on-die heat constraints. Back in the Mac Pro trashcan era, you could get Xeon CPUs with 4-cores at 3.7GHz, 6 cores at 3.5GHz or 12 cores at 2.7GHz. They were all the same generation chip, but the speeds had to be dialed down with more cores to stay within the thermal envelope. Which model you bought would depend on your main kind of workflow.

The M3 Ultra versus M4 Max is really no different.

I personally chose the M4 Max because I wanted the "snappiest" Mac available, and my work is mostly not really multicore optimized — the majority of my work is text-based, office app type of work. But I want general system responsiveness, large complex PDFs to render fast, fast performance for my (one) Windows VM, etc., and top performance out of the two — but less frequently used — apps that do use as much CPU and GPU power as is available (Osirix MD and Falcon MD). While a Mac mini M4 Pro would have suited most of my needs, it would lag behind the M4 Max in Osirix and Falcon (mostly due to the 2x faster GPU in the Max over the Pro), and I also wanted the larger array of ports and display support offered by the Studio. I very strongly considered the M3 Ultra, but decided that it wouldn't have performed any better for me 95% of the time. And that 5% usage case when it would perform (significantly) better wasn't worth the 2x increase in price.
 
It's not uncommon for high core-count CPUs to run at lower clock speed than CPUs with a lower core-count, even when they're both the same generation, due to on-die heat constraints. Back in the Mac Pro trashcan era, you could get Xeon CPUs with 4-cores at 3.7GHz, 6 cores at 3.5GHz or 12 cores at 2.7GHz. They were all the same generation chip, but the speeds had to be dialed down with more cores to stay within the thermal envelope. Which model you bought would depend on your main kind of workflow.

Thank you. What you said is what I suspected (but did not dare to say for sure).

The paragraph quoted above brought back memories. My current machine is an iMac Pro. When I decided which iMac Pro to purchase, I was not budget-limited, and I just wanted to hit a good middle point between single-core and multi-core performance. So instead of going for the Xeon with the highest core count, I opted for the one with fewer cores and a higher clock rate. Now I wonder if I made the right decision. Here are the current Geekbench single-core scores for the iMac Pro:

2.3 GHz (18 cores): 1359
2.5 GHz (14 cores): 1349
3.0 GHz (10 cores): 1339 <-- This is the one I have.
3.2 GHz (8 cores): 1306

This is entirely different from what I expected when I purchased it a long time ago. It turns out that, even for single-core performance, I should have just purchased the 18-core model. The higher clock rates of the 10-core and 8-core variants do not lead to higher scores at all. It's true that the first three aren't that different, but they should be very different given the very different clock rates, right? Do you have any insights on why the single-core scores for these iMac Pros do not reflect the difference in clock rates?
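A quick sanity check on the scores above: if single-core speed tracked the listed clock, points-per-GHz would be roughly constant across the four configurations. Computing it shows it is anything but, which already suggests the listed figures are base clocks rather than what a single-threaded run actually sustains:

```python
# Geekbench single-core score divided by listed base clock for the
# four iMac Pro configurations quoted above.
scores = {2.3: 1359, 2.5: 1349, 3.0: 1339, 3.2: 1306}

per_ghz = {ghz: score / ghz for ghz, score in scores.items()}
for ghz, ratio in sorted(per_ghz.items()):
    print(f"{ghz} GHz base clock: {scores[ghz]} pts -> {ratio:.0f} pts/GHz")
```

The 18-core part delivers by far the most points per listed GHz, the opposite of what base clocks alone would predict.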

I am eagerly waiting for the official Geekbench results for the M3U and M4M Studio, so I can make a final call on which one to get.
 
Thank you. What you said is what I suspected (but did not dare to say for sure).

The paragraph quoted above brought back memories. My current machine is an iMac Pro. When I decided which iMac Pro to purchase, I was not budget-limited, and I just wanted to hit a good middle point between single-core and multi-core performance. So instead of going for the Xeon with the highest core count, I opted for the one with fewer cores and a higher clock rate. Now I wonder if I made the right decision. Here are the current Geekbench single-core scores for the iMac Pro:

2.3 GHz (18 cores): 1359
2.5 GHz (14 cores): 1349
3.0 GHz (10 cores): 1339 <-- This is the one I have.
3.2 GHz (8 cores): 1306

This is entirely different from what I expected when I purchased it a long time ago. It turns out that, even for single-core performance, I should have just purchased the 18-core model. The higher clock rates of the 10-core and 8-core variants do not lead to higher scores at all. It's true that the first three aren't that different, but they should be very different given the very different clock rates, right? Do you have any insights on why the single-core scores for these iMac Pros do not reflect the difference in clock rates?

I am eagerly waiting for the official Geekbench results for the M3U and M4M Studio, so I can make a final call on which one to get.
In the case of the iMac Pro Xeon chips, I suspect what is happening is that these clock speeds are the base, not boost, speeds. When running a high-intensity single core workload, they all probably boost to a similar speed, and the small difference could be due to a better thermal envelope on the higher core count chips. They can't sustain that speed on all cores at once, of course, but when only using one of the cores thermal throttling seems to happen a little later/less with the higher core count.
 
It must be a REAL upgrade in both SC and MC power.
For *you*. As I mentioned, I care about single-core perf enough for it to make a difference for me, but that isn't true for everyone. If you need more RAM or a big bump in multicore or GPU perf, the M3U is a huge upgrade.
 
Is it true that the multi-core test results somewhat depend on how the testing tasks can take advantage of the multiple cores and multiple threads?
Yes, and that's part of the problem with synthetic benchmarks.

This is clearly shown in the Geekbench Metal results for the M4 Max vs. the M3 Ultra. The M3 Ultra critics point out that its Metal results are only slightly better than the M4 Max's.

But... that may simply not be relevant to one's use case.

For example, LLMs are processed essentially by multiplying vectors. The GPU cores on the M series are used for that. However, LLMs have no need for all the other things the GPU cores can do (for graphics). But the Geekbench Metal score is weighted across all those graphics instructions.
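To make concrete why memory matters more than a graphics-weighted GPU score here: generating each token of a dense model requires streaming essentially all of the weights from memory, so decode speed is roughly capped by bandwidth divided by model size. A back-of-envelope sketch, where the bandwidth and quantization figures are assumptions for illustration rather than measured values:

```python
# Rough ceiling on decode tokens/sec for a memory-bandwidth-bound LLM.
# Model size and memory bandwidth are assumed figures, not measurements.

def max_tokens_per_sec(model_bytes: float, bandwidth: float) -> float:
    """Upper bound on tokens/second if each generated token must
    stream the full weight set from unified memory once."""
    return bandwidth / model_bytes

GB = 1e9
model_bytes = 70e9 * 0.5        # ~35 GB: 70B params at 4-bit quantization
bandwidth = 800 * GB            # assumed unified-memory bandwidth, bytes/s

print(f"~{max_tokens_per_sec(model_bytes, bandwidth):.0f} tokens/s ceiling")
```

A GPU with far higher raw compute but only 24 GB of VRAM cannot hold such a model at all, which is the unified-memory point being made in this thread.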

So once again, equipment decisions have to be made by use case to make sense.
 
There is a whole industry of marketing firms that infiltrate IT forums like Reddit and MacRumors to promote new products from Apple and other companies by suggesting all kinds of things you can do with those products. It influences consumption.

Buyer beware.
 
This is entirely different from what I expected when I purchased it a long time ago. It turns out that, even for single-core performance, I should have just purchased the 18-core model. The higher clock rates of the 10-core and 8-core variants do not lead to higher scores at all.
It's because the L3 cache on the 18-core Xeon is larger, and it's shared across all the cores. This isn't true for all processors.
 
It's because the L3 cache on the 18-core Xeon is larger, and it's shared across all the cores. This isn't true for all processors.

It (a faster result from more cache despite a lower clock) also isn't true for all workloads on the same processor, either; it depends on how "cache-friendly" the workload is.

Benchmarks help give you an idea, but if your workload is nothing like the benchmark then....
 
Why do people keep saying this? Aren't the YouTubers there to help people make informed decisions? I watched the Luki Miani video and I felt he was right on the money. He was fair, balanced, and essentially said what many here have been saying: The Ultra offers the best performance for specific use cases despite it using the previous generation technology. However it is his opinion, which appears to be shared by others in this forum, that the lack of current technology puts a damper on it and is setting it up for failure. Why is that "click-baity"?
You ask "Aren't the YouTubers there to help people make informed decisions?" and I say NO, the YouTubers are not there to help people make informed decisions. Exactly the opposite.

Most YouTubers post sensationalism for clicks: clickbaiting for entertainment value. Go to YouTube for the entertainment if that is how you like to spend your time, but realize that the YouTube sensationalism will falsely skew any attempt at "people making informed decisions."
 
I don't believe that is his position.


Actually that is exactly the topic of this thread. I agree with just about all of the benefits you ascribe to the M3 Ultra. The fundamental question is: Why didn't Apple release an M4 Ultra? Let me ask you this: All else being equal what would you choose: An M3 Ultra? Or an M4 Ultra?
Yours is the wrong question, because:
A) All else is not equal, so making that a qualification is a false starting point. The question fails.
B) We have no M4 Ultra to compare against.

The M3 Ultra is a unique beast, the only M-series chip with access to 512 GB of unified memory. For all we know, building Ultras is so difficult [costly] that Apple may intend to offer an Ultra only every other M generation. Or Apple might deprecate the Ultra in Studios forever after a new Mac Pro is released.

Sure, we can fantasize about future Macs, but denigrating today's choices over something we know nothing about is IMO inappropriate.
 
Thank you. What you said is what I suspected (but did not dare to say for sure).

The paragraph quoted above brought back memories. My current machine is an iMac Pro. When I decided which iMac Pro to purchase, I was not budget-limited, and I just wanted to hit a good middle point between single-core and multi-core performance. So instead of going for the Xeon with the highest core count, I opted for the one with fewer cores and a higher clock rate. Now I wonder if I made the right decision. Here are the current Geekbench single-core scores for the iMac Pro:

2.3 GHz (18 cores): 1359
2.5 GHz (14 cores): 1349
3.0 GHz (10 cores): 1339 <-- This is the one I have.
3.2 GHz (8 cores): 1306

This is entirely different from what I expected when I purchased it a long time ago. It turns out that, even for single-core performance, I should have just purchased the 18-core model. The higher clock rates of the 10-core and 8-core variants do not lead to higher scores at all. It's true that the first three aren't that different, but they should be very different given the very different clock rates, right? Do you have any insights on why the single-core scores for these iMac Pros do not reflect the difference in clock rates?

I am eagerly waiting for the official Geekbench results for the M3U and M4M Studio, so I can make a final call on which one to get.
IMO Geekbench is not where you should be looking. The various large differences between the M3U and M4M will not be accurately described by Geekbench, unless of course you coincidentally have a workflow that Geekbench emulates. E.g., where does Geekbench account for the 4x as much RAM available to M3U users? That single parameter is all-important to some workflows, but meaningless to others.

I suggest looking hard at what your 2025-2030 workflow may want to do, then plan your new box configuration accordingly. With special attention to ever-increasing RAM demands.
 
I suggest looking hard at what your 2025-2030 workflow may want to do, then plan your new box configuration accordingly. With special attention to ever-increasing RAM demands.

Yes, of course. The main reason I am thinking about an M3U is its large RAM. My current iMac Pro has 128 GB; I occasionally hit this limit, though not frequently, so it has been OK thus far. It's very likely I will hit this limit more often in the next few years, so an M4M with 128 GB of RAM is not likely to remain sufficient for long.

Currently my options are:

1. Get an M4M Studio with max RAM and run it for two to four years, then replace it with whatever is available then (maybe an M6M or M5U?).

2. Get an M3U Studio with 256 GB of RAM, and use it for 4 to 6 years.

#1 will be more expensive (after combining the cost of two generations of Studios), faster in many areas, and likely to last slightly longer than #2 (again, after combining two Studios), but it will probably be challenged by the RAM limit from time to time.

#2 will be less expensive, RAM sufficient, but often a bit slower.

Currently I am inclined to #2, but I won't make a decision very soon.
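For what it's worth, the two options can also be roughed out numerically; the prices below are placeholders to make the trade-off concrete, not real configure-to-order quotes:

```python
# Sketch of the two upgrade paths, with placeholder prices (USD).

option1_purchases = [6_000, 6_000]   # assumed: M4M Studio now + successor later
option2_purchases = [7_500]          # assumed: one M3U Studio with 256 GB RAM

total1 = sum(option1_purchases)
total2 = sum(option2_purchases)

print(f"Option 1 (two machines over ~6 years): ${total1:,}")
print(f"Option 2 (one machine over ~6 years):  ${total2:,}")
print(f"Premium paid for option 1: ${total1 - total2:,}")
```

Whether the option 1 premium is worth it comes down to how often the 128 GB ceiling would actually bite during those years.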
 