Not just their first but probably their only computer system ever. For MOST people buying their first computing systems this year? An iPhone would be enough!
There are a LOT of apps that are not written to take advantage of Apple Silicon. And, as a result, they will run poorly, even when ported over, primarily because, being cross-platform, the developers don’t want to expend a lot of effort to make things work properly.

Question, since I don't know: are these apps written for the Mac, or Windows-based applications ported over to the Mac?
I would be curious about the audio and video production apps, as I haven't heard about those bringing the Mac to its knees. Interesting.
Those rumors were from folks that weren’t paying attention for almost three years now (or that valued social media attention highly). Apple’s been telegraphing what Apple Silicon would be this entire time: there would be a baseline processor, and every more performant tier would differ by number of cores. And, by the time of the Mac Studio, there were still those thinking that whatever the Mac Pro turned out to be was NOT going to follow a very clearly laid out pattern.
It’s not a knee-jerk move; it’s what it was going to be all along. It’s not supposed to be one level below a fully functioning server. It’s supposed to be the fastest Mac that someone who WANTS a Mac can buy, which also happens to offer PCIe slots as a feature.
Considering that the Mac Pro at its HIGHEST yearly unit sales likely never amounted to more than half of 1% of Apple’s yearly revenues, the sales of the Mac Pro, good OR bad, won’t have a material effect on Apple’s bottom line.
Yeah, I didn't miss the Mac Pro bit. Very interesting.
The other thing that grabbed my attention was the 2.5 TB/s (UltraFusion!) interconnect bandwidth (at 27:55). Johny then said that it's "more than 4 times the bandwidth of the leading multi-chip interconnect," which I'd think refers to AMD's Infinity Fabric 3.0 at 400 GB/s bidirectional. Oddly, though, he could have also said "more than 6 times the bandwidth," so I'm not sure what to make of that.
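For what it’s worth, the arithmetic behind that puzzle checks out directly (figures are the ones quoted above from the keynote and from AMD’s published Infinity Fabric 3.0 spec):

```python
# Comparing the quoted figures: UltraFusion's 2.5 TB/s die-to-die bandwidth
# vs Infinity Fabric 3.0's 400 GB/s bidirectional.
ultrafusion_gbps = 2500          # GB/s (2.5 TB/s)
infinity_fabric_gbps = 400       # GB/s
ratio = ultrafusion_gbps / infinity_fabric_gbps
print(ratio)  # 6.25 -> so "more than 6 times" would indeed also have been true
```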
Anyway, the M1 Ultra has 800 GB/s memory bandwidth, 400 GB/s per die. So, 2.5 TB/s die-to-die seems... excessive. I guess there's some kind of cache unification that requires enormous throughput. Or perhaps the interconnect is designed to handle more dies in a different configuration.
Edit: After ruminating a bit, I think that Johny’s “more than 4 times the bandwidth” statement may be a hint at the next-gen interconnect/interposer.
Grabbing a napkin… Since each die can do 2.5 TB/s, and four dies would require six direct interconnects to be fully connected (3 on each die), each interconnect would handle 2.5/3 TB/s = 833 GB/s. Infinity Fabric can do 400 GB/s switched, so 800 GB/s total in a quad setup, and thus an average of 800/4 GB/s = 200 GB/s overall. 4 * 200 < 833 < 5 * 200, QED… Yeah, that’s the best I could massage the numbers.
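The napkin math above, spelled out (all numbers are this post’s assumptions about a hypothetical fully connected 4-die package, not anything Apple has stated):

```python
# Each die exposes 2.5 TB/s of UltraFusion bandwidth, split across 3 direct
# links to the other dies in a fully connected 4-die package.
dies = 4
links_per_die = dies - 1                 # 3 links leaving each die
per_link_gbps = 2500 / links_per_die     # ~833 GB/s per link

# Infinity Fabric comparison from the post: 400 GB/s switched, so 800 GB/s
# total in a quad setup, averaged over the 4 processors.
avg_if_gbps = (400 * 2) / dies           # 200 GB/s

print(per_link_gbps / avg_if_gbps)       # ~4.17, i.e. "more than 4 times"
```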
Edit 2: I looked into Infinity Fabric some more and while it’s not switched, it’s unclear to me what the actual bandwidth would be in different configurations and my numbers above might be correct… ish. It seems like there are a lot of rules for different configurations.
Anyway, I saw a Twitter post on March 8 by the Japanese reverse-engineering firm TechanaLye indicating that they’d analyzed the UltraFusion region on the M1 Max. Nice die shots, but, sadly, the details would be in one of their paid reports.
Gotcha, so Gurman has no actual sources at TSMC nor on the Mac Pro R&D team.
Wrong, but it was never arranged as a 4-tile square; it actually looks like a 4-chip strip (I prefer to call it dominoes). Its UltraFusion is daisy-chainable north/south; the M2 Max requires memory connected at its sides, so a 4-tile square arrangement would block two memory channels on each SoC.
It's difficult without exposing the source, but it's not just that UF+ has a north/south path; its connection points are not at the M2 Max's edges but close to the die's center. The M2 Extreme/Ultra UltraFusion bridge is likely more like a carpet on which an M2 Max lies, with north and south interfaces for additional chips. It may even resemble Nvidia's 4-GPU NVLink arrangement.
A thing that intrigues the few engineers with access to the same sources: it's not just the M2 Max that has built-in UltraFusion provisions; the M2 Pro seems to as well. Maybe not to daisy-chain M2 Pros, but for other added capabilities on later devices, such as PCIe 5 buses or even a dGPU. It's hard to guess why the M2 Pro also includes what seems to be a lower-rank UltraFusion.
Edit: the M1 Extreme also briefly existed. It was based on quite a long bridge with two SoCs on each side connecting to each other; besides being expensive, it had memory-related issues which later doomed it.
My understanding of the current situation is that much of the software still isn’t coded to actually get data to the GPU fast enough, so the Ultra never gets to show its uplift.
There are workflows that do showcase phenomenal performance, but that seems to only be from vendors who have re-architected how their software works.
Apple ran an entire session at WWDC 2022 about optimizing and scaling GPU code in applications.
Scale compute workloads across Apple GPUs - WWDC22 - Videos - Apple Developer
Discover how you can create compute workloads that scale efficiently across Apple GPUs. Learn how to saturate the GPU by improving your… (developer.apple.com)
It was not just the "chip engineers" that expect app developers to do their jobs well; it's closer to: Apple expects developers to do their jobs well. Apple has rolled out more tooling to help with doing optimizations. With the tools and the tutorials, it should be more tractable for the less lazy to do something at this point (at least for the two-die "Ultra" class solution). Apple expects developers to optimize their apps.
Apple is not particularly likely to do power-bleeding, triple-backward hardware somersaults trying to make badly optimized code run faster. The hardware is there. If developers are using bubble sort where quick/merge sort would work better, then it isn't Apple's job to 'fix' that.
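A toy illustration of that algorithm-choice point, in Python (nothing Apple-specific here; just the O(n²) vs O(n log n) gap that no hardware papers over):

```python
import random
import timeit

def bubble_sort(a):
    """O(n^2) comparison sort -- the 'dubious code' stand-in."""
    a = list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

data = [random.random() for _ in range(2000)]
t_bubble = timeit.timeit(lambda: bubble_sort(data), number=1)
t_merge = timeit.timeit(lambda: sorted(data), number=1)  # Timsort (merge-based)
print(t_bubble / t_merge)  # typically hundreds of times slower
```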
Is Apple going to get perfect linear scaling with zero code optimizations across 2 and then 4 dies? Probably not. AMD and Nvidia aren't with monolithic dies either.
Even if Apple made some improvements in "UltraFusion 2" to smooth out some very highly sensitive NUMA characteristics between 2 dies, it would likely pop back up at the 4-die coupling stage. So the "issue" isn't going to completely go away with hardware covering up dubious code assumptions.
CPU code really isn't a huge problem for well-crafted scaling algorithms. (E.g., the NASA Truss benchmarks on Apple's Studio marketing page scale; the Adobe stuff doesn't. That is not surprising at all, not even in the slightest. Adobe is relatively slow to optimize the bulk of their code base. That is not a hardware issue in the slightest.)
With M2, folks can use better Xcode tools and tutorials. Can't solve this issue solely with hardware; it's time at least as much as hardware.
The M1 Extreme likely would have had several other problematic issues besides GPU scaling. Economics (four largish dies, multiple interposer fusion chips, more expensive packaging, etc., and yet much, much lower volumes). Apple probably needs a die that isn't focused on being a MBP 16" chip. (E.g., 4 TB controllers per die in a 4-die package is extremely likely at least 8 more TB controllers than you need.)
Doing a 4-die package with TSMC N3 (or N4P) would make lots more sense to manage the overall package size. M2 isn't bringing magic sprinkles, but it should/could be done with far more appropriate tech that is independent of the microarchitectural issues. Bringing the Extreme back under the 300W zone would help the operational environment for the package.
Unless Apple had a major addition for PCIe 4 provisioning, the M1 Extreme was also likely weak in the area of PCIe provisioning for workstation-class jobs.
I don’t even expect it to double the power of the Mac Studio. The Mac Studio, after all, IS currently the fastest Mac Apple makes and faster than the Macs that came before it. Even if it’s only 20% faster, it’ll be the fastest Mac yet and, for those who want/need the fastest Mac, that’s what they’ll get.
In my mind, the differentiators will be related to RAM, storage, physical port options and other things above/beyond just CPU/GPU performance (maybe more ProRes encoders/decoders, stuff like that). I’m not thinking “how could this beat a Mac Studio”. I’m thinking more like, “Who, specifically, are the very few that need something that the Mac Studio doesn’t offer as options… and how many of that small group are not going to like what Apple presents?” I truly expect that some users that are waiting to see what it is (and have plans to buy it) will not like what they see because it drops some old “Mac Pro” expectation and they may drop macOS orrr… just use their Intel box until it dies. And I believe Apple’s factored in this loss of what can’t be more than a few thousand at this point.
For raw power the RTX 4090 is still the king, but don’t forget that Nvidia’s most powerful card, the RTX 4090, has only 24GB, and the RTX 8000 has 48GB. If you want to do AI with large models, having direct access to 192GB is huge. We know that some of the parallel performance on Apple Silicon is stunning, like fluid simulation: http://hrtapps.com/blogs/20220427/
And the IO when working with video is best in class when running a lot of video streams simultaneously. Also, Substance Painter can easily eat 24GB of video memory if you use many layers…
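The capacity argument above reduces to simple arithmetic; a sketch with illustrative numbers (the model size is an assumption for illustration, not a claim about any specific product):

```python
# Does the working set fit in device memory at all? fp16 weights assumed.
params_billion = 70
model_gb = params_billion * 2          # 2 bytes per fp16 parameter -> 140 GB

rtx_4090_gb = 24
m2_ultra_unified_gb = 192

print(model_gb <= rtx_4090_gb)         # False: won't load on a single 4090
print(model_gb <= m2_ultra_unified_gb) # True: fits in unified memory (if slower)
```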
They wouldn’t sell a ton of them, because the Mac part is secondary to the far more important NVidia part. And NVidia parts are ALWAYS going to be cheaper in a Windows box. They’re forced to Windows now because there’s no NVidia on Mac. They’d STILL be forced to Windows due to the Mac prices.

BUT... if Apple allowed NVidia RTX or Quadro cards to work in the new Mac Pro... they would sell a ton of them to the kinds of people who need CUDA and other things. Right now those customers are forced into Windows machines.
The Mac Pro only appeals to a small niche market... so why are they making it even smaller?
🤔
Just a matter of convenience built over the years. We would have to move all of the user directories to external disks. Certainly doable, but it will not be a simple Mac-to-Mac transfer, and it will not be possible to do it using the Mac software.

Why would you need to boot from an external enclosure? You should not have any files other than your OS on the system drive anyway, so even 256GB should be enough to run the system fast; definitely store files on TB drives. None of these are new problems either; all recent Macs aside from the last Mac Pro had these limitations.
I think you are correct; I was also unable to find any non-Apple GPUs that have access to 128 GB of RAM. So, for those folks whose work REQUIRES that much VRAM, Apple’s the only game in town. Having more VRAM absolutely doesn’t mean it’s faster. Having more VRAM just means it runs in the first place.

The bandwidth of the M2 Ultra is WAY slower than a workstation GPU, which makes it totally meaningless. Having more VRAM doesn't really mean it's faster, and there are so many factors to consider. Besides, the Apple GPU itself is way slower than the RTX 30 series, so more VRAM doesn't mean better or faster.
OH, actually, I just read the post at Investopedia which says that “The halo effect is a term for a consumer's favoritism toward a line of products due to positive experiences with other products by this maker.” In that case, it’s likely been the iPhone for quite a while. Guess I never knew what a “halo product” was!

Agree or disagree, Vision Pro is now Apple's halo product. That's their vision of the future.
Having more VRAM is meaningless when the bandwidth is much slower and GPU core performance is slow. It's like assuming more RAM will give more performance.

I think you are correct, I was also unable to find any non-Apple GPU’s that have access to 128 GB of RAM. So, for those folks that have work that REQUIRES that much VRAM, Apple’s the only game in town. Having more VRAM absolutely doesn’t mean it’s faster. Having more VRAM just means it runs in the first place.
Yeah, that post certainly hit the nail on the head!

The fastest Mac - it matches the M2 Ultra Mac Studio - it simply has user-accessible PCIe slots. That's the only differentiator. Although it'll be $40K cheaper on the top-end configuration than the outgoing Intel Xeon-based Mac Pro from 2019: https://www.theverge.com/2023/6/5/23750154/apple-m2-ultra-mac-pro-cheaper-intel-mac-pro
Nvidia has a single card with 128 gigs of RAM on it? I wasn’t able to find it; what’s the part number?

Having more VRAM is meaningless when bandwidth is much slower, GPU core performance is slow, and consume too less power. It's like having more RAM will give more performance.
Btw, Nvidia already has 80GB of VRAM per card and you can add as many cards as you want, which is WAY more than 128GB. Apple Silicon can't really do that.
I said 80GB of VRAM, which is the A100. You are ignoring that Apple Silicon's unified memory works differently. And like I said, PCs have way faster bandwidth, which already outperforms Apple Silicon, and they can just add whatever GPUs they want, which can go beyond 128GB of VRAM.

Nvidia has a single card with 128 Gigs of RAM on it? I wasn’t able to find it, what’s the part number?
Having more VRAM is meaningless if the use case doesn’t require it, certainly! If the use case can be worked with chunks of RAM smaller than 80 (and I’d imagine most Nvidia use cases are written to require far less contiguous RAM than that for obvious reasons), then it would make sense (financial and otherwise) for a user to leverage that solution.
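A back-of-envelope for that chunked-workload case (all sizes are illustrative assumptions):

```python
import math

# Streaming a large working set through a smaller VRAM pool in chunks.
data_gb = 120      # total working set
chunk_gb = 16      # chunk sized to fit comfortably in a 24 GB card
passes = math.ceil(data_gb / chunk_gb)
print(passes)      # 8 passes across PCIe instead of one resident working set
```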
Kinda sad the PCIe slots are only PCIe 4, not 5 or 6.
Gosh, I hope not; nevertheless I have to agree with you, it looks like they have hitched their wagon.

Agree or disagree, Vision Pro is now Apple's halo product. That's their vision of the future.
Since the Apple GPU itself has poor performance, I wouldn't expect too much from it.
Oh, no, I understand fairly well how the unified memory works. For example, the CPU can write a block to memory and the GPU can read the block, without the CPU having to queue a packet of data to shuffle across PCIe first. Following that, the GPU can update the block, the CPU can read the result, then write a new value in that block, have the GPU ready to read that, etc.

I said 80GB of VRAM which is A100. You are ignoring that that Apple Silicon's unified memory works differently. And like I said, PC has way faster bandwidth which already outperforms Apple Silicon and they can just add GPU whatever they want which can go beyond 128GB of VRAM.
No MPX slots means that they are probably not ever planning on supporting third party GPUs even as compute accelerators.
I would say NO
There is no way you can add more RAM, especially with an Apple Silicon chip.
Thanks, that's a bummer then.

I would expect not, and even if you could, it would be orders of magnitude slower than the on-package RAM.
It is possible that Apple looked into offering off-package RAM (via an additional memory controller) and found the performance to not be acceptable or it might have caused some type of issue that made it an undesirable path to follow.
Actually, that is why we run multiple GPUs in our systems.

I think you are correct, I was also unable to find any non-Apple GPU’s that have access to 128 GB of RAM. So, for those folks that have work that REQUIRES that much VRAM, Apple’s the only game in town. Having more VRAM absolutely doesn’t mean it’s faster. Having more VRAM just means it runs in the first place.
With NVLink, spanning memory load was somewhat possible (e.g., two linked 24GB VRAM GPU cards could manage 48GB of data). However, it was never widely adopted — presumably why SLI and NVLink are now EOL. Multi-GPU setups (i.e., multiple cards) are still plenty beneficial nowadays, but it’s more about working in parallel (e.g., each renders a different frame or each processes a different simulation).

Actually, that is why we run multiple GPUs in our systems.
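That per-frame pattern is plain data parallelism; a minimal sketch using threads as stand-ins for GPUs (`render_frame` is a hypothetical placeholder, not a real rendering API):

```python
from concurrent.futures import ThreadPoolExecutor

def render_frame(i):
    # stand-in for a self-contained per-frame GPU render job
    return (i, i * i)

# Each worker ("GPU") takes whole frames; no memory is pooled across workers,
# which is exactly why this scales without NVLink-style memory spanning.
with ThreadPoolExecutor(max_workers=4) as pool:
    frames = list(pool.map(render_frame, range(8)))
print(len(frames))
```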
Even bottom-of-the-product-stack 3D software (Poser 13) will use as many GPUs as you can stuff in the case. I'm looking at getting another RTX 3060 to go in mine.
No, probably not, but I also don’t believe that the new M2 Ultra is as fast as, say, my Mac Pro would be if I added the latest MPX modules available.

Do you think your 2019 system’s GPU outperforms the one Apple just released?
In the world of digital content creation, there are dozens, if not hundreds, of native Apple apps and plugins that push the hardware to its limits. Anything that simulates inter-particle forces, like fluid dynamics, will quickly show you how powerful you think your CPU/GPU is. I’m not personally into audio, but I’ve seen examples where people have tons of separate audio tracks layered up with complex effects, all playing concurrently in real time. It doesn’t take much effort to see how you could easily stress the most powerful of systems. Likewise with video: throw a dozen video layers into something like After Effects and add visual effects to them, and it very quickly stops being real-time. The very idea that ‘pro’ users can’t completely overwhelm the Mac Studio is beyond ludicrous.

Question, since I don't know: are these apps written for the Mac, or Windows-based applications ported over to the Mac?