I know little about multi-CPU setups and how to spread computing over multiple systems. But, as a layman, it seems to me that AI computing is largely about how many compute units (CPUs/GPUs) you can throw at it. We know NVIDIA is the name of the game in AI. However, Apple Silicon delivers more performance per watt, which is important not only for environmental reasons but also for heat dissipation, which I imagine matters in datacenters.
Relative to other datacenter-targeted solutions, Apple Silicon's I/O bandwidth capabilities are substantially lagging, so a cluster's internodal communication would be seriously limited versus other options. Apple Silicon is aimed more at local "AI inference" than at remote training or at accommodating the largest possible models.
Additionally, the datacenter folks are drifting toward a delusion that power consumption doesn't matter (just rent out a whole nuclear plant if you need to). If you are not sucking every last drop out of the local energy grid, then you are "being left behind". I strongly suspect Apple isn't going to give in to the fear-of-missing-out mania and chase maximally power-consuming datacenters for themselves. There have been a couple of AI hardware startups whose funding dried up because they were following the power-efficient product track. But I also don't think Apple is going to ship "max power consuming" products for general customers either.
There are some rumors that Apple is going to make a chip purely for its own consumption.
" ... At least a trio of companies are believed to be involved with the chip. Apple is said to be handling the overall design of the chip, while Broadcom is said to be providing some networking technology for it. TSMC is expected to begin mass production of the chip in 2026, using its third-generation 3nm process, known as N3P. ..."
Apple Intelligence servers are currently powered by the M2 Ultra chip, and they are expected to start using M4 series chips next year. In an eventual move away from Mac chips for server use, The Information today reported that Apple is developing a new server chip that will offer even faster...
www.macrumors.com
Something specialized for Private Cloud Compute nodes. Google has had a similar arrangement with Broadcom for its datacenter TPU chips (which have relatively high-bandwidth node-to-node communication paths for cluster building).
At Hot Chips 2025, Google went into its Ironwood TPU packages and deployment in an awesome glimpse of the company's AI hardware prowess
www.servethehome.com
If Apple does a datacenter SoC with Broadcom, then it would most likely end up on a custom motherboard, just like Google's TPUs. It isn't "cards for boxes with slots"; it is a specific logic board for server deployment. Not generic customer sales (it won't be a 'prop' for Mac Pro sales, or tightly coupled to Mac Pro component reuse at all).
Bringing the base communication bus of 200-400 GbE Ethernet (or equivalent) onto the SoC package itself also saves power. If power saving is a priority, it doesn't make sense to lean extra hard on generic PCIe connectors (that isn't going to save you power consumption).
This got me thinking: could Apple be designing future Apple Silicon generations to function as clusters, so that you could have a Mac Pro hold one or more boards with a bunch of Apple Silicon chips? Kind of similar to the Afterburner card, but stacked with AS chips?
The Afterburner card has no aux power connectors (very much fitting Apple's 'hate' of wires). If you cap an add-in card at 75 watts, then you are pushed into only the lower-end Apple Silicon dies. Those dies have the most restrictive I/O bandwidth (the minimalistic ports on Mac laptops, iPads, etc. are their primary focus).
An 8-pin connector would easily open things up for a Max-class M-series chip (maybe an Ultra). A 6-pin card would need to combine the slot's 75W with the 6-pin aux's 75W to perhaps cover an M-series with incrementally more complicated power delivery infrastructure.
The current Mac Pro is a bit scaled back on aux power delivery, though:
"...
300W auxiliary power available:
- Two 6-pin connectors delivering 75W of power each
- One 8-pin connector delivering 150W of power
..."
https://support.apple.com/en-us/111343
If a card used the 8-pin, then you only get one card. Even with 6-pin connectors you only pragmatically get three cards (which probably is enough for small clusters in a lab, but not particularly "datacenter" sized clusters). The MP 2019 had four 8-pin connectors [****]. (The rumored "Extreme" SoC was likely supposed to be the recipient of that missing power.)
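To sanity-check the card counts above, here is a back-of-envelope sketch using the aux figures from Apple's spec page (two 6-pin connectors at 75W each, one 8-pin at 150W) plus the 75W each PCIe slot supplies on its own. The card wattages are my own assumptions for a hypothetical Apple Silicon card, not any actual product spec:

```python
# Published 2023 Mac Pro aux budget (support.apple.com/en-us/111343)
# plus standard PCIe slot power. Card power needs below are assumed.
SLOT_W = 75
SIX_PINS, SIX_PIN_W = 2, 75
EIGHT_PINS, EIGHT_PIN_W = 1, 150

# Hypothetical Max/Ultra-class card: slot + 8-pin = 225W.
# Only one 8-pin connector exists, so only one such card fits.
assert SLOT_W + EIGHT_PIN_W == 225
cards_225w = EIGHT_PINS

# Hypothetical mid-tier card: slot + one 6-pin's worth of aux = 150W.
# Two 6-pin connectors, and the 8-pin could be adapted down to feed a
# third card, which is where the "pragmatically three cards" ceiling
# comes from.
assert SLOT_W + SIX_PIN_W == 150
cards_150w = SIX_PINS + EIGHT_PINS

print(cards_225w, cards_150w)  # 1 3
```

Either way, the connector count, not the slot count, is the binding constraint.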
With bus power only (75W), you are more likely constrained to something in the "Pro" class of M-series. By the time you put an Ethernet port, one (maybe two) Thunderbolt USB-C ports, a cooling subsystem, Wi-Fi/Bluetooth, an SSD drive, etc. on the card, you have used up the power budget. They could trim some power by dropping most of that (Wi-Fi, cooling system, Thunderbolt qualification), but would be missing out on a "Mac on a card" business that could span into the retail (non-datacenter-only) space.
There are a number of folks who left the Mac space to follow the "box with slots" path on Windows. A card that could work independently in non-Apple systems would likely sell more units. If it is a self-contained system running full macOS, there is no good reason why it could not work in heterogeneous contexts. Capping at an Mn Pro SoC would also help control costs (so it could sell more). Afterburner was priced so high that it only had a deeply niche market (which evaporated once ProRes processing got mainstreamed across the lineup).
A Mac Pro-only (or primarily Mac Pro-targeted) card is likely doomed. The Mac Pro form factor is not a good datacenter solution, so it won't work particularly well there either. If you need a "card" form factor, OAM would be better than the legacy PCIe card form factor; datacenter liquid cooling solutions are skewed in that direction.
It seems logical to me that Apple would go that way, and it would provide an interesting use case for the Mac Pro, but are there technical reasons this would be a Bad Idea™️?
There is a decent chance that Apple will do its own datacenter-specific SoC (unless they completely punt medium-to-higher-level Apple Intelligence out to 3rd parties (Gemini, Claude, etc.); if they don't have largish inference models to run, then I'm not sure what revenue source would fund that R&D). I don't buy the "doom and gloom" being pitched around Apple AI, so I suspect their internal model building will continue.
[****] Four cards, each with 3-4 Thunderbolt ports, could replicate the clustering setup Apple already enables for Mac Studios grouped in a pod of four. (Each Mac Studio gets a point-to-point link with each of the other three in the cluster, with no expensive cluster switch hardware required. That is more of an "affordable lab" solution than a high-scale datacenter solution.)
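The port math of that switchless pod is just full-mesh combinatorics; a quick sketch (the numbers are generic topology math, not an Apple spec):

```python
# In a full mesh of n nodes, every node links directly to every other
# node: n-1 ports per node, n*(n-1)//2 cables total, no switch needed.
def full_mesh(n: int) -> tuple[int, int]:
    """Return (ports needed per node, total point-to-point links)."""
    return n - 1, n * (n - 1) // 2

ports, links = full_mesh(4)
print(ports, links)  # 3 6
```

A pod of four needs three ports per node, which is why 3-4 Thunderbolt ports per card would suffice; the quadratic link growth is also why this approach doesn't scale to datacenter-sized clusters.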
Not holding my breath for that. If Apple did do a "Mac on a card" system, then it would likely be only one of them. The 75W solution has more product breadth.