Why build your own server in Houston if you don't use it?
It is likely a matter of where the 'overflow' goes. The downside of using Google (or Anthropic's Claude, or OpenAI) is that they are in an 'arms race' to build the largest, most multimodal models possible. Everything and the kitchen sink thrown in. If Apple is going to occasionally dispatch a subset of user requests to something like that, then Apple's servers probably won't have the 'chops' to handle it.
Apple's servers are not all that heavyweight in compute.
Apple begins shipping AI servers from Houston factory
Apple has started shipping artificial intelligence servers built in a factory in Houston, it said on Thursday, part of the company's plans to invest $600 billion in the U.S. in the next few years.
Also:
https://www.cnbc.com/2025/10/23/apple-american-made-ai-servers-texas.html
Not much of a huge networking connection there (it doesn't look like they're doing many-to-many direct connections in the rack; it looks more like data streams in from an Apple device and goes right back out). Also not much air throughput (and if that second picture is the 'front', there are no obvious liquid cooling connections either).
There is a pretty good chance Apple built something with better performance/watt than the competition. But 'chatbot' AI has turned into more of a power-consumption pissing contest, where the objective is to burn as much power as possible. I don't think Apple has that, or even wants it at all. A power-consumption pissing contest is very likely going to be outsourced.
It is probably also a huge mistake to offer a super long list of very fancy Apple Intelligence features entirely for free. I don't think Apple can wait forever before introducing a paid tier. If only the paid stuff is dispatched to Google's servers, then PCC could be enough for all of the folks running 'free' queries.
P.S. From the looks of it, there is a pretty good chance that Apple's PCC servers are just repackaged and refactored Mac Studios: stripped of the ports and video-output stuff that an end user would use, with a flattened-out heat sink that covers a bigger 2D footprint than Apple would tolerate on a desktop. The 'slab' is big enough for two units (i.e., 2+ Ultras and some auxiliary stuff). Revise the Mac Studio motherboard to make it easier to fit in the slab, and strip out the parts they don't need (e.g., Wi-Fi, USB ports, etc.).
If liquid cooling, then probably a max of 4 Ultras in Apple's 'cluster' fashion. But that only scales cluster compute to 4 nodes. Google TPU pods are not capped that narrowly.