Every cloud vendor/AI training company of any note is starting to develop their own chips to reduce reliance on Nvidia's massive margins, but every AI training company of any note also just has to buy Nvidia to keep up, given the massive libraries already built out in CUDA to build on. The more specialized chips offload specific tasks at lower power.

Apple probably is buying H100s/B100s, but doesn't want to say they are, given the years-old spat with Nvidia. Curiously, Jen-Hsun has started mentioning Apple a few times recently after all those years, so it sounds like things have improved.
 
The irony here is that the Nuvia chip team, later bought by Qualcomm, reportedly left Apple because Apple wouldn't let them work on a server chip, so they started their own company. Qualcomm, of course, is probably already a few beats ahead with server chips.

Qualcomm? Not really. When Broadcom came sniffing around looking to acquire them, Qualcomm largely chucked their server efforts in order to make their balance sheet look better and, as a side effect, make an acquisition much more difficult to do. That scared off Broadcom (which, looking at what is going on at VMware, was likely a good thing in general).

" ... Starting in June, Microsoft began hiring a number of former Qualcomm engineers and managers who are now working on Redmond's quantum-computing team. A number of these engineers were formerly working on Qualcomm's ARM-based server efforts, as first reported on October 8 by The Information. ..."

also



While Qualcomm has had an architecture license from Arm, they were largely just letting it collect dust in a corner while they fended off Broadcom and dealt with strategic shifts in the radio industry.

Qualcomm has been getting some limited traction in server-room inference.





From appearances, Qualcomm is focused on dragging Nuvia cores into their phone SoCs more so than server stuff. Qualcomm made some 'plan B' gestures toward using Nuvia cores for servers during the early stages of the Arm lawsuit escalation. Decent chance that was a deliberate tactic to stall Arm's formal filing against them, and Qualcomm worked on blocking the Nvidia deal for Arm (which was bad from multiple dimensions, not just Qualcomm's view).

If Ampere Computing had imploded and Arm had bungled Neoverse, then perhaps Qualcomm would have jumped in. But Ampere has gotten substantive traction. Whether they can survive fully custom, off the Neoverse platform, is unclear. Neoverse is doing well (lots of different buyers). It is going to be somewhat cheaper for the hyperscalers to all share common R&D expenses than to run off into totally proprietary niches for 'bulk' computing needs.

UCIe is going to make it even tougher. In the future, Arm is likely going to have the essential central chiplet available off the shelf, and folks can just attach whatever custom accelerator they want.


and with the Nuvia-modified server chips landing on Windows laptops later this year, Apple is going to have strong competition.

Nuvia never had server chips. They were shipping nothing when they got bought. What is here in Snapdragon X is at least as much Qualcomm work (GPU, NPU/AI/DSP, memory, etc.) as it is Nuvia. The cores here do have a lot of server-core elements baked into them; they didn't throw everything out and start over right away. That is likely part of why the nominal core count here is 12, not 8 or 4, and why it is not particularly focused on single-threaded drag racing.

They are not going to try to match Apple core count for core count at the exact same power budget. Their die is between the plain Mn and Mn Pro. And they are just trying to clear Intel's iGPUs.



But as Johny Srouji likes to tell everyone in every interview, "Apple is not a chip company". Which is an odd thing to keep saying when you work for a computer company.

They don't sell chips. Srouji has one, and only one, client: Apple product design teams. That's it.
It is unclear how much Apple is into the Open Compute Foundation and open server designs for their data centers (versus just buying lots of 'off the shelf' stuff from general-market server board vendors like Supermicro, Inspur, etc., and using OCF common chassis designs that they can get from others with very little Apple design effort). If there is no substantive "Apple datacenter design team", then Srouji doesn't really have a 'client' to talk to.

There are enough other contributors to OCF (Facebook, Google, etc.) that the variations spun out on the open market are enough for Apple to use without doing anything particularly unique for their datacenters. Apple is not a hyperscaler shop, so there are not going to be tons of servers to spread custom server-component R&D over.

Amazon is not a "computer company" and makes their own chips, DPUs, and server components. They don't sell them. They rent them.
 
They're technically not a chip company. AMD, Intel, Qualcomm are chip companies (i.e.: they sell processors to other hardware OEM's). Apple's revenue doesn't come from selling processors but from finished products.

Errr.


[Chart: aapl-1q24-pie.jpg — Apple 1Q 2024 revenue by segment]


https://www.macrumors.com/2024/02/01/apple-1q-2024-earnings/


Services is bigger than Mac and iPad combined.

If the EU, US DoJ, and some other government agencies kill off a decent chunk of the App Store profits, then that will shrink, but Apple doesn't just sell hardware at this point.

TV+ is not solely hooked to Apple products. Neither is "Apple Radio/iTunes". A growing part of their business is 'renting access'. That is much more sustainable than trying to get a lunatic fringe to toss their iPhone every year and buy a new one.
 
Every cloud vendor/AI training company of any note is starting to develop their own chips to reduce reliance on Nvidia's massive margins, but every AI training company of any note also just has to buy Nvidia to keep up, given the massive libraries already built out in CUDA to build on. The more specialized chips offload specific tasks at lower power.

They are not only looking to offload specific fragments.

" ...
US energy provider Exelon has calculated that power demand from datacenters in the Chicago area is set to increase ninefold, in more evidence that AI adoption will put further strain on electricity supplies.
..."


The path of blindly following Nvidia has more problems than just margins. It isn't scaling. There is crazy talk of building new nuclear plants just to run AI. Welcome to the "Matrix" (AI takes over domination of the power grid).


Apple probably is buying H100s/B100s, but doesn't want to say they are, given the years-old spat with Nvidia. Curiously, Jen-Hsun has started mentioning Apple a few times recently after all those years, so it sounds like things have improved.

Apple isn't running their SAP supply chain or the corporate financials on some lowly Mac Pro (they used larger Sun servers). Years past, Apple had a Cray to design stuff. "Eat your own dog food" turned into completely rigid dogma is further than Apple has ever gone.

I doubt, though, that Apple is building as deep a CUDA moat as possible around what they are doing. (Google doesn't build as large a CUDA moat as possible around their own stuff.) They'll likely be able to dump Nvidia whenever another 3rd party vendor comes along.

Part of the problem at this point is the "49'er Gold Rush" fever around things. Folks are trying to max out the win before the bubble bursts. Or think of the 'toilet paper' shortages early in the pandemic: everyone is buying it because everyone is buying it.

Nvidia doesn't just sell H100 cards. They sell complete systems (CPU, networking, GPU all in a box). Apple could just buy some of that and be done. It is just a 'black box'. I'm sure there are some Macs in some internal parts of Nvidia also.
 
Being a good datacenter AI server processor has no "Mac" property at its kernel. So labeling it a superset is not the right connotation. The 'core' of the essential set is not 'Mac-ish'.
Sorry, no idea what you are saying here. My point on the "superset" was whatever technology AI chips use to parallelize work across co-processors would not even be necessary for a desktop machine (although it would be nice!).

If you're talking about a literal OS kernel, both an AI server running Linux and macOS have Unix kernels at their heart. That's what makes them so cross-fertile.
 
I cannot wait to see the WWDC presentation where Johny Srouji tells us all about the new Apple ASi AI ecosystem, all the while Chuck Norris-walking thru the full basement of the Apple Mothership which is nothing but rack after rack after rack of said Apple ASi AI servers...!
  • Apple ASi Mn Extreme Mac Pro Cube
  • Apple iCloudAI subscription service
  • Apple ASi AI server farm
This rumor says the AI server chip isn't coming out until 2025. WWDC 2024 is likely going to be focused on on-device AI. At WWDC 2025, it is likely still going to be the same primary focus, just incrementally better. It is highly unlikely Apple is going to come back in 2025 and hype up how developers are supposed to forget all that on-device stuff... all the nifty stuff is in our data center now. Apple doesn't generally flip-flop like that over relatively brief periods of time.

Apple is really not big on pictures of the insides of their data centers. There isn't a big datacenter in Cupertino.

I never said which year for WWDC...

And have you personally toured the entirety of the Apple Mothership; there is a whole lot of square footage there, who knows what lies within...?
 
Apple actually acknowledged that they were working on an Apple car, while this AI server chip is still just an unsubstantiated rumor. So the Apple car was more believable (that sounds so weird).
True, but Apple's hand was forced when their job postings started looking like:
  • 5+ yrs experience with node.js
  • 3+ yrs experience with python
  • 10+ years of experience with NHTSA crash tests
For AI, it's a lot easier since everyone can claim to be an expert.
 
This leak is probably designed to shore up Apple's share price.

The list of things Apple was building but recently gave up on is much longer than the often mentioned ones. The modem and MicroLED projects for example. With all the negative news around Vision Pro, Apple really needs to steady the share price when they announce the latest quarter on May 2nd. Saying they are going to compete in AI server business is perfect for that.
 
As an XServe administrator I have to say that a server with the current Silicon architecture has three main problems:
1. No modularity
2. An unstable operating system with a low level of integration with any server technology.
3. No foresight of continuity in the product line.

The manufacturer must finally give very good reasons to make me reconsider this opinion about their server product line.

Therefore, I cannot advise the purchase of Apple server equipment.
 
At least Blu-ray!

Whoa there...! Steve told us Blu-ray was a "bag of hurt"...

As an XServe administrator I have to say that a server with the current Silicon architecture has three main problems:
1. No modularity
2. An unstable operating system with a low level of integration with any server technology.
3. No foresight of continuity in the product line.

The manufacturer must finally give very good reasons to make me reconsider this opinion about their server product line.

Therefore, I cannot advise the purchase of Apple server equipment.

The rumor seems to be that Apple will produce their own servers intended for their own AI server farm, not as end-user/consumer/business purchases by the general public...

As far as modularity, I would think a blade server form-factor would be as modular as it gets; oh, and most likely redundant PSUs...?
 
And have you personally toured the entirety of the Apple Mothership; there is a whole lot of square footage there, who knows what lies within...?

Square footage without the necessary electrical substation and HVAC to provide power and cooling doesn't matter. You don't have to look at the inside to know it isn't there.

There is a substation there, but it is sized for about the building cluster that is visibly present.

Besides, before Apple went into the substantive upswing on building data centers, their primary one was in Fremont, not Cupertino. (Moving between old and new buildings in Cupertino wouldn't perturb that at all.)



or

 
This leak is probably designed to shore up Apple's share price.


I was listening to Bloomberg radio a day or so after the "M4 will focus on AI ability" story. Gurman was interviewed for a segment and expressed substantive surprise that the stock had bounced up as high as it did after the story came out. I kind of wondered if he was that much of a "Goober" or was just providing air cover for his Apple sources.

With this next round of AI server stuff, it seems more likely he is just playing along.


The list of things Apple was building but recently gave up on is much longer than the often mentioned ones. The modem and MicroLED projects for example.

The modem didn't seem to come with job cuts (not seeing anything about big drops in Munich, EU, or San Diego headcounts). Nor have I seen much on Apple dropping their building/facilities expansion.

There is a difference between 'gave up' and 'much harder than planned'. If Apple threw 'almost everything' out and started over from almost scratch, it probably would take longer than planned. Lots of expectations were that Apple would throw Qualcomm out relatively quickly. That was always dubious.

Intel was stumbling and bungling when Apple shipped the M1; Intel helped them get a pass on performance for the transition. Qualcomm isn't. They haven't missed a beat the last couple of years. Apple is likely going to need to ship something that is 'behind' before they get a shot at getting close to parity. The SE4 drift isn't helping (maybe a chicken-or-the-egg thing going on there a bit).



With all the negative news around Vision Pro, Apple really needs to steady the share price when they announce the latest quarter on May 2nd. Saying they are going to compete in AI server business is perfect for that.

Vision Pro isn't all that negative. If you go back to some of Kuo's earlier forecasts, it was in the mid hundreds-of-thousands zone. Someone at Apple has been in the 700K-1M hype zone to suppliers, but I suspect nobody really believed those.
Its position in the AR headset market is very much on the high side, which was not going to lead to "millions of units" shipped.

There are always some folks with ADHD chasing manic waves of Apple stock for lightweight analysis reasons. VPro was never going to be a huge short-term stock burst any more than the Watch was. Some folks threw shade at the Watch for years as it grew into a solid player/contributor to the revenue flow over time. VPro isn't going to be any different.

The DoJ coming in and taking away the $4B/yr 'free money' that Google gives Apple would be far more substantive than anything VPro will do, good or bad, for the next couple of years.

Nothing here is really saying they are going to 'compete' in the AI server business at all. Apple doesn't have a general 'host computation in the cloud' services business at all (they are not competing with Amazon Web Services in any substantive way). Xcode Cloud isn't substantially that. And the Siri 'backend' isn't that either.
 
Square footage without the necessary electrical substation and HVAC to provide power and cooling doesn't matter. You don't have to look at the inside to know it isn't there.

There is a substation there, but it is sized for about the building cluster that is visibly present.

Besides, before Apple went into the substantive upswing on building data centers, their primary one was in Fremont, not Cupertino. (Moving between old and new buildings in Cupertino wouldn't perturb that at all.)



or


Come on, you have to admit having a massive Apple ASi-powered data center dedicated to all things Apple/iCloud in the lowest levels of the Mothership would be pretty frakking cool...!
 
Every cloud vendor/AI training company of any note is starting to develop their own chips to reduce reliance on Nvidia's massive margins, but every AI training company of any note also just has to buy Nvidia to keep up, given the massive libraries already built out in CUDA to build on. The more specialized chips offload specific tasks at lower power.

Recent news is that Apple just released some on-device models hosted on Hugging Face.

" ... the LLMs are available on the Hugging Face Hub, a community for sharing AI code. ..."

The CUDA moat is substantive, but if you put some effort into getting a boat in the water and paddling across... it is just a moat, not 'magic'.


AMD and Hugging Face are doing lots of work to port their stuff to non-Nvidia, MIx00 hardware.

"...
Can you spot AMD-specific code changes below? Don't hurt your eyes, there's none compared to running on NVIDIA GPUs ..."

Not true of everything in their repository, but 'CUDA only' for everything isn't true either.
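
To make that concrete, here is a minimal, hedged sketch of the kind of device-agnostic Hugging Face code being described. The model name is just a placeholder; the point is that the same script runs on Nvidia or AMD accelerators because PyTorch's ROCm builds reuse the "cuda" device string, so there is no vendor-specific branch in user code.

```python
# Minimal sketch: the same Transformers code path runs on Nvidia (CUDA) or
# AMD (ROCm) builds of PyTorch -- ROCm reuses the "cuda" device string,
# so there is no vendor-specific branch here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder model; any Hub checkpoint works the same way

device = "cuda" if torch.cuda.is_available() else "cpu"  # CUDA *or* ROCm
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)

inputs = tokenizer("The CUDA moat is", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```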


Apple probably is buying H100s/B100s, but doesn't want to say they are, given the years-old spat with Nvidia.

Apple has probably burned a bit of a bridge with AMD... but if they asked for a stack of MI200s and some early MI300s, they likely would be able to get them if they paid cash far in front of AMD's production runs. The MI300 missing a PCIe card option likely means Apple wasn't productizing it for the Mac Pro, but Apple was eyeball-deep in Vega for a while, and CDNA is a bit of a continuation from that (along with the working relationship Apple had with RDNA 1-2).

If Apple has been trying to put the most efficient x86 server hardware into their datacenters over the last 2-3 years, there is a decent chance that has been AMD stuff, not Intel stuff. I'd bet that Apple has bought some Ampere Computing stuff also. They didn't have to wait for Nvidia to do either one of those. (Actually, Nvidia's 'off the shelf' supercomputer node in the last generation was AMD Epyc based.) They could buy a Cray (HPE) Slingshot box that has AMD CPUs and GPUs in it pretty much "off the shelf" at the 'Cray Store'. :)



Apple didn't have to blindly hand big checks over to Nvidia. If they solely spent all their 'AI' money on that (and their own narrow, closed-source AI silo), it would be kind of dumb. Doing some investing in more portability and vendor-independent development would help their niche also.
 
Come on, you have to admit having a massive Apple ASi-powered data center dedicated to all things Apple/iCloud in the lowest levels of the Mothership would be pretty frakking cool...!

If the Mothership was in Minneapolis, perhaps. [The original Cray company had problems selling one of their 'old' buildings there when they moved to a new one, because the building had no furnace. The 'demo' Crays they had in the basement were the 'furnace' in the winter time.]

Once Apple let Jony Ive run off and go 'buck wild' on the HQ spaceship, it wasn't going to be about doing something infrastructure-practical like a datacenter. [I suspect some of the core engineering buildings off the side of the campus were hopefully spared. I wouldn't let Ive within a mile of something that had mission-critical technical requirements.
('Car' with no steering wheel... *cough*, can see how that turned out.)]
 
Nvidia leading in server chips? Not really. AI/ML training cards that plug into a server run by a server chip: yes, Nvidia is dominant (but the field has multiple players). But for the main processor in the server, Nvidia is both relatively late to the game and clearly not the only player. Ampere, Amazon, Microsoft, Google, etc. are all players. Arm has a very viable server core in the Neoverse family. Nvidia is using it. Amazon, Microsoft, and other hyperscalers are using it. Ampere Computing was/is using it (transitioning to a custom core for future generations; that may or may not work out for them if all of their major customers just keep buying Arm's version).

In AI/ML inference, Nvidia absolutely does not have an exclusive hold on the market. Inference and training do not have to be done on the same hardware.

MobileEye does AI/ML inference to help automatic car safety features. ( millions of cars with no Nvidia ).

If minimal latency is required, several inference workloads run solely on the CPU if possible (copying the data out to the Nvidia card and back takes time). That is one reason why Intel has thrown highly skewed AVX-512 and 'DL Boost' at the Xeon SP processors, to try to backstop some of their competitive losses in server space.

Similarly.
https://www.tomshardware.com/tech-i...-supply-dollar752-million-in-ai-chips-instead

Similarly,
" ... probably setting the stage for what we are calling the AmpereOne-3 chip, which is our name for it and which is etched in 3 nanometer (3N to be precise) processes from TSMC. We think this will be using a modified A2+ core. Wittich confirmed to us that a future AmpereOne chip was in fact using the 3N process and was at TSMC right now being etched as it moves towards its launch. And then he told us that this future chip would have 256 cores. He did not get into chiplet architectures, but did say that Ampere Computing was using the UCI-Express in-socket variant of PCI-Express as its chiplet interconnect for future designs. ...
...
There are a lot of possibilities, but we know one thing: Ampere Computing is hell bent on capturing as much AI inference on its CPUs as is technically feasible.
... "

With UCIe, if Apple wanted to put some of their NPUs on a chiplet (also with a UCIe interface) and package it together for some custom inference, Apple wouldn't have to build a whole server chip; just some narrow custom mods to software to invoke the accelerator and offload perhaps more custom inference workloads.
(Arm is using UCIe with Neoverse also.)

By second half 2025, Ampere Computing could be on their second generation N3 Arm server chip aimed at inference.
( I'm a bit skeptical they will keep that yearly cadence. )

Finally, on the inference front, Google is rolling out Gemini Nano. Apple is doing tons of AI inference in the Vision Pro. The whole Apple lineup is reported to be doing more local inference in the next versions of iOS/iPadOS/macOS. That is hundreds of millions of devices where there is zero Nvidia in sight. Nvidia having some kind of unilateral monopoly hold on AI/ML inference is a complete farce. The AI/ML inference market is far, far, far bigger than the 'largest memory footprint possible' LLM models.
Can Apple Silicon be used in parallel? For example, it sounds like the most robust single offering from Nvidia is more powerful than a single AS chip (for obvious reasons). But what if multiple AS chips were used in parallel — wouldn’t this meet or exceed the Nvidia offering, while possibly using less energy? I have practically zero knowledge on the topic, but it seems like Apple is in the best position to produce something more powerful than what Nvidia has for their own needs.
 
Can Apple Silicon be used in parallel? For example, it sounds like the most robust single offering from Nvidia is more powerful than a single AS chip (for obvious reasons). But what if multiple AS chips were used in parallel — wouldn’t this meet or exceed the Nvidia offering, while possibly using less energy?

Likely no in many situations. The connection between multiple Apple SoCs will consume more power to transfer data than the large Nvidia systems, which can do more locally without long external links. Apple perhaps could do it 'cheaper' (more cost-effectively), but often not quicker.

For example, in a demo at WWDC 2022

"... Using this communication, the worker processes synchronize gradients before each iteration. I'll show this in action using four Mac Studios connected to each other with Thunderbolt cables. For this example, I will train ResNet, a classifier for images. The bar to the side of each Mac Studio shows the GPU utilization while training this network. For a single Mac Studio, the performance is about 200 images per second. When I add another Mac Studio connected via Thunderbolt, the performance almost doubles to 400 images per second since both GPUs are utilized to the fullest. Finally, when I connect two more Mac Studios, the performance is elevated to 800 images per second. This is almost linear scaling on your compute bound training workloads. ..."

Networking those four Mac Studios together with Thunderbolt 4 cables is vastly cheaper than buying NVLink and/or InfiniBand connection equipment. But the larger datacenter Nvidia accelerators can be coupled to their peers also. So there is nothing "exclusive" Apple has in going "parallel", because Nvidia is just as capable of adding more nodes to go parallel. Just not at the same cost.
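
For a sense of the pattern that demo describes, here is a rough, hedged sketch of the same data-parallel idea. The WWDC demo used TensorFlow with the Metal plugin; this sketch instead uses PyTorch's generic "gloo" backend over plain TCP (e.g. a Thunderbolt bridge between machines), with a placeholder hostname and random data standing in for the real dataset.

```python
# Rough sketch of the data-parallel pattern from the WWDC demo (which used
# TensorFlow + the Metal plugin); illustrated here with PyTorch's "gloo"
# backend over plain TCP, e.g. a Thunderbolt bridge between machines.
# The hostname, ranks, and model are placeholders.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torchvision.models import resnet50

def main():
    # Each machine runs this with its own RANK; WORLD_SIZE=4 for four nodes.
    dist.init_process_group(
        backend="gloo",
        init_method="tcp://studio-0.local:29500",  # placeholder address
        rank=int(os.environ["RANK"]),
        world_size=int(os.environ["WORLD_SIZE"]),
    )

    model = DDP(resnet50())  # DDP all-reduces gradients on every backward pass
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.CrossEntropyLoss()

    for _ in range(10):  # toy loop with random data standing in for real images
        x = torch.randn(32, 3, 224, 224)
        y = torch.randint(0, 1000, (32,))
        opt.zero_grad()
        loss_fn(model(x), y).backward()  # gradients synchronized across nodes here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```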

I haven't looked lately, but macOS doesn't have InfiniBand drivers. Nor stuff like >100Gb Ethernet drivers either. NVLink to do rack-local connections? Nope.

In 2019, Nvidia bought Mellanox


About the time Apple was ramping up on switching out Intel for M-series, Nvidia was jumping eyeball-deep into hard-core, very-high-speed datacenter networking; exactly the stuff you use to hook together "supercomputer" nodes to "go parallel". Apple's moves are more about getting Thunderbolt to do more affordable 10Gb Ethernet without a modest-cost 10GbE switch.

Apple's solutions are very single-person or small-workgroup oriented. The pressing issue is who these Apple "AI" servers are doing work for and how many folks that is. If you use Nvidia's stuff to do work for 2-3 users per day, then the energy cost per user doesn't work out so well. If it is 20,000 users, you get a different energy/person ratio.

A different angle would be hosting AI/ML training for specialized, domain-specific models (modify a photo, generate some music) for inclusion into Mac/iPhone apps using Apple's Core ML toolchains. There would probably be a relatively small cluster of developers working on one app, perhaps using Xcode Cloud to do build/integrate/test iterations. For a model that fits in 100GB or so (inside an M2 Ultra), Apple probably does it more energy-efficiently. Splitting a model over the equivalent of four Mac Studios could drive that up into the 400GB range using the techniques Apple outlined two years ago. It might be slower, but if this is all done 'switchless' with point-to-point networking over plain Thunderbolt cables, the cloud $/compute-hour-used is likely substantially lower too.
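
As an illustrative sketch of that last-mile step, here is a small, hedged example of packaging a trained model for a Mac/iPhone app with coremltools. The MobileNet stand-in and the output file name are placeholders, not anything Apple has described.

```python
# Hedged sketch: converting a small domain-specific PyTorch model to Core ML
# for bundling into a Mac/iPhone app. The model is a stand-in; the real one
# would be whatever was trained on the (hypothetical) Mac Studio cluster.
import torch
import torchvision
import coremltools as ct

model = torchvision.models.mobilenet_v3_small(weights=None).eval()
example = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(model, example)  # Core ML conversion takes TorchScript

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=example.shape)],
    convert_to="mlprogram",  # ML Program format for newer OS targets
)
mlmodel.save("PhotoEditor.mlpackage")  # hypothetical app asset name
```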


'Scale up' inside a rack (filling a whole rack of compute engines): Apple doesn't really have much exclusive leverage there. 'Scale out' over a cluster spanning multiple racks... even less.


Where Apple may have some traction is in some almost completely decoupled parallel contexts, where there is almost no interaction at all between the parallel tasks that need to be accomplished. For example, inferencing on a 100GB (compressed) model on behalf of millions of 8GB (or less) constrained iPhones. There would need to be some system to evenly distribute the individual iPhone workloads over thousands of machines (maybe 10-30 per machine). Each machine has its own complete local copy of the model (so very little node-to-node communication inside the cluster, and relatively slow, in datacenter-backhaul terms, cluster-to-iPhone data rates).

If the model size fits inside the M2 Ultra system but has to spill out of the Nvidia systems, then Apple has some advantages here. But that is mainly because the work on a single request isn't being split; it happens to stay all local. (Still individual or small workgroups. The parallel trick is to spread the individual/small groups out over the cluster and have almost all the data they need pre-positioned there ahead of time.)
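
A minimal, hedged sketch of that decoupled pattern, with hypothetical node names and a stubbed-out model call, just to show that no node-to-node communication is needed to answer any single request:

```python
# Minimal sketch of the "decoupled parallel" pattern described above:
# every server node holds its own full copy of the model, and incoming
# per-device requests are simply spread across nodes -- no node-to-node
# communication is needed to answer any single request.
# Names (NODES, handle_on_node) are placeholders, not a real deployment.
import itertools
from concurrent.futures import ThreadPoolExecutor

NODES = [f"node-{i}" for i in range(1000)]   # hypothetical cluster of hosts
_round_robin = itertools.cycle(NODES)

def route(request: dict) -> str:
    """Pick a node for this request; any node can serve it alone."""
    return next(_round_robin)

def handle_on_node(node: str, request: dict) -> str:
    # On a real node this would call into a locally loaded ~100GB model;
    # it is stubbed out here to keep the sketch self-contained.
    return f"{node} answered request from device {request['device_id']}"

def serve(requests: list[dict]) -> list[str]:
    # Each request is independent, so they can be fanned out freely.
    with ThreadPoolExecutor(max_workers=32) as pool:
        return list(pool.map(lambda r: handle_on_node(route(r), r), requests))

if __name__ == "__main__":
    demo = [{"device_id": i, "prompt": "hello"} for i in range(5)]
    for line in serve(demo):
        print(line)
```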

That isn't going to be generally competitive with the range of Nvidia's solution abilities. But for that corner case ... it could work.



I have practically zero knowledge on the topic, but it seems like Apple is in the best position to produce something more powerful than what Nvidia has for their own needs.

They are not in the best position at all in the high-end datacenter context. Connectivity-wise, they are way, way, way behind. Cluster file systems... years behind. Apple used to have stuff like Xgrid and clustering software, but that was all left behind years ago. Server software... again, de-emphasized and atrophied over many years.

The best they have is some quirky corner cases that will happen to work because the job largely looks like an individual-workstation kind of problem that doesn't scale up or out past a handful (or fewer) of nodes.
 