So all these “on-device processing” rumors were just rumors, I guess. I mean, I would have actually liked it if the iPhone could think autonomously, out of the box, without requiring any Internet connection, at least for the most basic stuff, or if I could download certain “memory chunks” that I would need all the time.

Imagine asking Siri “how to make a tasty sandwich” during a power outage and it does not say “oh, I am so sorry, there is no Internet.” I mean, it would have been cool. Not really much AI, but it would need lots of data anyway.

And yeah, put some strain on those useless machine learning chips: I would have said, “you know Siri, I have a better sandwich recipe, what do you think ‘bout it?” Why? The machine’s gotta learn something! 😀
 
I'm down with this. Sounds like a good idea. Wouldn't be surprised if they add it to the Apple One subscription, though.
 
The 3090 is useless next to an M2 Ultra with 192 GB of unified memory. A 3090/4090 can work well with toy models. Where did you get 400 GB/s of bandwidth for an Ultra? It is 800 GB/s on the Ultra.
And without knowing Apple’s architecture, these numbers don’t mean much. Is Apple running all the inference in the cloud, or a hybrid approach where the device runs most inferences and uses the cloud as needed? I didn’t compare it to the H200, not sure how you got that impression. lol.
What is unknown is whether Apple will use the Ultra for training or inference. I know Apple used to use Nvidia H100/A100 for training.
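Rough sketch of why those bandwidth numbers matter for running models locally. The bandwidth figures are the ones quoted in this thread; the model size and the "batch-1 decoding is roughly memory-bandwidth bound" rule of thumb are my own illustrative assumptions, not anything Apple or Nvidia has published for these products:

```python
# Back-of-envelope: why memory bandwidth dominates single-user LLM decoding.
# Bandwidth figures come from this thread; the 70B/4-bit model and the
# "tokens/s ~ bandwidth / bytes-read-per-token" rule of thumb are illustrative
# assumptions only.

BANDWIDTH_GBPS = {
    "M1 Max": 400,    # GB/s, unified memory
    "M2 Ultra": 800,  # GB/s, unified memory
    "RTX 3090": 936,  # GB/s, GDDR6X VRAM (24 GB capacity is the real limit there)
}

def rough_tokens_per_second(params_billion: float, bytes_per_param: float,
                            bandwidth_gbps: float) -> float:
    """Every decoded token has to stream (roughly) all the weights once."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gbps * 1e9 / bytes_per_token

for name, bw in BANDWIDTH_GBPS.items():
    # 70B parameters at 4-bit (0.5 bytes/param) fits in 192 GB of unified
    # memory, but not in a 3090's 24 GB of VRAM.
    print(f"{name}: ~{rough_tokens_per_second(70, 0.5, bw):.0f} tokens/s upper bound")
```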
The 3090/4090 are gaming GPUs, designed for gaming, not AI tasks, unlike Hopper. It's like buying a GTX 1660 Ti and then complaining that it can't run ray tracing, because it's simply not made for such a task.
You were talking about your M1 Max, not an M2 Ultra; that's where 400 GB/s came from.
I wasn't talking about Nvidia's "gaming" GPUs in the first place, not sure how you got that impression in an article about AI features.
Hopefully this rumor is just one of those typical random bits of BS that Gurman throws out, but we're almost halfway through 2024 and Siri can barely set a timer without saying "Here’s what I found on the web".
 
You are all over the place lol. Were you comparing the M1 Max with the H100 or the H200? I brought up the Ultra because of the server conversation. A 4090 is a fair comparison for a prosumer workstation with an M1/M2 Max or even an Ultra. Either way, Siri isn’t Apple AI/ML; I couldn’t care less if they cancel Siri tomorrow. AI at Apple is a lot more than Siri; in fact, you don’t really hear much about Siri in Apple’s AI literature/GitHub or tech documentation.
 
Why pay Nvidia an 80% margin if you can use your own hardware?

You are also missing the obvious: the M2 Ultra is a niche product, and bigger chips (the so-called M1/M2 Extreme) were cancelled because of that.
Now Apple has an incentive to build those, since not only can they sell them in the Mac Pro/Studio, they can run their own servers on them with massive cost savings compared to what AMD and Nvidia offer.

Big chips were given the green light, if this rumor is true.
Apple does not have any chips for servers. Neither the M2 Ultra nor the M4 has the I/O and memory scaling needed for servers. Another problem is that nobody develops AI software for Apple chips; it's mostly for Nvidia. I would understand if Apple were developing servers to deploy in the cloud for third-party developers to build AI solutions for Apple hardware, but generic AI on Apple chips? That sounds... strange.
 
I don't see how any chip could "inherently protect user privacy" -- that is just nonsense. Privacy is primarily a function of the software running on the chip. While there may be features of a chip that could help protect privacy in a multi-tenant environment (like a cloud server), those would be at a very low level, such as protecting memory from being read across processes or threads.
Patent double-talk BS from Apple.
 
Yup, I think that’s huge. The marginal cost to Apple of another M2 Ultra is just the fab cost they pay to TSMC. I could imagine it would be ten times more expensive to get the same computational power from Nvidia.

I also presume Apple knows very well how to optimize software for their own hardware. So they will have highly optimized software running on much cheaper hardware, and the scale supports all that effort.
But WHO is making the software?

Is it Apple AI?

Or will they partner with a 3rd party to run their software on Apple hardware?
 
So all these “on-device processing” rumors were just rumors, I guess. I mean, I would have actually liked it if the iPhone could think autonomously, out of the box, without requiring any Internet connection, at least for the most basic stuff, or if I could download certain “memory chunks” that I would need all the time.

I imagine some work can be done on-device in the background, like summarizing messages, emails and the like. Stuff like Siri, where there's a user waiting for a response, will need to go to a server to be processed faster.
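A minimal sketch of the kind of hybrid split being described here. The task categories, function names, and fallback behavior are hypothetical illustrations of the idea, not Apple's actual design:

```python
# Hypothetical sketch of the hybrid approach described above: background tasks
# stay on-device, latency-sensitive requests go to a server when one is
# reachable. Task names, fields, and routing rules are illustrative only.
from dataclasses import dataclass

@dataclass
class Task:
    kind: str          # e.g. "summarize_email", "assistant_query"
    interactive: bool  # is a user actively waiting on the answer?

def route(task: Task, network_available: bool) -> str:
    # Background summarization can run slowly on the local Neural Engine.
    if not task.interactive:
        return "on-device"
    # Interactive requests prefer the bigger server model, but degrade
    # gracefully to a local model when offline (the "power outage" case).
    return "cloud" if network_available else "on-device (smaller model)"

print(route(Task("summarize_email", interactive=False), network_available=True))
print(route(Task("assistant_query", interactive=True), network_available=False))
```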
 
You can use it to assist you, but if you don't know what you're doing and AI outputs garbage, you will think the garbage is good and consume it. In other words, even though AI may be capable of doing things you cannot, unless you know what the expected output is, you may end up doing things completely wrong...which is worse than not doing them at all.

Right. Generative AI is just a tool for making whatever it is you're making. Like a hammer for nails. If you don't know what you're doing in either instance you stand to harm yourself.
 
You are all over the place lol. Were you comparing the M1 Max with the H100 or the H200? I brought up the Ultra because of the server conversation. A 4090 is a fair comparison for a prosumer workstation with an M1/M2 Max or even an Ultra. Either way, Siri isn’t Apple AI/ML; I couldn’t care less if they cancel Siri tomorrow. AI at Apple is a lot more than Siri; in fact, you don’t really hear much about Siri in Apple’s AI literature/GitHub or tech documentation.
You are still talking about a gaming GPU and comparing it to your basement workstation; guess who is all over the place.
Apple is trying to compete in AI against H200s and soon B200s with their outdated chips (at least according to this article).
Siri is just a living corpse, ofc we don't hear about it.
 
I was talking about their H200 and soon B200 series, not gaming GPUs, lol.
H200 VRAM: up to 141 GB HBM3e @ 6.5 Gbps
And the Mac's is unified RAM, not dedicated VRAM, so it's shared with everything else, and it maxes out at 400 GB/s of bandwidth. I'm pretty sure that's even slower than an RTX 3090's VRAM (as expected, ofc).
Apple is currently busy selling their last-generation $1,800 laptop with 8 GB of RAM, which is less RAM than the majority of Android phones have. I don't think those glorious days will come anytime soon.
Sure, those GPUs are much more expensive, but the M2 Ultra is around 31.6 TOPS while the H200 is around 3,900 TOPS, and soon the B200 at around 20,000 TOPS.
Comparing the M2 Ultra to the H200 in AI is more like comparing a GT 710 to an RTX 4090 in gaming.
What if they can attach many M2 Ultras together? Maybe they can now. How much does an H200 cost compared to one Ultra?
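To put the thread's own numbers side by side, here's a quick ratio check. It uses only the TOPS figures quoted above, which mix precisions and marketing conventions, so treat it as rough:

```python
# Ratio check using only the TOPS figures quoted in this thread. These mix
# precisions/vendor marketing numbers, so the ratios are rough at best.
tops = {
    "M2 Ultra": 31.6,
    "H200": 3_900,    # "3900ish" per the post above
    "B200": 20_000,   # as quoted above
}

for chip in ("H200", "B200"):
    ratio = tops[chip] / tops["M2 Ultra"]
    print(f"{chip} ~ {ratio:.0f}x an M2 Ultra by this metric "
          f"(i.e. ~{ratio:.0f} Ultras ganged together to match one)")
```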
 
Maybe just use Nvidia GPUs that are like 50 times faster and are actually built for this type of workload?
Same chip architecture... both Apple and Nvidia use ARM-based chip designs. Apple decided to focus on efficiency and thermal management, so Apple is more than capable of designing AI chips.
 
Probably worth remembering, too, that Apple has promised to be carbon neutral and is rolling out solar power for its operations. An Nvidia 4090 or whatever may have more power in certain situations, but if you are paying for all of that in green energy, then Apple's M chips start looking more attractive. It also helps insulate Apple from the AI gold rush that is coming and will drive other component prices sky-high.
 
Apple needs to do something with Siri pretty soon. It really is just about useless, to the point of being a joke, with the exception of bungling text messages sent while driving.
 
Seems solid to scale.

  • A14 Bionic (iPad 10): 11 Trillion operations per second (OPS)
  • A15 Bionic (iPhone SE/13/14/14 Plus, iPad mini 6): 15.8 Trillion OPS
  • M2, M2 Pro, M2 Max (iPad Air, Vision Pro, MacBook Air, Mac mini, Mac Studio): 15.8 Trillion OPS
  • A16 Bionic (iPhone 15/15 Plus): 17 Trillion OPS
  • M3, M3 Pro, M3 Max (iMac, MacBook Air, MacBook Pro): 18 Trillion OPS
  • M2 Ultra (Mac Studio, Mac Pro): 31.6 Trillion OPS
  • A17 Pro (iPhone 15 Pro/Pro Max): 35 Trillion OPS
  • M4 (iPad Pro 2024): 38 Trillion OPS
The TOPS benchmark metric isn't particularly useful because there's no standardization as to what constitutes an "operation per second."

The M4 does 38 TOPS using INT8, whereas the M3's 18 TOPS are measured at INT16, meaning each of the M3's operations handles twice the data. Cut down to raw firepower, the M4 only has about a 5% AI/ML advantage over the M3 when the TOPS are equalized.
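Just to make that equalization explicit, here's the arithmetic, assuming a straight halving of the INT8 figure is the right way to compare (itself a simplification):

```python
# Equalizing the quoted figures: halve the M4's INT8 TOPS to get an
# INT16-comparable number (a simplification; real workloads will differ).
m4_int8_tops = 38
m3_int16_tops = 18

m4_int16_equivalent = m4_int8_tops / 2              # 19 "INT16" TOPS
advantage = m4_int16_equivalent / m3_int16_tops - 1

print(f"M4 equalized: {m4_int16_equivalent} TOPS")  # 19.0 TOPS
print(f"Advantage over M3: {advantage:.1%}")        # ~5.6%
```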
 
Maybe just use Nvidia GPUs that are like 50 times faster and are actually built for this type of workload?
I feel like Apple doing their own thing is better. Everyone is using Nvidia. Some competition is needed, as they have a bit of a stranglehold on the market. Hence their share price.
 
I feel like Apple doing their own thing is better. Everyone is using Nvidia. Some competition is needed, as they have a bit of a stranglehold on the market. Hence their share price.
Google is also using their own hardware for Gemini's LLM as far as I know, but they used TPUs, custom-designed for machine learning, not Chromebook hardware.
 
It's the next 3D TV, AR/VR, or foldable phone: a dumb trend that no one will care about in a few years.
At this point this must be just trolling. AI from large models will be a component of every job and every creative profession in a few years. It’s like calling statistics a toy and a fad.
 
It seems crazy to use M2 chips to power AI when the M3 and M4, by Apple's own admission, come with a hugely upgraded Neural Engine. What on earth are they thinking?! Maybe there's just a huge surplus of unsold M2 Ultra chips that they need to do something with...
 