So all these “on-device processing” rumors were just rumors, I guess. I mean, I would have actually liked it if the iPhone could think autonomously, out of the box, without requiring any Internet connection, at least for the most basic stuff, or if I could download certain “memory chunks” that I would need all the time.

Imagine asking Siri “how to make a tasty sandwich” during a power outage and it does not say “oh, I am so sorry, there is no Internet.” I mean, it would have been cool. Not really much AI, but it would need lots of data anyway.

And yeah, put some strain on those useless machine learning chips: I would have said, “you know Siri, I have a better sandwich recipe, what do you think ‘bout it?” Why? The machine’s gotta learn something! 😀
 
I'm down with this. Sounds like a good idea. Wouldn't be surprised if they add it to the Apple One subscription, though.
 
The 3090 is useless next to an M2 Ultra with 192 GB of unified memory. A 3090/4090 can work well with toy models. Where did you get 400 GB/s of bandwidth for an Ultra? It is 800 GB/s on the Ultra.
And without knowing Apple’s architecture, these numbers don’t mean much. Is Apple running all the inference in the cloud, or a hybrid approach where the device runs most inferences and uses the cloud as needed? I didn’t compare it to the H200, not sure how you got that impression. lol.
What is unknown is whether Apple will use the Ultra for training or inference. I know Apple used to use Nvidia H100/A100 for training.
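Rough sketch of why those bandwidth numbers matter for running models locally. The bandwidth figures are the ones quoted in this thread; the model size and the "batch-1 decoding is roughly memory-bandwidth bound" rule of thumb are my own illustrative assumptions, not anything Apple or Nvidia has published for these products:

```python
# Back-of-envelope: why memory bandwidth dominates single-user LLM decoding.
# Bandwidth figures come from this thread; the 70B/4-bit model and the
# "tokens/s ~ bandwidth / bytes-read-per-token" rule of thumb are illustrative
# assumptions only.

BANDWIDTH_GBPS = {
    "M1 Max": 400,    # GB/s, unified memory
    "M2 Ultra": 800,  # GB/s, unified memory
    "RTX 3090": 936,  # GB/s, GDDR6X VRAM (24 GB capacity is the real limit there)
}

def rough_tokens_per_second(params_billion: float, bytes_per_param: float,
                            bandwidth_gbps: float) -> float:
    """Every decoded token has to stream (roughly) all the weights once."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gbps * 1e9 / bytes_per_token

for name, bw in BANDWIDTH_GBPS.items():
    # 70B parameters at 4-bit (0.5 bytes/param) fits in 192 GB of unified
    # memory, but not in a 3090's 24 GB of VRAM.
    print(f"{name}: ~{rough_tokens_per_second(70, 0.5, bw):.0f} tokens/s upper bound")
```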
The 3090/4090 are gaming GPUs, designed for gaming, not AI tasks, unlike Hopper. It's like buying a GTX 1660 Ti and then complaining that it can't run ray tracing, because it's simply not made for such a task.
You were talking about your M1 Max, not an M2 Ultra; that's where 400 GB/s came from.
I wasn't talking about Nvidia's "gaming" GPUs in the first place, not sure how you got that impression in an article about AI features.
Hopefully this rumor is just one of those typical random bits of BS that Gurman throws out, but we're almost halfway through 2024 and Siri can barely set a timer without saying "Here’s what I found on the web".
 
You are all over the place lol. Were you comparing the M1 Max with the H100 or the H200? I brought up the Ultra because of the server conversation. A 4090 is a fair comparison for a prosumer workstation with an M1/M2 Max or even an Ultra. Either way, Siri isn’t Apple AI/ML; I couldn’t care less if they cancel Siri tomorrow. AI at Apple is a lot more than Siri; in fact, you don’t really hear much about Siri in Apple’s AI literature/GitHub or tech documentation.
 
Why pay Nvidia an 80% margin if you can use your own hardware?

You are also missing the obvious: the M2 Ultra is a niche product, and bigger chips (the so-called M1/M2 Extreme) were cancelled because of that.
Now Apple has an incentive to build those, since not only can they sell them in the Mac Pro/Studio, they can run their own servers on them with massive cost savings compared to what AMD and Nvidia offer.

Big chips were given the green light, if this rumor is true.
Apple does not have any chips for servers. Neither the M2 Ultra nor the M4 has the I/O and memory scaling needed for servers. Another problem is that nobody develops AI software for Apple chips; it's mostly for Nvidia. I would understand if Apple were developing servers to deploy in the cloud for third-party developers to build AI solutions for Apple hardware, but generic AI on Apple chips? That sounds... strange.
 
I don't see how any chip could "inherently protect user privacy" -- that is just nonsense. Privacy is primarily a function of the software running on the chip. While there may be features of a chip that could help protect privacy in a multi-tenant environment (like a cloud server), those would be at a very low level, such as protecting memory from being read across processes or threads.
Patent double-talk BS from Apple.
 
Yup, I think that’s huge. The marginal cost to Apple of another M2 Ultra is just the fab cost they pay to TSMC. I could imagine it would be ten times more expensive to get the same computational power from Nvidia.

I also presume Apple knows very well how to optimize software for their own hardware. So they will have highly optimized software running on much cheaper hardware, and the scale supports all that effort.
But WHO is making the software?

Is it Apple AI?

Or will they partner with a 3rd party to run their software on Apple hardware?
 
So all these “on-device processing” rumors were just rumors, I guess. I mean, I would have actually liked it if the iPhone could think autonomously, out of the box, without requiring any Internet connection, at least for the most basic stuff, or if I could download certain “memory chunks” that I would need all the time.

I imagine some work can be done on-device in the background, like summarizing messages, emails and the like. Stuff like Siri, where there's a user waiting for a response, will need to go to a server to be processed faster.
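A minimal sketch of the kind of hybrid split being described here. The task categories, function names, and fallback behavior are hypothetical illustrations of the idea, not Apple's actual design:

```python
# Hypothetical sketch of the hybrid approach described above: background tasks
# stay on-device, latency-sensitive requests go to a server when one is
# reachable. Task names, fields, and routing rules are illustrative only.
from dataclasses import dataclass

@dataclass
class Task:
    kind: str          # e.g. "summarize_email", "assistant_query"
    interactive: bool  # is a user actively waiting on the answer?

def route(task: Task, network_available: bool) -> str:
    # Background summarization can run slowly on the local Neural Engine.
    if not task.interactive:
        return "on-device"
    # Interactive requests prefer the bigger server model, but degrade
    # gracefully to a local model when offline (the "power outage" case).
    return "cloud" if network_available else "on-device (smaller model)"

print(route(Task("summarize_email", interactive=False), network_available=True))
print(route(Task("assistant_query", interactive=True), network_available=False))
```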
 
You can use it to assist you, but if you don't know what you're doing and AI outputs garbage, you will think the garbage is good and consume it. In other words, even though AI may be capable of doing things you cannot, unless you know what the expected output is, you may end up doing things completely wrong...which is worse than not doing them at all.

Right. Generative AI is just a tool for making whatever it is you're making. Like a hammer for nails. If you don't know what you're doing in either instance you stand to harm yourself.
 
You are all over the place lol. Were you comparing the M1 Max with the H100 or the H200? I brought up the Ultra because of the server conversation. A 4090 is a fair comparison for a prosumer workstation with an M1/M2 Max or even an Ultra. Either way, Siri isn’t Apple AI/ML; I couldn’t care less if they cancel Siri tomorrow. AI at Apple is a lot more than Siri; in fact, you don’t really hear much about Siri in Apple’s AI literature/GitHub or tech documentation.
You are still talking about a gaming GPU and comparing it to your basement workstation; guess who is all over the place.
Apple is trying to compete in AI against H200s and soon B200s with their outdated chips (at least according to this article).
Siri is just a living corpse, ofc we don't hear about it.
 
I was talking about their H200 and soon B200 series, not gaming GPUs, lol.
H200 VRAM: up to 141 GB HBM3e @ 6.5 Gbps
And the Mac's is unified RAM, not dedicated VRAM, so it's shared with everything else, and it maxes out at 400 GB/s of bandwidth. I'm pretty sure that's even slower than an RTX 3090's VRAM (as expected, ofc).
Apple is currently busy selling their last-generation $1,800 laptop with 8 GB of RAM, which is less RAM than the majority of Android phones have. I don't think those glorious days will come anytime soon.
Sure, those GPUs are much more expensive, but the M2 Ultra is around 31.6 TOPS while the H200 is around 3,900 TOPS, and soon the B200 at around 20,000 TOPS.
Comparing the M2 Ultra to the H200 in AI is more like comparing a GT 710 to an RTX 4090 in gaming.
What if they can attach many M2 Ultras together? Maybe they can now. How much does an H200 cost compared to one Ultra?
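To put the thread's own numbers side by side, here's a quick ratio check. It uses only the TOPS figures quoted above, which mix precisions and marketing conventions, so treat it as rough:

```python
# Ratio check using only the TOPS figures quoted in this thread. These mix
# precisions/vendor marketing numbers, so the ratios are rough at best.
tops = {
    "M2 Ultra": 31.6,
    "H200": 3_900,    # "3900ish" per the post above
    "B200": 20_000,   # as quoted above
}

for chip in ("H200", "B200"):
    ratio = tops[chip] / tops["M2 Ultra"]
    print(f"{chip} ~ {ratio:.0f}x an M2 Ultra by this metric "
          f"(i.e. ~{ratio:.0f} Ultras ganged together to match one)")
```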
 
Maybe just use Nvidia GPUs that are like 50 times faster and are actually built for this type of workload?
Same chip architecture... both Apple and Nvidia use ARM-based chip designs. Apple decided to focus on efficiency and thermal management, so Apple is more than capable of designing AI chips.
 
Probably worth remembering, too, that Apple has promised to be carbon neutral and is rolling out solar power for its operations. An Nvidia 4090 or whatever may have more power in certain situations, but if you are paying for all of that in green energy, then Apple's M chips start looking more attractive. It also helps insulate Apple from the AI gold rush that is coming and will drive other component prices sky-high.
 
Apple needs to do something with Siri pretty soon. It really is just about useless, to the point of being a joke, with the exception of bungling text messages sent while driving.
 
Seems solid to scale.

  • A14 Bionic (iPad 10): 11 Trillion operations per second (OPS)
  • A15 Bionic (iPhone SE/13/14/14 Plus, iPad mini 6): 15.8 Trillion OPS
  • M2, M2 Pro, M2 Max (iPad Air, Vision Pro, MacBook Air, Mac mini, Mac Studio): 15.8 Trillion OPS
  • A16 Bionic (iPhone 15/15 Plus): 17 Trillion OPS
  • M3, M3 Pro, M3 Max (iMac, MacBook Air, MacBook Pro): 18 Trillion OPS
  • M2 Ultra (Mac Studio, Mac Pro): 31.6 Trillion OPS
  • A17 Pro (iPhone 15 Pro/Pro Max): 35 Trillion OPS
  • M4 (iPad Pro 2024): 38 Trillion OPS
The TOPS benchmark metric isn't particularly useful because there's no standardization as to what constitutes an "operation per second."

The M4 does 38 TOPS using INT8, whereas the M3's 18 TOPS are measured at INT16, meaning each of the M3's operations handles twice the data. Cut down to raw firepower, the M4 only has about a 5% AI/ML advantage over the M3 when the TOPS are equalized.
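Just to make that equalization explicit, here's the arithmetic, assuming a straight halving of the INT8 figure is the right way to compare (itself a simplification):

```python
# Equalizing the quoted figures: halve the M4's INT8 TOPS to get an
# INT16-comparable number (a simplification; real workloads will differ).
m4_int8_tops = 38
m3_int16_tops = 18

m4_int16_equivalent = m4_int8_tops / 2              # 19 "INT16" TOPS
advantage = m4_int16_equivalent / m3_int16_tops - 1

print(f"M4 equalized: {m4_int16_equivalent} TOPS")  # 19.0 TOPS
print(f"Advantage over M3: {advantage:.1%}")        # ~5.6%
```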
 
Maybe just use Nvidia GPUs that are like 50 times faster and are actually built for this type of workload?
I feel like Apple doing their own thing is better. Everyone is using Nvidia. Some competition is needed, as they have a bit of a stranglehold on the market. Hence their share price.
 
I feel like Apple doing their own thing is better. Everyone is using Nvidia. Some competition is needed, as they have a bit of a stranglehold on the market. Hence their share price.
Google is also using their own hardware for Gemini's LLM as far as I know, but they used TPUs, custom-designed for machine learning, not Chromebook hardware.
 
It's the next 3D TV, AR/VR, or foldable phone: a dumb trend that no one will care about in a few years.
At this point this must be just trolling. AI from large models will be a component of every job and every creative profession in a few years. It’s like calling statistics a toy and a fad.
 
It seems crazy to use M2 chips to power AI when the M3 and M4, by Apple's own admission, come with a hugely upgraded Neural Engine. What on earth are they thinking?! Maybe there's just a huge surplus of unsold M2 Ultra chips that they need to do something with...
 