The most critical line in this article. The "fees" prompted Apple to shop elsewhere. That's why we won't be getting great things from Apple anymore.
Maybe they could load OpenAI models onto the iPhone, doing the calculations on the phone and so bypassing the patent, just like with Masimo.
Excellent post!
Please don't talk about things you know nothing about.
1. Why is Apple working with Google in this space? Because Apple trains its (MULTIPLE) models on TPUs.
2. Apple has at least two known "sexy" models, along with multiple more specialized models. The two sexy models are called AFM (Apple Foundation Model). They are both *combined* language and vision models.
The on-device model is ~3B parameters, quantized to about 2 bits per parameter.
The server model runs on Apple HW (the GPU of what are presumably M2 or M3 Ultras) and has about 200B parameters, quantized to about 3.5 bits per parameter.
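Back-of-the-envelope arithmetic (mine, not Apple's): ~3B parameters at ~2 bits each is roughly 0.75 GB of weights, which fits comfortably on a phone; ~200B parameters at ~3.5 bits is on the order of 90 GB, which is why that model lives on Apple's servers rather than on the device.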
Both are competitive with counterparts *at this size* (not obviously worse or obviously better), though that is definitely not the largest size out there; guesses are that, e.g., GPT-5 is ~8 times larger.
Unusual (or at least non-obvious) aspects of these models include
- combined text and vision rather than two separate models
- multilingual, handling about 15 languages
- somewhat curated training rather than a raw flood of internet text (unclear if this curation helps or hinders performance, but it is there)
- emphasis on "IQ 100" level training not advanced training or reasoning. Apple wants the model to answer sensibly if you want to split tips, but does not [for now...] care how it responds if you give it your calculus homework
3. BY FAR the most important difference of these models, compared to other models, is that they have been trained to handle a very specific task: a developer can, to simplify immensely, use natural and simple Swift to construct a query to be given to the LLM, and get back a response that is defined in terms of the structs and APIs of the calling app. This is not the same thing as "ask the LLM a question and get back text", and it's not the same thing as "ask the LLM a question and it gives you code that, hopefully, you can compile to do what you want".
The idea is that in a random app, like Uber Eats, I can say "I'd like to order that food we had two weeks ago, it was Asian, Thai I think, but I can't remember the name" and this will result in Uber Eats throwing up a useful order that I can click on. Look at what's involved here: the query has to go into the LLM (so that it can be "understood"), Uber Eats also has to provide the database of recent orders (so that the LLM has a clue what was ordered over the relevant time period), and the response can't just be a text string like "looks like you ordered Satay Chicken, and by the way that's Indonesian not Thai"; it has to be some sort of structure that plugs into Uber Eats' APIs to allow the construction of a genuine order.
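For a sense of what this looks like in practice, here is a rough sketch along the lines of the Foundation Models framework Apple showed at WWDC. The ReorderSuggestion type and the order data are invented for illustration, and the exact signatures may differ from what ships:

```swift
import FoundationModels

// Hypothetical structured result the app wants back from the model.
// @Generable / @Guide are the guided-generation macros shown at WWDC;
// ReorderSuggestion itself is made up for this example.
@Generable
struct ReorderSuggestion {
    @Guide(description: "Identifier of the past order that best matches the request")
    var orderID: String

    @Guide(description: "Name of the dish, taken from the order history")
    var dishName: String
}

func suggestReorder() async throws -> ReorderSuggestion {
    // In a real app this would come from the app's own order database.
    let recentOrders = """
    [{"orderID": "A17", "dish": "Chicken Satay", "date": "2025-08-29"},
     {"orderID": "B42", "dish": "Margherita Pizza", "date": "2025-09-02"}]
    """

    let session = LanguageModelSession(
        instructions: "Match the user's request against their recent orders."
    )

    let response = try await session.respond(
        to: """
            Request: "I'd like to order that food we had two weeks ago, it was Asian, Thai I think."
            Recent orders: \(recentOrders)
            """,
        generating: ReorderSuggestion.self
    )

    // response.content is a typed ReorderSuggestion rather than free text,
    // so it can be handed straight to the app's existing ordering APIs.
    return response.content
}
```

The point of the sketch is the return type: the app gets back a value it defined, not prose it has to parse.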
No-one else has anything like this. THIS is what I mean when I say that Apple is constructing an AI UI, not an AI model.
4. So the position Apple finds itself in is that
- it's trying to figure out how to use LLMs AS an API.
- this is a research task, so it makes no sense to try to do this research at the same time as you're constantly modifying an LLM that costs $100M per training run! Instead you fiddle around with tiny models until you think you have the basic concepts required for the APIs (and their pieces within the OS runtime and Swift) all working.
- then you scale this up to a mid-sized model and validate that it still works.
- are they scaling it up to a much larger model? Maybe. Or maybe there is no point in doing that until they get a year or so of experience with this machinery and see what needs to be improved, changed, or discarded.
Apple is not playing the same game as Gemini, OpenAI, etc. And it doesn't need to, just as Apple is not trying to compete with Google as a search engine. As long as the leading-edge LLMs continue to provide as good an experience on Apple as they do anywhere else, no-one feels any need to stop buying Apple HW just to get "optimal" ChatGPT.
This is all described in Apple's own published reports on its foundation models, and aspects of the Swift API were demo'd, with examples, at multiple talks at WWDC earlier this year.
Apple did something very stupid last year with the announcement of Apple Intelligence before it was ready and before all this infrastructure was in place. That tells us that Apple Marketing were stupid in this case, and should have followed their normal rule of **** until you're ready to ship. But Apple Marketing is not Apple Engineering, and Apple Engineering have a plan in place grander than anything you can imagine.
This is the same company that acquired Siri 15 years ago, yet they never really pushed it forward in a meaningful way.
GIGO.
Here is a screenshot I took a little while ago. I'm not sure whether to laugh or cry!
The conflation of Google training a model and Gemini somehow being involved is erroneous. Google's models are called Gemma, not Gemini. Different things, and only a fine-tuned Gemma makes sense here.
It's only been a week and here we are getting the announcement of another red carpet being rolled out … this time for a privacy invader, by a privacy protector who wants to win the AI Nobel Prize at all costs.
I'm sure Gemini has had no problems of its own. How are those glued pizzas with small rocks?
Why would Apple want to integrate a manipulated AI model that heavily leans into conspiracy theories and neo-Nazism? Apple should indeed stay as far away as possible.
Too many R&D dollars spent on refining the poop emoji. True brilliance, the way that emoji turned out!
Apple Intelligence will never be able to stand on its own; they neglected research for too long and have now accumulated a technological gap that will never be filled, especially considering the few employees with some knowledge of the topic are being poached by other companies. That being said, Grok and Musk are toxic at this point; Apple should steer clear.
Apple didn't have cellular modem expertise, so they bought the Intel modem division for $1 billion and hired all of Qualcomm's top cellular modem engineers. Apple will just buy you up 💰
Not that insane. Apple has a long tradition of partnering with other companies. For a long time, Apple relied on Google to power its iOS Maps app. Eventually they ended up making their own Maps app, but it took them almost a decade.
Before Apple made iCloud, Steve Jobs wanted to buy Dropbox but they refused his offer.
Apple had their focus mainly on the Apple Vision Pro and the Apple Car project. Cook even admitted that the Vision Pro was a niche product and would not sell a ton in the beginning. They had top-secret labs 🥼 for the Vision Pro and should've had top-secret labs 🧪 🧫 for AI.
How does a company sitting on a trillion dollars of cash fall so far behind in the AI race? Genuine question. Did everything go into that absolute FLOP of a product known as Vision Pro?
In its current state, yes it is. It can't hold a candle to Gemini on Android.
Nope.
Dude, screw Nazi Grok.
Does Tim Cook really hate Elon so much that he'd rather partner with the tech devil than temporarily lean on Grok until Apple Intelligence can stand on its own? 🤔
You're absolutely right. It's pretty crazy how many people forget that Apple has been at the forefront of machine learning for a very long time. I haven't checked recently, but a while back Apple put a lot of early small models on Hugging Face that people could download and try themselves. They're not ignorant of the AI or model-training space at all; they've been pioneers in it. They just haven't gotten to a point where they're comfortable shipping what they have, and so are leaning on these third parties in the meantime.
It definitely shouldn't be ruled out; it's arguably the most advanced model publicly available today and, like the others, is continually improving. However, it's not really an "Apple" type of model. Apple is all about presenting a "safe" and "family friendly" face to things, and a model that's designed to be somewhat less censored than most others (although still heavily censored), like Grok, isn't going to be at the top of their list. Apple's not going to be interested in shipping a Siri-like replacement product without extreme censorship, and xAI may not be willing to devote resources to training a special and "safely" censored model just for Apple.
Does Tim Cook really hate Elon so much that he'd rather partner with the tech devil than temporarily lean on Grok until Apple Intelligence can stand on its own? 🤔
This exactly. I buy Apple stuff in part because I want to avoid Google like the plague.
I have VERY mixed feelings about my iPhone having Google rolled in as the "backbone" of anything.
I guess you don't know that Apple stores LARGE chunks of info on both Google's and Amazon's cloud systems?
I have VERY mixed feelings about my iPhone having Google rolled in as the "backbone" of anything.
You're stretching Siri to its limits - well done, Siri super user!
Yep - about all I use Siri for these days. Cooking timers, alarms for work meetings, etc. lol
I wouldn't be shocked if Apple merges with Google in the next ten years, with Google as the senior partner in this.
I wouldn't be surprised if some government in the future forced Apple and other companies to do just that, if Apple doesn't do it on its own.