Please don't talk about things you know nothing about.
1. Why is Apple working with Google in this space? Because Apple trains its (MULTIPLE) models on TPUs.
2. Apple has at least two known "sexy" models, along with several more specialized ones. The two sexy models are both called AFM (Apple Foundation Model), and both are *combined* language and vision models.
The on-device model is ~3B parameters, quantized to about 2 bits per parameter.
The server model runs on Apple HW (the GPUs of what are presumably M2 or M3 Ultras) and has about 200B parameters, quantized to about 3.5 bits per parameter.
Both are competitive with (not obviously worse or better than) counterparts *at this size* (which is definitely not the largest size; guesses are that e.g. GPT-5 is ~8 times larger).
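Back-of-the-envelope on what those sizes and quantization levels imply for raw weight memory (my arithmetic, not Apple's published figures):

```swift
// Rough weight-only memory footprint implied by the sizes above.
// Ignores activations, KV cache, and runtime overhead.
func weightMemoryGB(parameters: Double, bitsPerParameter: Double) -> Double {
    parameters * bitsPerParameter / 8 / 1_000_000_000  // bits -> bytes -> GB
}

let onDevice = weightMemoryGB(parameters: 3e9,   bitsPerParameter: 2.0)  // ~0.75 GB
let server   = weightMemoryGB(parameters: 200e9, bitsPerParameter: 3.5)  // ~87.5 GB
```

Which is roughly why a ~3B model at ~2 bits can live comfortably in a phone's RAM, while the ~200B server model wants Ultra-class hardware.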
Unusual (or at least non-obvious) aspects of these models include
- combined text and vision rather than two separate models
- multilingual, handling about 15 languages
- somewhat curated training rather than a raw flood of internet text (unclear if this curation helps or hinders performance, but it is there)
- emphasis on "IQ 100"-level training, not advanced training or reasoning. Apple wants the model to answer sensibly if you want to split tips, but does not [for now...] care how it responds if you give it your calculus homework
3. BY FAR the most important difference between these models and other models is that they have been trained to handle a very specific task: a developer can, to simplify immensely, use natural and simple Swift to construct a query for the LLM, and get back a response that is defined in terms of the structs and APIs of the calling app. This is not the same thing as "ask the LLM a question and get back text", and it's not the same thing as "ask the LLM a question and it gives you code that, hopefully, you can compile to do what you want".
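To make that concrete, here is a minimal sketch of the idea. It follows the general shape of the FoundationModels framework Apple showed at WWDC 2025, but treat the exact type and method names as my assumptions from memory rather than documentation; the struct and function below are invented purely for illustration.

```swift
import FoundationModels

// Sketch: the app declares, in its own vocabulary, the shape of the answer
// it wants back. @Generable + @Guide let the model fill this struct directly
// (guided generation), so the app never has to parse free-form text.
@Generable
struct ReorderSuggestion {
    @Guide(description: "Restaurant name, taken from the user's order history")
    var restaurant: String

    @Guide(description: "Dishes to place in the cart")
    var dishes: [String]
}

func suggestReorder(from request: String) async throws -> ReorderSuggestion {
    let session = LanguageModelSession(
        instructions: "Match the user's request against their past orders."
    )
    // The response is a ReorderSuggestion, not a string.
    let response = try await session.respond(to: request, generating: ReorderSuggestion.self)
    return response.content
}
```

The point is that the return type is the app's own struct, produced directly by the model, with no text parsing in between.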
The idea is that in a random app, like Uber Eats, you can say "I'd like to order that food we had two weeks ago, it was Asian, Thai I think, but I can't remember the name" and this will result in Uber Eats throwing up a useful order that you can click on. Look at what's involved here: the query has to go into the LLM (so that it can be "understood"), Uber Eats also has to provide the database of recent orders (so that the LLM has a clue what was ordered over the relevant time period), and the response can't just be a text string like "looks like you ordered Satay Chicken, and by the way that's Indonesian not Thai"; it has to be some sort of structure that plugs into Uber Eats' APIs to allow the construction of a genuine order.
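And a sketch of the other half, under the same caveats: the app hands the model access to its own data (here, order history) as a tool it can call while building the structured answer. OrderHistoryTool, PastOrder and recentOrders are illustrative stand-ins, not anyone's real code, and the Tool/ToolOutput shapes follow the initial WWDC 2025 beta, which may have changed since.

```swift
import Foundation
import FoundationModels

// Illustrative stand-in for the app's own data layer.
struct PastOrder { let date: Date; let restaurant: String; let dishes: [String] }
func recentOrders(withinDays days: Int) -> [PastOrder] { /* the app's real DB query */ [] }

// Sketch: the app exposes its order history to the model as a tool, so the
// model can look up "what did I order two weeks ago" instead of guessing.
struct OrderHistoryTool: Tool {
    let name = "orderHistory"
    let description = "Returns the user's recent orders with dates and dishes"

    @Generable
    struct Arguments {
        @Guide(description: "How many days of history to search")
        var days: Int
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        let lines = recentOrders(withinDays: arguments.days).map {
            "\($0.date): \($0.restaurant): \($0.dishes.joined(separator: ", "))"
        }
        return ToolOutput(lines.joined(separator: "\n"))
    }
}

// The session gets the tool, so the model can consult real order history
// while filling in the ReorderSuggestion struct from the previous sketch.
func makeReorderSession() -> LanguageModelSession {
    LanguageModelSession(
        tools: [OrderHistoryTool()],
        instructions: "Help the user rebuild a past order from their history."
    )
}
```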
No-one else has anything like this. THIS is what I mean when I say that Apple is constructing an AI UI, not an AI model.
4. So the position Apple finds itself in is that
- it's trying to figure out how to utilize LLMs AS API.
- this is a research task, so it makes no sense to try to do this research at the same time as you're constantly modifying an LLM that takes $100M per training run! Instead you fiddle around with tiny models, until you think you have the basic concepts required for the APIs (and their pieces within the OS runtime and Swift) all working.
- then you scale this up to a mid-sized model and validate that it still works.
- are they scaling it up to a much larger model? Maybe. Or maybe there is no point in doing that until they get a year or so of experience with this machinery and see what needs to be improved, changed, or discarded.
Apple is not playing the same game as Gemini, OpenAI, etc. And it doesn't need to, just as Apple is not trying to compete with Google as a search engine. As long as the leading-edge LLMs continue to provide as good an experience on Apple hardware as they do anywhere else, no-one feels any need to stop buying Apple HW just to get "optimal" ChatGPT.
This is all described in Apple's own public materials: aspects of the Swift API were demo'd, and examples given, at multiple talks at WWDC earlier this year.
Apple did something very stupid last year with the announcement of Apple Intelligence before it was ready and before all this infrastructure was in place. That tells us that Apple Marketing were stupid in this case, and should have followed their normal rule of **** until you're ready to ship. But Apple Marketing is not Apple Engineering, and Apple Engineering have a plan in place grander than anything you can imagine.