There's been a notable shift towards implementations utilizing a mixture of smaller models to get work done, so it's surprising that Apple feels they need to choose just one model for the entirety of Siri, but I suppose the answer is — fine tuning and consistency. It would be disastrous if the output of Siri gave 10 different results for 10 different people, especially when having direct access to one's device.
Apple has not been using the “one, singular model to rule them all” approach at all.
This goes back to before WWDC 2025, but it was fleshed out in more accessible documentation then.
“Introducing Apple’s On-Device and Server Foundation Models”
At the 2024 Worldwide Developers Conference, we introduced Apple Intelligence, a personal intelligence system integrated deeply into iOS 18…
machinelearning.apple.com
(Gemma is Google’s open-source derivative of its Gemini work.)
Note that “Apple Foundation Models” is itself plural. From the joint statement Google and Apple released:
“Apple and Google have entered into a multi-year collaboration under which the next generation of Apple Foundation Models will be based on Google's Gemini models and cloud technology. These models will help power future Apple Intelligence features, including a more personalized Siri coming this year. …
…
Apple Intelligence will continue to run on Apple devices and Private Cloud Compute, while maintaining Apple's industry-leading privacy standards.…”
blog.google
Again, note the use of the plural form of ‘model’.
Access to the models is all wrapped up in an Apple library API. Where your inputs/outputs go is controlled by Apple. Whether a request is dispatched locally or to the cloud is controlled by Apple. (Similar to how you call an Apple library for ProRes RAW processing and Apple decides whether it goes to a hardware accelerator or to a 100% software implementation.)
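To make that concrete, here is a minimal Swift sketch of how an app calls the developer-facing Foundation Models framework Apple introduced around WWDC 2025. The type and method names (SystemLanguageModel, LanguageModelSession, respond(to:)) are taken from Apple's public material, but treat the snippet as illustrative rather than authoritative; the key point is that the caller never names a specific model or says where inference runs.

```swift
import FoundationModels

enum SummaryError: Error { case modelUnavailable }

// The app never picks a model or an execution target; it asks Apple's
// framework for a session and the framework decides how the request is serviced.
func summarize(_ note: String) async throws -> String {
    // Whether the model is usable at all is also Apple's call
    // (device capability, Apple Intelligence enabled, etc.).
    guard case .available = SystemLanguageModel.default.availability else {
        throw SummaryError.modelUnavailable
    }

    let session = LanguageModelSession()
    let response = try await session.respond(to: "Summarize this note: \(note)")
    return response.content
}
```

Contrast that with SDKs where you pick a model name and an endpoint yourself; here the routing is entirely Apple's, much like the ProRes RAW case above.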
If anything, Apple has been doing more to construct the smallest possible models since before that got trendy. Concentrating on the smallest is one reason they have trailed the ‘piled higher and deeper’ options that are so big they are mainly mega-server only.
(Even Private Cloud Compute is not trying to be a “drain the local power grid completely” solution.)
This Gemini solution is ultimately temporary, and it's a shame people are leaving the company.
It may not be all that temporary. Some of the more specialized mini-models, for tasks specific to Apple, stay with Apple. But basic speech in/out across a high number of human languages? Apple may not touch that over the very long term.
A year from now, Siri will be just as good as the best AI tech out today, and two years from now, Siri will be ahead. That's my prediction.
Shades of the “five nodes in four years” talk coming from Intel several years ago. For Apple, “just as good as” is still a large win for the next 3-4 years.
There is a ton of training data Apple would have to grapple past their own privacy and copyright protocols to use before they could surpass what the mainstream has.
I doubt Apple wants to spend max money on a cost sink trying to surpass everyone else’s investment there. They would rather have deeper leverage on inference, where revenue is possible.
I hope that Apple is giving serious consideration to physical AI — Robots. That's ultimately where the big future opportunity is.
No. Apple does not build industrial tools for other people’s factories. It isn’t their core competency. There is a broad set of stuff that will be enabled.
The “Tesla is dumping cars to do robots” line is Kool-Aid that Apple should stay far away from.