Why all this fuss when you can simply use ChatGPT?
Because Apple Intelligence is totally bespoke, using an on-device and hybrid-cloud model that Apple built. The integrations are designed to serve various purposes, pretty much none of which is being an unreliable, hallucinating chatbot.
The only time ChatGPT is used is if Siri is asked a query it cannot answer. Previously, it'd give you search engine suggestions that would likely give reasonable results. Now it will instead ask ChatGPT to answer the query (after asking your permission, in theory only on the very first attempt). Then you have to go to a search engine anyway, because obviously you cannot trust LLM output given its propensity to hallucinate. Chances are, if you're asking a question, it's because you don't know the answer, so you have no way of knowing whether the LLM is telling lies without verifying the output yourself.
TL;DR, ChatGPT remains largely useless and unreliable, unless you want to generate bland pseudo-fictional prose that screams "this was written by an LLM because I couldn't be arsed writing it myself".
The “Siri understands context” thing is an absolute joke at best or a lie at worst.
I asked Siri where someone in my Find My is and they gave me the answer. I asked “what’s the weather like there?” and they gave me MY weather.
I asked “what’s the weather like where <Person> is?” and they gave me my weather again.
[...] how can you not even figure out such a simple request??
Well, yes. At what point in any of the non-hype reports about AI performance have you been led to believe that any of the crapware being sold by these for-profit, self-interested companies is actually reliable? The whole fundamental basis of LLMs is not solid and never will be. GPT-4 (not 4o) is about as good as we'll ever see, because it's already been trained on just about all text ever written. The Apple models are dramatically smaller, and so are guaranteed to be dramatically less accurate.
A standard machine learning model from before the LLM/GenAI hype era, or even just a simple rules engine, would do a far better job, as shown by Alexa outperforming Siri for a few years before the LLM hype train started rolling.