Google Gemini Could Be the Ceiling on Apple's AI Ambitions

Robert.Walter · Jun 8, 2026

I'm not too impressed with Gemini ... or the others ... do anything relatively complex with lots of facts and they give erroneous info like all the time. It's amazing how much trouble they have with static facts like history, legal statutes (missing key passages in a statute under discussion), and settled law.

If I hadn't been relatively deep into the details, I'd have been on the wrong track multiple times.

Call Gemini out on its serial failures and it will freely admit hallucination, apologize, promise to take care and do better, then make serious mistakes in the next reply or two.

TBH, I'm going to have to go deeper, and then return to first principles and verify all the info I've collected because at this point I'm not sure what's correct, wrong, invented or missing.

mannyvel · Jun 8, 2026

Robert.Walter said:
I'm not too impressed with Gemini ... or the others ... do anything relatively complex with lots of facts and they give erroneous info like all the time. It's amazing how much trouble they have with static facts like history, legal statutes (missing key passages in a statute under discussion), and settled law.

I pay for Gemini. It's the only Google service I use, and IMO it's the best so far. Not sure how you're prompting, but I cross-check about half of its answers and they're spot-on. I've only caught it a few times, and in those cases it was parroting the conventional wisdom (which happens to be wrong).

In any case it can't be worse than Siri, which can hardly do anything these days.

Hajj.david · Jun 8, 2026

Robert.Walter said:
I'm not too impressed with Gemini ... or the others ... do anything relatively complex with lots of facts and they give erroneous info like all the time. It's amazing how much trouble they have with static facts like history, legal statutes (missing key passages in a statute under discussion), and settled law.

If I hadn't been relatively deep into the details, I'd have been on the wrong track multiple times.

Call Gemini out on its serial failures and it will freely admit hallucination, apologize, promise to take care and do better, then make serious mistakes in the next reply or two.

TBH, I'm going to have to go deeper, and then return to first principles and verify all the info I've collected because at this point I'm not sure what's correct, wrong, invented or missing.

Use the right tool for the job. There are specific legal AIs that are able to churn through case law and statutes. Lexis+ Protégé is a big one. Gemini is garbage for general knowledge, Grok to me is the most accurate for history and scholarly research, and Claude is best for coding and software development.

McScooby · Jun 8, 2026

I don't get it, I really don't. Gemini's results can be better or worse than others depending on what you do with it. And this isn't like Apple users using Google Maps, where Google can get an insight into the data; this is providing Google with a whole roadmap of where Apple products are headed, certainly at least in the interim.

I'd expect Apple to play this safe and use Gemini for additional functionality to placate the markets and users without showing their hand too much about where we're headed. If it's all-in, it's a mistake; if it's a placeholder, that's OK. But of course this is something we won't know for sure until further down the line.

montuori · Jun 8, 2026

bsolar said:
Not sure which tools you are using and how, but modern AI coding tools can be very powerful. They don't replace human expertise, but they can definitely do a lot, especially when used in a disciplined process.

I could not agree more with this, especially "disciplined process." With the right "team" of agents and sub-agents the quality of work produced is -- dare I say, it being WWDC day after all -- magical. Like any tool it requires practice to use effectively.

My favorite Gemini prompt of late is "You're an antagonist red team QA engineer; please evaluate this PR request." In a fresh agent with no context this always returns something, often very subtle security issues that would probably not have been found by humans (or even fairly rigorous fuzz testing). Perhaps all this is unnecessary if you work in a shop with an actual antagonistic red team (I miss the glory days when this was a thing) but for the rest of us, the coding agents serve as a very useful proxy.

I'm bullish on Apple Intelligence. Two years ago I was less enthused if only because the results of the LLM chugging were not predictable enough to unleash on tasks with real world consequences. A lot has changed since. (Hell, a lot has changed in the last six months: this technology is changing fast.)

Le0M · Jun 8, 2026

Paradoxally said:
The technology is not there yet to allow capable models that run fully on-device for devices like smartphones.

And when the average person's term of comparison is a cloud model (ChatGPT, Gemini, Claude) it doesn't bode well when they realize it's far inferior - as Apple Intelligence is today, and any mildly complex request will get forwarded to ChatGPT anyway.

Sure, I was talking about these things as an objective for Apple. Definitely not happening on any OS 27 release.

Le0M · Jun 8, 2026

jimbobb24 said:
On-device AI is what we have now and it sucks. Don't have the memory bandwidth and processor speed for that to be great. Siri also started off-device until tech caught up.

It might take a few years, but it's definitely gonna be possible.

McScooby · Jun 8, 2026

Le0M said:
It might take a few years, but it's definitely gonna be possible.

It existed well before Siri, with Voice Control, before we were told the phones weren't powerful enough for on-device models. What else is the same as it was 18 years ago?

Brother Cavil · Jun 8, 2026

Wizec said:
Gemini has been hopelessly wrong for me for weeks now. I’ll ask it a question, then press it because it’s wrong and the next response is, “you’re absolutely right, thanks for catching my mistake…” this does not bode well for the short term.

I've had the same issue with Gemini. It gives me a lot of wrong answers and then apologizes profusely. What's the point?

Brother Cavil · Jun 8, 2026

ProbablyDylan said:
They really should've just bought Anthropic before their valuation blew up

This 100%. Being dependent upon a third party for what is, increasingly, becoming an essential technology is problematic on many levels. For a company that is so determined to do everything itself, this is strange. Apple has poured billions into building their own cellular modem, yet they hand Google the keys to the AI kingdom? Seems very shortsighted.

dropadrop · Jun 8, 2026

The whole premise around SaaS companies' valuations dropping is that AI will enable new startups to implement equivalent services and take market share. That is, AI will enable them to do new things more easily than before.

Note, not that Google will suddenly be the only company on earth.

Somehow when it comes to Apple, it’s suddenly assumed that this is not the case, and they would need their own frontier model.

I’m assuming that at the end of this, business students will be analyzing it and wondering how Apple had the sense to skip this initial arms race.

bsolar · Jun 8, 2026

DavidLeblond said:
We have a Copilot subscription so it's GitHub Copilot plugged into Visual Studio and Visual Studio Code (depending on my use case, I use both). Not sure of the exact model, but I think it's Claude.

Typically that kind of AI is helpful for small tasks only. It can be useful to explain or implement small code snippets, but it will struggle when trying to do more. I even disabled the IDE AI autocompletions as I don't think they are that helpful. These are more "give me some quick help while I code" tools than proper agentic ones IMHO.

If you want to try some better agentic tools, I suggest either Claude Code or OpenCode.

Claude Code is kind of the gold standard at the moment but you will need a different subscription. It offers IMHO the currently best LLM coding models and large context windows. At my company it's going to be the tool of choice for AI-driven software development over Copilot.

OpenCode is an open source alternative which can be used with a number of different subscriptions. The advantage is that you can use it with the existing Copilot subscription although it will be either limited to older models or newer models with relatively low request caps as that's what Copilot offers. In general there will be much smaller context windows too. It will not be as good as Claude Code.

The best approach is to open the tool in a project repository and prompt to make a plan to do something, e.g. implement feature x. The tool should analyze the code, probably ask some questions and formulate a plan. Review the plan and if satisfactory ask to implement it. If the plan seems too complex, revise it to be less ambitious. You can even ask the AI for that and it should provide some proposals.

There are some "plugins" that can make it even more structured, e.g. OpenSpec or Superpowers. Once installed in the tool of choice you can do e.g. openspec->propose to make a structured plan and openspec->apply to implement it, or superpowers->brainstorm and superpowers->tdd, if you want a brainstorming first and a test-driven-development implementation. I especially like OpenSpec since proper specification documents are a must for the kind of software I work on.

PS: another alternative if you have the right hardware is to install a local LLM and use it in OpenCode instead of the Copilot LLMs. E.g. ollama is very easy to set up and you can use some quite powerful models if you have enough unified memory on a powerful enough Apple Silicon machine.
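
For anyone curious what "a local LLM" looks like in practice, here is a minimal sketch of sending a prompt to a locally running ollama server from Python. It assumes ollama is already installed and serving on its default local port, and that a model has been pulled; the model name "llama3" and the helper names build_request and ask are placeholders of mine, not anything from OpenCode. This only shows talking to the local server directly, not how OpenCode itself is configured to use it.

```python
import json
import urllib.request

# ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3"):
    """Build the HTTP request for a single, non-streamed completion.

    The model name is a placeholder: use whatever you pulled locally.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running ollama server with the model available):
# print(ask("Explain what a unit test is in one sentence."))
```

Nothing leaves the machine in this setup, which is the main appeal over a cloud subscription; the trade-off is that speed and quality depend on how much memory the model can use.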

DavidLeblond · Jun 8, 2026

bsolar said:
Typically that kind of AI is helpful for small tasks only. It can be useful to explain or implement small code snippets, but it will struggle when trying to do more. I even disabled the IDE AI autocompletions as I don't think they are that helpful. These are more "give me some quick help while I code" tools than proper agentic ones IMHO.

If you want to try some better agentic tools, I suggest either Claude Code or OpenCode.

Claude Code is kind of the gold standard at the moment but you will need a different subscription. It offers IMHO the currently best LLM coding models and large context windows. At my company it's going to be the tool of choice for AI software development over Copilot.

OpenCode is an open source alternative which can be used with a number of different subscriptions. The advantage is that you can use it with the existing Copilot subscription although it will be either limited to older models or newer models with relatively low request caps. In general there will be much smaller context windows too. It will not be as good as Claude Code.

The best approach is to open the tool in a project repository and prompt to make a plan to do something, e.g. implement feature x. The tool should analyze the code, probably ask some questions and formulate a plan. Review the plan and if satisfactory ask to implement it. If the plan seems too complex, revise it to be less ambitious. You can even ask the AI for that and it should provide some proposals.

There are some "plugins" that can make it even more structured, e.g. OpenSpec or Superpowers. Once installed in the tool of choice you can do e.g. openspec->propose to make a structured plan and openspec->apply to implement it, or superpowers->brainstorm and superpowers->tdd, if you want a brainstorming first and a test-driven-development implementation. I especially like OpenSpec since proper specification documents are a must for the kind of software I work on.

PS: another alternative if you have the right hardware is to install a local LLM and use it in OpenCode instead of the Copilot LLMs. E.g. ollama is very easy to set up and you can use some quite powerful models if you have enough unified memory on a powerful enough Apple Silicon machine.

Yeah, thanks to DOGE my work isn't going to be able to afford anything other than what Microsoft provides.

bsolar · Jun 8, 2026

DavidLeblond said:
Yeah, thanks to DOGE my work isn't going to be able to afford anything other than what Microsoft provides.

You can still do some decent work with OpenCode + the LLMs offered in Copilot. They will be slower and unable to handle more complex, comprehensive tasks, but should still do well if the tasks are not too ambitious.

switz · Jun 8, 2026

My Siri needs did not change with this Apple dog and pony show today. 🙄

Every morning I need an 11-minute, 15-second timer to signal that the coffee is brewed.

Siri is able to do that reliably. 😱

So Siri meets my expectation. 😎😀

atonaldenim · Jun 8, 2026

I'm not much of an AI user, I don't know what Gemini can do on Android, but I'm pretty impressed with what I saw in the keynote today. If my iPhone will be able to do those things, I'll be happy. (after I upgrade to a more powerful phone someday.... make a new Mini, Apple!)
