LLMs like ChatGPT are great at certain tasks, and I now use them exclusively instead of the old method of "google searching" for answers on most things. They can summarize in seconds vast amounts of information that might have taken me hours to put together before. Examples like "what are the different kinds of magnesium supplements and their uses?" or "what does this Windows/Mac error mean and how do I fix it?", or having them write code for certain tasks - they're very good at these kinds of things.

What they can't do (because they aren't true AI), and what we really want them to be able to do, is take all of human knowledge and solve problems that are too complex for a human due to the sheer amount of data and math involved - giving us vast leaps in technology in every area by bringing the theoretical into the real world. Solving problems like curing cancer, developing new kinds of propulsion, creating free energy generation or designing a teleportation device for objects and humans. This type of true AI is still decades off, if it is possible at all.
 
There is undeniably some very clever tech out there - some in its infancy, some pure gimmicks (Apple, I'm looking at you).

'AI' is a marketing term. Nothing more, nothing less.

A lot of stuff that is branded AI is just clever coding with a huge information store to call on (arguably built largely on plagiarism).

'AI' has become a thing largely because it's been heavily pushed, and because of the lack of actual intelligence knocking about these days and the lazy mentality in people today.
 
This is important research. It will help identify areas to improve the functioning of the models.

That said, the models also sound entirely human: “The researchers' analysis of reasoning traces showed inefficient "overthinking" patterns, where models found correct solutions early but wasted computational budget exploring incorrect alternatives.”

That’s true for people as well. Having considerable experience with some of the tests used in the research, I can say that people will also "waste" computational budget and effort, even after having a correct solution.

This isn't to say the LLMs are reasoning, but we also need to be cautious in saying they are not. They might not reason exactly the way most people do, but it's also possible that theirs is simply a different form of reasoning. A dog might reason differently than a person, but that doesn't mean the dog is not reasoning (or a fish or a snail, if you don't like the mammal analogy).
 
While they're questioning AI reasoning models, I've just built a fully functional web app w/ front-end + back-end (which I can also put under a subscription model), running in Docker on a VPS, via ChatGPT, in around 100 prompts.

I don't need AI to turn me into a "zebra" while it achieves singularity, not yet. I only need it to enhance my productivity, which it does 100x, since I don't need a developer to build apps or maintain them for me ($10k-$100k's), nor do I need to spend months learning to code an app in some novel language or with a new library, debugging it and losing too much sleep over it. And this is just one use case.

Apple is boring and they're lying a lot. Breakthroughs in tech, like those from OpenAI, Anthropic and others, are a great way to level the playing field and break up trillion-dollar companies that often turn into monopolies.

 
Wait until people are saying they don't need your app or subscription model, because... you know... AI.
 
Sour grapes. 'Apple's research team'? Shouldn't they be working to improve things instead of attacking successful companies?

And they spent resources to prove something obvious?

And the timing of this. My goodness.
 
While they're questioning AI reasoning models, I've just built a fully functional web app (which I can also put under a subscription model), running in Docker on a VPS, via ChatGPT, in around 100 prompts.

I don't need AI to turn me into a "zebra" while it achieves singularity, not yet. I only need it to enhance my productivity, which it does 100x, since I don't need a developer to build apps or maintain them for me ($10k-$100k's), nor do I need to spend months learning to code an app in some novel language or with a new library, debugging it and losing too much sleep over it. And this is just one use case.

Apple is boring

This guy gets it. The current state is still incredibly powerful even if it isn’t truly “reasoning”. AI is great at stuff like front end development (which was always mucking around with a bunch of crap languages anyway).
 
This is a very odd move by Apple - "we cannot do LLMs, so let's discard them." You can check reviews of this "Apple research" paper and see how deeply it is flawed. Basically this only proves that Apple is the worst among the others in the AI race and that they do not understand a thing...
 

Good post examining one of the Apple puzzles and DeepSeek's responses.
 


A newly published Apple Machine Learning Research study has challenged the prevailing narrative around AI "reasoning" large language models like OpenAI's o1 and Claude's thinking variants, revealing fundamental limitations that suggest these systems aren't truly reasoning at all.


For the study, rather than using standard math benchmarks that are prone to data contamination, Apple researchers designed controllable puzzle environments including Tower of Hanoi and River Crossing. This allowed a precise analysis of both the final answers and the internal reasoning traces across varying complexity levels, according to the researchers.
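
To make that setup concrete, here is a minimal Python sketch of what a controllable Tower of Hanoi environment can look like (an illustration only, not the researchers' actual evaluation harness): complexity is dialed up by adding disks, and a model's answer is checked as a list of moves.

def initial_state(n):
    """Three pegs; all n disks start on peg 0, largest disk (n) at the bottom."""
    return [list(range(n, 0, -1)), [], []]

def verify_moves(n, moves):
    """Return True if a list of (from_peg, to_peg) moves legally solves the n-disk puzzle."""
    state = initial_state(n)
    for src, dst in moves:
        if not state[src]:
            return False                      # illegal: source peg is empty
        disk = state[src][-1]
        if state[dst] and state[dst][-1] < disk:
            return False                      # illegal: larger disk placed on a smaller one
        state[dst].append(state[src].pop())
    return state[2] == list(range(n, 0, -1))  # solved: all disks stacked on peg 2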

The results are striking, to say the least. All tested reasoning models – including o3-mini, DeepSeek-R1, and Claude 3.7 Sonnet – experienced complete accuracy collapse beyond certain complexity thresholds, and dropped to zero success rates despite having adequate computational resources. Counterintuitively, the models actually reduce their thinking effort as problems become more complex, suggesting fundamental scaling limitations rather than resource constraints.

Perhaps most damning, even when researchers provided complete solution algorithms, the models still failed at the same complexity points. Researchers say this indicates the limitation isn't in problem-solving strategy, but in basic logical step execution.
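
For a sense of scale, the "complete solution algorithm" for Tower of Hanoi is the short textbook recursion below (shown here as an illustration; the exact algorithm prompt used in the study may differ). Executing it is pure bookkeeping, with no search involved.

def solve_hanoi(n, src=0, aux=1, dst=2):
    """Return the optimal move list (2**n - 1 moves) for n disks."""
    if n == 0:
        return []
    return (solve_hanoi(n - 1, src, dst, aux)     # clear the top n-1 disks onto the spare peg
            + [(src, dst)]                        # move the largest disk to the target peg
            + solve_hanoi(n - 1, aux, src, dst))  # restack the n-1 disks on top of it

print(len(solve_hanoi(7)))   # 127 moves; a 10-disk instance already needs 1023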

Models also showed puzzling inconsistencies – succeeding on problems requiring 100+ moves while failing on simpler puzzles needing only 11 moves.

The research highlights three distinct performance regimes: standard models surprisingly outperform reasoning models at low complexity, reasoning models show advantages at medium complexity, and both approaches fail completely at high complexity. The researchers' analysis of reasoning traces showed inefficient "overthinking" patterns, where models found correct solutions early but wasted computational budget exploring incorrect alternatives.

The take-home of Apple's findings is that current "reasoning" models rely on sophisticated pattern matching rather than genuine reasoning capabilities. It suggests that LLMs don't scale reasoning like humans do, overthinking easy problems and thinking less for harder ones.

The timing of the publication is notable, having emerged just days before WWDC 2025, where Apple is expected to limit its focus on AI in favor of new software designs and features, according to Bloomberg.

Article Link: Apple Research Questions AI Reasoning Models Just Days Before WWDC
Amazing that the bosses are convinced that they will replace us all with this.
 
So AI is nothing more than clever programming?

Yea. I remember taking a seminar in pattern recognition many years ago; AI mostly seems like a more powerful version of what we ran on big iron.
This isn’t a surprise to users. However, it doesn’t mean the reasoning models aren’t very helpful. They still generate huge productivity gains. There is just a learning curve to framing the problem and instructing the model in a digestible way - people advanced in their fields can do this while eliminating the need for junior analysts.

I’ve used it to help with several web tools I’m developing. It’s good at suggesting and explaining code. Although at times the code is right but the logic flow is faulty, and a prompt suggesting changes elicits a ‘You’re right …’ response I find humorous. Other times it suggests deprecated functions, probably because they show up more often in its training data and thus appear correct. Google quickly tells me the function is deprecated. AI is good as a support tool but has limitations.
 
Everyone complaining should just read the paper - it isn’t very long - but also read the sources.

I’m glad this is getting traction. LLMs and LRMs have some narrow uses, but they aren’t generalized and especially aren’t suited for large, complex tasks. There is a ton of financial incentive for big tech to make everyone think they are going to get there sooner or later, which is not at all a foregone conclusion.

Their entire point is that the path to a generalized tool is unclear, and good on them for saying so. It doesn’t mean that Apple won’t leverage LLM technology.

Model collapse is real and matters, and it’s not a simple problem of additional training or a larger context window.

Yann is right: we need world models, not more and more infrastructure bolted onto this technology.

Smart people and money are already aware of this, but it will be a long time before upper management and the public understand it, due to the extraordinary marketing success and the implied usefulness and anthropomorphic perception of this current, actually very limited technology.

“Hallucinations” being socialized as the term for confident output with zero ground truth is probably the most genius marketing move of the last 5 years.

We need things like this to help get the industry back to earth and judge the technology on objective merits. It should never have left research labs, but we are where we are and there’s no putting the cat back into the bag.

Both people who think current technology has zero use (which this paper actually explicitly refutes!) and people who think LLMs or LRMs will inevitably scale to truly high complexity are sadly ignorant.
 
It always amazes me to see there are still people claiming LLMs are just a gimmick. No, they are world-changingly efficient at many tasks. ChatGPT has completely changed my workflow and eliminated my need for junior employees and for outsourcing some key elements of my work.

The question is not whether LLMs are better than experts. It’s whether they are better than the average junior employee or middle manager.
 