The general model design is fundamentally sound, but to get the level of logical reasoning a human being is capable of, you’d need much higher-resolution pattern recognition, preferably based on much smaller portions of quantized data. That’s generally the distinction… not accounting for potential quantum computation that may occur in carbon nanotubules in the brain. That we simply cannot do with anything less than quantum computing on silicon. Explaining that would require a bit more text. LOL
 
I tried ChatGPT with the following question "why is 9 time 6 equals 42 a problem" and got the following result:

"Saying that 9 times 6 equals 42 is a problem because it’s incorrect. The correct product of 9 and 6 is 54. Miscalculating can lead to errors in various contexts, such as math problems, financial calculations, or measurements. Accuracy is key in mathematics!"

When it says it's incorrect, that's not necessarily true; the same goes for the answer being 54. It's correct when it says accuracy is key, but so is completeness.

It made a fundamental assumption that I had fully specified the problem and was working in decimal, which was why it unequivocally said it was incorrect.

I wasn't. I was using base 13 (where 13 is in decimal notation).

Decimal version: 9 (decimal) × 6 (decimal) = 54 (decimal), i.e. 5 × base + 4 -> 5 × 10 + 4

Base 13 version: 9 (base 13) × 6 (base 13) = 42 (base 13), i.e. 4 × base + 2 -> 4 × 13 + 2
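If you want to sanity-check that in code, here's a quick Python sketch (nothing in it beyond the numbers already given above):

```python
# 9 x 6 worked out in decimal, then rendered as a base-13 numeral
product = 9 * 6                   # 54 in decimal
base = 13
print(divmod(product, base))      # (4, 2) -> the digits "4" and "2" in base 13

# And reading "42" as a base-13 numeral gets you back to 54 decimal
print(int("42", base))            # 54
```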

That the ultimate question is not expressed in decimal is why the universe is a strange and wondrous place.
Some assumptions have to be made. There is a duty on the questioner.

Specifics, Bob.

 
Turns out emulating the human brain is more complicated than originally thought. Who’d’ve thunk it.
 
One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.
If you take the time to give me details about the size of some kiwis, I might also discard them from the total. And the reason for this is that I’m reasoning!

I’d be really curious to see what the model would reply if it had been asked why it subtracted them from the total.
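For what it's worth, the arithmetic the benchmark expects is just a sum. A rough Python sketch with made-up daily counts (the exact numbers in the paper may differ):

```python
# Hypothetical daily kiwi counts, purely for illustration
friday, saturday, sunday = 44, 58, 88

# "Five of them were a bit smaller than average" only describes some kiwis;
# it doesn't remove any, so the size remark shouldn't touch the total
correct_total = friday + saturday + sunday      # 190

# The failure mode described in the paper amounts to doing this instead:
wrong_total = correct_total - 5                 # 185

print(correct_total, wrong_total)
```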
 
Claude solved this on first try so this news is a big nothing burger. Funny that the study says "the majority of models fail to ignore these statements". So there were models that worked fine but they only cherry-picked the worst ones? Smells of bias.

Actually ChatGPT 4o also solved it on first go, so what the hell? I actually ran this multiple times with both models and it never came out wrong. Did they run the test 10000 times until the AI tripped?
[screenshot attached]
Now ask it "are you sure about this answer?" and it will likely change its response. That's what it often does for me.
 
In the very early days of computers, there was a program called "Eliza" that pretended to be a psychologist. It was not AI even remotely, but if you answered the questions, it made it seem that way because it would parrot your answers back to you. "I'm angry." "Well, how does it make you feel that you're angry?"
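For anyone who never played with it, the whole trick was a handful of pattern-and-template rules; roughly something like this toy Python sketch (not the original code, obviously):

```python
import re

# A couple of ELIZA-style reflection rules: match a pattern, echo it back as a question
rules = [
    (re.compile(r"i'?m (.+)", re.IGNORECASE), "Well, how does it make you feel that you're {}?"),
    (re.compile(r"i feel (.+)", re.IGNORECASE), "Why do you feel {}?"),
]

def eliza_reply(text):
    for pattern, template in rules:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return "Please, go on."   # stock fallback when nothing matches

print(eliza_reply("I'm angry"))   # -> Well, how does it make you feel that you're angry?
```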

AI has kicked this up to the next level, but it's still literally just delivering programmed responses, even if the responses are based on exponentially more data and "learning."
That's exactly what I was referencing. If you've ever played with that program or one similar and you read the transcript in the NY Times article from a couple of years ago, you'd see the latter as an updated version of the former. The only ghost in the machine is the person who wrote the program that sets the bot's parameters.

The innovations are the sheer size of the source database and the ability to receive queries and respond to them using realistic language. The rest is a search engine and the use of probability calculations to generate a seemingly plausible response to the query.
 
Claude solved this on first try so this news is a big nothing burger. Funny that the study says "the majority of models fail to ignore these statements". So there were models that worked fine but they only cherry-picked the worst ones? Smells of bias.

Actually ChatGPT 4o also solved it on first go, so what the hell? I actually ran this multiple times with both models and it never came out wrong. Did they run the test 10000 times until the AI tripped?
[screenshot attached]
Maybe they DID run the test 10000 times and now it knows the answer?
 
Claude solved this on first try so this news is a big nothing burger. Funny that the study says "the majority of models fail to ignore these statements". So there were models that worked fine but they only cherry-picked the worst ones? Smells of bias.

Actually ChatGPT 4o also solved it on first go, so what the hell? I actually ran this multiple times with both models and it never came out wrong. Did they run the test 10000 times until the AI tripped?
[screenshot attached]

I think you are missing the point here.

The models produced a correct answer through statistical means. They did not solve it.
 
This is not the first time someone noticed something and wrote about it. ChatGPT was outright lying to me so many times it’s bonkers.
Lying to you? No it’s not. That suggests intent. It has no intent other than to respond to your prompt. It might be wrong. It might hallucinate. It might make connections that aren’t there or conflate two concepts with similar words. But it’s not lying. It can’t.
 
If you take the time to give me details about the size of some kiwis, I might also discard them from the total. And the reason for this is that I’m reasoning!

I’d be really curious to see what the model would reply if it had been asked why it subtracted them from the total.
I asked ChatGPT...

"If an AI model removed the 5 smaller kiwis from the total, it might have misinterpreted the phrase "five of them were a bit smaller than average" as a condition to exclude them. The model could have inferred that since the kiwis were described differently, they should be treated separately or excluded, even though the problem doesn’t explicitly state that the smaller kiwis should be removed from the total count."

Exactly what I could have done!
 
I think the best approach to start reasoning about this is with Arthur C. Clarke: "Any sufficiently advanced technology is indistinguishable from magic." That is not adequate for explaining human behaviour, so I pose an alternative quote: "any poorly educated society is incapable of reasoning about the claims that arrive with technology".

Alas, through some irony, the people who believe the claims probably can be replaced by this technology.

What a quagmire! Perhaps we should use it to generate content for education so we can improve society to a point where it doesn't think this is credible magic. Oh no, wait, it fails at that too! This fell out the other day...

[attached image]


The clever people rode NVDA up to the sky, laughing at the investors who are going to pay for their gains later.
 
AI does not need to be self-aware to jack you up royally! So are we missing the point?
 
I tried ChatGPT with the following question "why is 9 time 6 equals 42 a problem" and got the following result:

"Saying that 9 times 6 equals 42 is a problem because it’s incorrect. The correct product of 9 and 6 is 54. Miscalculating can lead to errors in various contexts, such as math problems, financial calculations, or measurements. Accuracy is key in mathematics!"

When it says it's incorrect, that's not necessarily true; the same goes for the answer being 54. It's correct when it says accuracy is key, but so is completeness.

It made a fundamental assumption that I had fully specified the problem and was working in decimal, which was why it unequivocally said it was incorrect.

I wasn't. I was using base 13 (where 13 is in decimal notation).

Decimal version: 9 (decimal) × 6 (decimal) = 54 (decimal), i.e. 5 × base + 4 -> 5 × 10 + 4

Base 13 version: 9 (base 13) × 6 (base 13) = 42 (base 13), i.e. 4 × base + 2 -> 4 × 13 + 2

That the ultimate question is not expressed in decimal is why the universe is a strange and wondrous place.
Obviously, in this case ChatGPT used better reasoning than you. By default, everyone uses base 10. If you want to get the correct answer, you need to use a correct (properly specified) prompt, which you did not.
 
OMG, Apple!!! Research? Study?

I would consider this common knowledge. There is no reasoning in LLMs - the latest ChatGPT o1-preview tries to work in this direction. But an LLM constructs sentences from statistical data (very rough explanation). Neither does an AI/LLM understand you, nor does it think about what you said - it just generates text. Btw, an LLM doesn‘t wake up in the middle of the night, repeating the questions and answers of the day, asking itself whether the given answers were correct.

With the insane amount of training data, LLMs produce surprisingly correct output. But depending on the model, no one guarantees that the generated stuff is correct.

Btw - if you ask a human, the output can be garbage as well.
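To make the "constructs sentences from statistical data" bit concrete, here's a toy Python sketch. The probabilities are invented, and real models work over tokens with far richer context, but the idea of picking the next word from a probability distribution is the same:

```python
import random

# Invented next-word distribution for the context "the cat sat on the"
next_word_probs = {"mat": 0.6, "sofa": 0.25, "keyboard": 0.15}

def sample_next(probs):
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

sentence = ["the", "cat", "sat", "on", "the"]
sentence.append(sample_next(next_word_probs))
print(" ".join(sentence))   # e.g. "the cat sat on the mat"
```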
 
Obviously, in this case ChatGPT used better reasoning than you. By default, everyone uses base 10. If you want to get the correct answer, you need to use a correct (properly specified) prompt, which you did not.

I also asked it 'When does 9x6 = 42' - its reply started "Nine times six equals fifty-four, not forty-two"

Not a good answer... The correct answer would be: in base 13.

It then referenced the Hitchhikers Guide, so it clearly knows much more about literature than arithmetic...
 
Apple is bringing this to your attention to lower expectations for these models, but to also mention the solution. Neural network integration. Guess who has 2 billion devices powered by neural networks? Hint hint, they will have an event talking about it in June of next year and it ain’t Google.
 
I feel like Apple just threw together Apple Intelligence because they needed to say they were competing. iPhones still don’t have the full set of features they were supposedly “designed for Apple Intelligence” to deliver. Marketing at its worst.

The reality is Apple should have been on top of AI for years and should have more advanced models, rather than outsourcing the data to ChatGPT. Anyone who actually believes Apple cares about your privacy can realize they don’t when it relies on third parties for screening your calls and for AI.

Maybe if they hadn’t been spending billions on cars and Vision Pros, they could actually compete with their own solution.
 
If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Just for money, that’s all. AI has been hyped as the next big thing by tech corporations. We’ve been “educated” that without AI we will fall behind. The push is on for everybody to use AI, and it means more profit. Just look at how Apple was vilified and dismissed for not deploying AI immediately alongside every other tech Tom/Dick/Harry who jumped on the bandwagon. The technology industry is always on the hunt for the next big thing to keep the coin rolling in.
 