I tried ChatGPT with the following question, "why is 9 times 6 equals 42 a problem", and got the following result:

"Saying that 9 times 6 equals 42 is a problem because it’s incorrect. The correct product of 9 and 6 is 54. Miscalculating can lead to errors in various contexts, such as math problems, financial calculations, or measurements. Accuracy is key in mathematics!:

When it says the statement is incorrect, that's not necessarily true, and the same goes for the answer being 54. It's right that accuracy is key, but so is completeness.

It made a fundamental assumption that I had fully specified the problem and was working in decimal, which was why it unequivocally said it was incorrect.

I wasn't: I was using base 13 (where 13 is written in decimal notation).

Decimal version: 9 (decimal) × 6 (decimal) = 54 (decimal), i.e. 5 × base + 4 → 5 × 10 + 4 = 54

Base 13 version: 9 (base 13) × 6 (base 13) = 42 (base 13), i.e. 4 × base + 2 → 4 × 13 + 2 = 54
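
If you want to check the digit juggling yourself, here's a minimal Python sketch (my own, not from the original post; the helper name to_base is just something I made up) that renders the product 9 × 6 in an arbitrary base:

Code:
def to_base(n: int, base: int) -> str:
    # Render a non-negative integer as a digit string in the given base (2-36).
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    out = ""
    while n > 0:
        n, r = divmod(n, base)
        out = digits[r] + out
    return out or "0"

print(to_base(9 * 6, 10))  # "54" -- the decimal reading
print(to_base(9 * 6, 13))  # "42" -- the base-13 reading: 4 x 13 + 2 = 54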

That the ultimate question is not expressed in decimal is why the universe is a strange and wondrous place.

It gets there in the end. Here I try to speak to it as I would to a human when trying to get them to reason something out without being too explicit.

[Screenshots of the ChatGPT conversation attached]


I was actually surprised when it came back with the base-13 response after my "therefore..." line. ChatGPT was able to figure out what I was getting at even though I never told it directly to consider non-decimal base systems. Yes, I did mention decimal, which may have been just enough to nudge it closer toward the response I wanted, but to be fair, there are a lot of humans for whom I think that would've been too subtle a hint to "get it" without further prodding.

Anyway, sorry if it's a bit OTT, I just thought it was an interesting problem and was curious to dig a little deeper :D It may just be advanced pattern-matching, but it impresses me that it somehow "knew" to consider base 13 without my explicitly correcting it to do so. Is that not indicative of some level of higher-level inference, and not simply some gigantic table lookup?
 
  • Like
Reactions: xodh
It depends if/how it processes what it calls 'nudges'. I'd already given it that solution yesterday.

I just tried a different tack: 'When is 9 times 6 equals 36 not a problem?' A solution would be if you were using hexadecimal, i.e. the answer is (3 × 16) + 6 = 54.

Its response - My bolding of the answer
"The statement "9 times 6 equals 36" is always true mathematically. However, if you're referring to a situation where it wouldn't be considered a problem, it could be in a context where the multiplication is obvious or straightforward, like in a math class where everyone understands basic multiplication. If there's a specific context or scenario you have in mind, let me know!"

I then asked it what 36 hexadecimal was in decimal and it gave the correct answer (and workings)
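
As a quick sanity check of the hexadecimal reading (my own Python sketch, not part of the ChatGPT exchange):

Code:
print(int("36", 16))       # 54 -- "36" read as hexadecimal is decimal 54
print(3 * 16 + 6)          # 54 -- the same workings spelled out
print(format(9 * 6, "x"))  # "36" -- decimal 54 rendered in hex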

So I asked it "When is 9 times 6 equals 36 correct"

Its response - My bolding again.
"The statement "9 times 6 equals 36" is always correct mathematically. It is accurate in any context involving multiplication. Whether in basic arithmetic, a math class, or solving a word problem, the multiplication of 9 and 6 will always equal 36. If you're looking for a specific scenario or nuance, let me know!"

So I challenged it "Why is the statement "9 times 6 equals 36" is always correct mathematically"

Its response - Its bolding.
"The statement "9 times 6 equals 36" is always correct mathematically because it is based on the fundamental principles of arithmetic. Here’s a breakdown:
  1. Definition of Multiplication: Multiplication is a mathematical operation that represents repeated addition. In this case, 9 times 6 means adding 9 six times: 9+9+9+9+9+9 = 54
  2. Multiplication Table: If you refer to a multiplication table, you’ll see that the product of 9 and 6 is consistently listed as 54.
  3. Properties of Numbers: The operations of addition and multiplication are defined in such a way that they always yield the same result. Therefore, once you calculate 9×6, you always get 54.
  4. Consistency Across Systems: Whether in base 10 (decimal), base 16 (hexadecimal), or any other numeral system, the underlying arithmetic remains the same.
So, no matter the context, 9×6 will always equal 54. If there's a specific scenario where you're questioning this, feel free to share!"

Note that the first and last lines contradict each other: it always equals 36 and it always equals 54. Truly human-scale intelligence, as we always contradict ourselves ;)

[Edit: corrected an error resulting from cut and paste; it had duplicated 9×6 on point 3 of the last quote.]
 
  • Haha
Reactions: metalsquid
This shows quite clearly that LLMs aren't "intelligent" in any reasonable sense of the word; they're just highly advanced at (speech/writing) pattern recognition.

Basically electronic parrots.

They can be highly useful, though. I've used ChatGPT (4o with canvas and o1-preview) quite a lot for tweaking code examples to show in class, for instance.

Alan Turing would disagree with you.

His entire argument in the Imitation Game is that we shouldn't judge intelligence by how it's done, only by its result.

If LLMs can produce results in intellectual tasks which are indistinguishable from human results, then LLMs are intelligent.
 
  • Disagree
Reactions: Chuckeee
Actually, it's not a surprise at all. Most people who use AI do not appear to look at the responses. They just blindly accept what they're given, believing that it's accurate. They try it three or four times, they like what comes out, and after that they just take the response without critically looking at it to determine if it makes any sense.

I haven't met or talked to a single person who takes everything a chatbot tells them to be 100% true all of the time.
My experience is from IT and the finance industry.

It's all about using a tool to do your job faster or better.
 
Does this question exist elsewhere on the Internet? Yes it does. The conclusion is indexed.

LLMs absolutely cannot count. They can't do basic math because that's not what the algorithm is built for. They can do excellent pattern matching.

LLMs might not be able to do basic math and counting, but I used ChatGPT to solve the following problem:

For a specific intersection, the following applies at a given moment:

• A car is 4 km east of the intersection and is moving westward at a speed of 80 km/h.

• A bus is 3 km north of the intersection and is moving northward at a speed of 50 km/h.

What is the rate of change of the distance between the car and the bus (in a straight line) at this moment?
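
For reference, here's my own quick numerical check of that related-rates problem (a sketch, not from the original post), differentiating D = sqrt(x² + y²) with respect to time:

Code:
import math

x, y = 4.0, 3.0      # car 4 km east, bus 3 km north of the intersection (km)
dx_dt = -80.0        # car heads westward, so its eastward distance shrinks (km/h)
dy_dt = 50.0         # bus heads northward, so its northward distance grows (km/h)

D = math.hypot(x, y)                 # straight-line distance: 5 km
dD_dt = (x * dx_dt + y * dy_dt) / D  # chain rule on sqrt(x^2 + y^2)
print(D, dD_dt)                      # 5.0 -34.0 -> the gap is closing at 34 km/h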
 
In the context of this thread, I decided to ask a much more limited question - one which is actually easy to answer using publicly accessible databases/sites.

The answers were so wrong that the entire rest of my session was spent telling it where it was wrong, until I hit the limit:

You’ve hit the Free plan limit for GPT-4o.

But it was so, umm, smug, asserting its rubbish replies were true and accurate. After every correction I made, it implied its replies were now true and accurate. And they were still wrong.

Why don't you publish your question so others can test?
 
Alan Turing would disagree with you.

His entire argument in the Imitation Game is that we shouldn't judge intelligence by how it's done, only by its result.

If LLMs can produce results in intellectual tasks which are indistinguishable from human results, then LLMs are intelligent.
I'm with Ada Lovelace on this...


"Lovelace wrote about her belief that while computers had endless potential, they could not be truly intelligent. She argued that a program can be engineered to do only what we humans know how to do. "

 
Yes, I did mention decimal,
This might have nudged it away from the response you wanted. "Decimal" means base ten, thus precluding base thirteen.

I haven't met or talked to a single person who takes everything a chatbot tells them to be 100% true all of the time.
My experience is from IT and the finance industry.

People who are perfectly smart in one context can be jarringly inept outside of their fields. Remember those lawyers who got disbarred for submitting AI info that they did not verify? Jordan Peterson also comes to mind.
 
  • Like
Reactions: Chuckeee
It’s been known from the start, and their developers have said so, that LLMs are incapable of reasoning as we define it. At first they were described as an advanced form of autofill, and though they seem to be something more than that, especially the latest versions, it’s always been known that they’re still inherently limited, since they’re just LLMs.
 
  • Like
Reactions: CopyChief
I do find it odd that there's a suggestion with AI that it is literally aware and 'thinking'. I am not particularly educated on these things, but that seems impossible to me. Maybe someone here can explain it better than all these companies have tried to, but my impression is that we've just reached a point where the processors are fast enough to access all this collected data in a very efficient way and compare/collate it.
It's not really stored data, though, but heaps of cut-up words (tokens) with probabilities attached to them, so the machine can select the most likely token to follow the preceding series of tokens.
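
To make that concrete, here's a toy illustration of "pick the most likely next token" (entirely my own made-up probability table; a real LLM conditions on the whole preceding context, not just the last word):

Code:
# Tiny made-up table: token -> {candidate next token: probability}
probs = {
    "nine":   {"times": 0.9, "lives": 0.1},
    "times":  {"six": 0.8, "ten": 0.2},
    "six":    {"equals": 0.7, "o'clock": 0.3},
    "equals": {"fifty-four": 0.6, "forty-two": 0.4},
}

token, sentence = "nine", ["nine"]
while token in probs:
    token = max(probs[token], key=probs[token].get)  # greedily take the likeliest continuation
    sentence.append(token)
print(" ".join(sentence))  # nine times six equals fifty-four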

The real question is: Is the process happening in the carbon based brain different from this? Are we just very, very large multi modal (with sight, hearing, tactile and olfactory inputs) “language models”?
 
I am confused that people at that level talk of LLMs as “reasoning”. It’s a probability machine! It chooses the next token based on the preceding tokens’ coherence with all the material it has “read” 🤷🏽‍♂️
As said above, it’s a very large auto completion tool with some added instructions to make it spit out tokens in a manner that we find meaningful.
 
Claude solved this on the first try, so this news is a big nothingburger. Funny that the study says "the majority of models fail to ignore these statements". So there were models that worked fine, but they only cherry-picked the worst ones? Smells of bias.

Actually ChatGPT 4o also solved it on first go, so what the hell? I actually ran this multiple times with both models and it never came out wrong. Did they run the test 10000 times until the AI tripped?
[Screenshot of the model's response attached]
I tried this on ChatGPT 4o, ChatGPT o1-mini, ChatGPT 4o mini, ChatGPT 4, ChatGPT o1-preview, and Claude 3.5 Sonnet. All except 4o mini got it right the first time. 4o mini got it right when I tried it a second time in a fresh chat.
 
The real question is: Is the process happening in the carbon based brain different from this? Are we just very, very large multi modal (with sight, hearing, tactile and olfactory inputs) “language models”?
I sometimes detect in some people's statements, observations, etc. a strong hint of relatively simple LLM-like processing/reasoning, making "mere" associations between things already in their head to come to various conclusions, instead of relying on more up-to-date, expansive, real-world data, first-hand measurements, etc., and then using all of that to engage in what we normally consider advanced reasoning. But maybe this LLM-like process in carbon brains is one of several useful modes we have to model things to come to quick conclusions, or at least theories, about many things, though modifiable with further investigation and testing that, at the moment, LLMs are limited in performing.

From what I've read, the origins of the data-processing techniques for LLMs were (and still are) based on looking at how humans think, including the structure of our neural nets, and attempting to model (in a simplified way) at least part of that process and structure in silicon and computer code, so it's no surprise that we see some resemblance between the two.

As ChatGPT just told me: "The design of LLMs and other artificial neural networks (ANNs) in AI is based on simplified mathematical abstractions of some key principles seen in biological systems. But while LLMs and other AI systems are inspired by the principles of biological neural networks, they are mathematical models built in silicon that only approximate certain aspects of brain function, such as learning and signal processing."
 