ChatGPT was outright lying to me so many times it’s bonkers.
How was it lying to you?
Some assumptions have to be made. There is a duty on the questioner.

I tried ChatGPT with the following question, "why is 9 times 6 equals 42 a problem", and got the following result:

"Saying that 9 times 6 equals 42 is a problem because it’s incorrect. The correct product of 9 and 6 is 54. Miscalculating can lead to errors in various contexts, such as math problems, financial calculations, or measurements. Accuracy is key in mathematics!"

When it says it's incorrect, that's not necessarily true; the same goes for the answer being 54. It's correct when it says accuracy is key, but so is completeness.

It made a fundamental assumption that I had fully specified the problem and was working in decimal, which is why it unequivocally said the statement was incorrect.

I wasn't: I was using base 13 (where 13 is written in decimal notation).

Decimal version: 9 (decimal) × 6 (decimal) = 54 (decimal), i.e. 5 × base + 4 → 5 × 10 + 4
Base 13 version: 9 (base 13) × 6 (base 13) = 42 (base 13), i.e. 4 × base + 2 → 4 × 13 + 2

That the ultimate question is not expressed in decimal is why the universe is a strange and wondrous place.
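Anyone who wants to verify the base-13 claim can do it in a few lines of Python (a quick sketch; the helper name to_base is just illustrative, and it assumes every digit stays below 10, which holds here):

```python
def to_base(n: int, base: int) -> str:
    """Render a non-negative integer n in the given base (digits < 10 only)."""
    digits = []
    while True:
        n, r = divmod(n, base)   # peel off the least significant digit
        digits.append(str(r))
        if n == 0:
            break
    return "".join(reversed(digits))

product = 9 * 6                  # one quantity: fifty-four
print(to_base(product, 10))      # "54" -- decimal notation
print(to_base(product, 13))      # "42" -- the same quantity written in base 13
```

Same number both times; only the notation changes, which is the whole point of the base-13 post.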
If you take the time to give me details about the size of some kiwis, I might also discard them from the total. And the reason for this is that I’m reasoning!

One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.
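Reconstructed roughly, the trap looks like this (a minimal sketch; the numbers below are made up for illustration, not the paper's exact figures):

```python
# Hypothetical kiwi problem in the style described above:
# "I picked 40 kiwis on Friday, 50 on Saturday, and 60 on Sunday.
#  Five of Sunday's kiwis were a bit smaller than average.
#  How many kiwis did I pick in total?"
friday, saturday, sunday = 40, 50, 60

correct = friday + saturday + sunday        # 150 -- size is irrelevant to the count
reported = friday + saturday + (sunday - 5) # 145 -- the failure mode the paper describes
print(correct, reported)
```

Small kiwis are still kiwis; subtracting them is exactly the kind of spurious adjustment the study flagged.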
Now ask it "are you sure about this answer?" and it will likely change it's response. That's what it often does for me.Claude solved this on first try so this news is a big nothing burger. Funny that the study says "the majority of models fail to ignore these statements". So there were models that worked fine but they only cherry-picked the worst ones? Smells of bias.
Actually ChatGPT 4o also solved it on first go, so what the hell? I actually ran this multiple times with both models and it never came out wrong. Did they run the test 10000 times until the AI tripped?
That's exactly what I was referencing. If you've ever played with that program or one similar, and you read the transcript in the NY Times article from a couple of years ago, you'd see the latter as an updated version of the former. The only ghost in the machine is the person who wrote the program that sets the bot's parameters.

In the very early days of computers, there was a program called "Eliza" that pretended to be a psychologist. It was not AI even remotely, but if you answered the questions, it made it seem that way, because it would parrot your answers back to you. "I'm angry." "Well, how does it make you feel that you're angry?"

AI has kicked this up to the next level, but it's still literally just delivering programmed responses, even if the responses are based on exponentially more data and "learning."
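For anyone who never played with it, the parroting trick can be sketched in a few lines (a toy approximation, not the actual ELIZA source, which used a much richer script of patterns):

```python
import re

# ELIZA-style reflection: match a feeling, echo it back as a question.
RULES = [
    (re.compile(r"i'?m (.+)", re.IGNORECASE),
     "Well, how does it make you feel that you're {0}?"),
    (re.compile(r"i feel (.+)", re.IGNORECASE),
     "Why do you feel {0}?"),
]

def eliza_reply(text: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return "Please, tell me more."  # fallback when nothing matches

print(eliza_reply("I'm angry"))
# -> Well, how does it make you feel that you're angry?
```

No understanding anywhere: just pattern matching and substitution, which is why the comparison above stings.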
Maybe they DID run the test 10000 times and now it knows the answer?
Lying to you? No, it’s not. That suggests intent. It has no intent other than to respond to your prompt. It might be wrong. It might hallucinate. It might make connections that aren’t there, or conflate two concepts with similar words. But it’s not lying. It can’t.

This is not the first time someone noticed something and wrote about it. ChatGPT was outright lying to me so many times it’s bonkers.
I asked ChatGPT...

I’d be really curious to see what the model would reply if it had been asked why it subtracted them from the total.
I guess that implies something about the true utility of "executive team".
AI does not need to be self-aware to jack you up royally! So are we missing the point?
Obviously, in this case ChatGPT used better reasoning than you. By default, everyone uses base 10. If you want to get a correct answer, you need to use a correct (properly specified) prompt, which you did not.
Just for money, that’s all. AI has been hyped as the next big thing by tech corporations. We’ve been “educated” that without AI we will fall behind. The push is on for everybody to use AI, and it means more profit. Just look at how Apple was vilified and dismissed for not deploying AI immediately with every other tech Tom/Dick/Harry who jumped on the bandwagon. The technology industry is always on the hunt for the next big thing to keep the coin rolling in.

If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?