What I find a bit baffling is why, instead of risking their billions on AGI, these companies aren't working harder on more focused LLM-supported products and workflows. For example, rather than trying to build another version of a general model that uses Chain-of-Thought (CoT) behind the scenes (OpenAI's o1), why not save a buh-zillion dollars and build a UX workflow that guides the user toward CoT-based interactions using the current/existing model? Taking this approach could give them a range of useful, reliable, revenue-generating products without incurring huge additional costs. I suppose it's FOMO, or hubris, or both...
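For what it's worth, a guided-CoT product like that wouldn't need a new model at all. A rough sketch of the idea, where call_model() is a made-up stand-in for whatever existing endpoint the product already uses:

```python
# Toy sketch of a "guided chain-of-thought" workflow layered on an existing model.
# call_model() is a hypothetical placeholder, not a real API; nothing here
# requires training a new reasoning model.

def call_model(prompt: str) -> str:
    # Placeholder so the sketch runs end to end; replace with a real API call.
    return f"[model reply to: {prompt[:60]}...]"

def guided_cot_answer(question: str) -> str:
    # 1. Ask the existing model to break the question into explicit sub-steps.
    plan = call_model(
        "List the separate steps needed to answer this question, "
        "but do not answer it yet.\n\nQuestion: " + question
    )

    # 2. In a real product, the UI would show this plan to the user here and
    #    let them confirm or edit it before anything else happens.

    # 3. Ask the model to work through the agreed steps, one at a time,
    #    before committing to a final answer.
    return call_model(
        "Work through these steps in order, showing the reasoning for each, "
        "then give a final answer.\n\n"
        f"Question: {question}\n\nSteps:\n{plan}"
    )

print(guided_cot_answer("Which of these two contracts has the earlier termination date?"))
```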
One of a zillion reasons most of us see AI as "the next crypto": just being pushed on everyone because SV and the VCs are incredibly desperate for a "next big thing".
Apple's justification for being behind? "Look! See! AI isn't ready yet!"
I've tried ChatGPT for a few purposes. And, as a super aide-mémoire, it can be quite useful. But it comes up with utter garbage a lot of the time.

In the context of this thread, I decided to ask a much more limited question - one which is actually easy to answer using publicly accessible databases/sites.
I wanted to know about a specific medicine. It came back with companies that no longer exist. One company changed its name seven years ago. And the product they supposedly make was discontinued around twenty years ago. It missed some huge names and made big mistakes.
And a follow-up question came back with a couple of references (when I pushed for them). But the articles referenced do not exist - even if you go to the journal's own website, through PubMed, etc., etc. The authors exist. The paper referenced doesn't.
One of the reasons I decided to try it is that working through all the websites is so tedious, and so many have paywalls and other barriers to access, that it can take days to check things manually. I thought it might help with some groundwork.
But I've ended up being quite sure I cannot take a single ChatGPT reply as being "true" or correct or accurate. It might tell me about something but it can take much effort to check...
Reminds me of Brandolini's law (the amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it):

Brandolini's law - Wikipedia (en.m.wikipedia.org)
Claude solved this on the first try, so this news is a big nothingburger. Funny that the study says "the majority of models fail to ignore these statements". So there were models that worked fine, but they only cherry-picked the worst ones? Smells of bias.
In the very early days of computers, there was a program called "Eliza" that pretended to be a psychologist. It was not AI even remotely, but if you answered the questions, it made it seem that way, because it would parrot your answers back to you. "I'm angry." "Well, how does it make you feel that you're angry?"

The writer seemed convinced that the AI was obsessing over him and actually asking him to leave his wife. The actual transcript, for anyone who's seen this stuff back through the decades, showed the AI program bouncing off programmed parameters and being pushed by the writer into shallow territory where it lacked sufficient data to create logical interactions. The writer, and most people reading it, however, thought the AI was being borderline sentient.
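For anyone who never saw it running, the whole trick was roughly this kind of pattern-matching and pronoun reflection, nothing more. A toy sketch, not Weizenbaum's actual script:

```python
import re

# Toy ELIZA-style responder: match a pattern, flip the pronouns, and hand the
# user's own words back as a question. No understanding anywhere.

REFLECTIONS = {"i": "you", "i'm": "you're", "my": "your", "me": "you", "am": "are"}

def reflect(text: str) -> str:
    # Swap first-person words for second-person ones so the echo sounds like a reply.
    return " ".join(REFLECTIONS.get(word, word) for word in text.lower().split())

def respond(statement: str) -> str:
    lowered = statement.strip().lower()
    match = re.match(r"i'?m (.*)", lowered)
    if match:
        return f"Why do you feel that you're {reflect(match.group(1))}?"
    match = re.match(r"i (.*)", lowered)
    if match:
        return f"Why do you say you {reflect(match.group(1))}?"
    return "Please tell me more."

print(respond("I'm angry"))       # Why do you feel that you're angry?
print(respond("I hate my job"))   # Why do you say you hate your job?
```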
But it was so, umm, smug, asserting its rubbish replies were true and accurate. After every correction I made, it implied its replies were now true and accurate. And they were still wrong.
EXACTLY. They don't actually know anything. They don't understand it the way we do. They can't handle nuance or feeling, because they don't feel. It's all a computation. Any human qualities like confidence, well, that's just us humans projecting those feelings onto them, or reading between the lines. With AI and LLMs, there's nothing between the lines.

I think this is one of the key aspects to keep in mind when dealing with current AIs: they tend to always give an answer with what can be perceived as "confidence", whereas in reality they have no actual clue about what they are replying to.
And I can certainly believe that someone without my bolshiness, and my confidence in my knowledge of this tiny area, could easily have believed it.
Why has no one else reported this? It took the “newcomer” Apple to figure it out and to tell the truth?
Funny how 'the majority of models fail to ignore these statements' is just one observation out of hundreds in the study. Yet you cherry-picked that one to attribute bad intentions to Apple. Smells of bias.
A system that has a high level of intelligence will
a) know what it doesn't know
b) not attempt to answer a question in those domains.
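In code terms, that is basically calibrated abstention: only answer when your own confidence clears a threshold, otherwise decline. A minimal sketch, where lookup_answer() and its confidence score are made up purely for illustration:

```python
from typing import Optional, Tuple

# Minimal sketch of "know what you don't know": answer only when confidence
# clears a threshold, otherwise decline. lookup_answer() is a made-up stand-in
# for whatever the system actually uses to produce an answer and a score.

def lookup_answer(question: str) -> Tuple[Optional[str], float]:
    # Stand-in knowledge base: (answer, confidence in [0, 1]).
    known = {"capital of france": ("Paris", 0.99)}
    return known.get(question.lower(), (None, 0.0))

def answer(question: str, threshold: float = 0.8) -> str:
    candidate, confidence = lookup_answer(question)
    if candidate is None or confidence < threshold:
        # b) do not attempt an answer outside the domain it actually knows.
        return "I don't know."
    # a) it only answers where its own confidence says it knows.
    return candidate

print(answer("capital of france"))           # Paris
print(answer("efficacy of drug X in 2031"))  # I don't know.
```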