I followed some advice from Claude on a legal document so I could handle things on my own... and ended up hand-delivering a notice to the wrong address. Claude didn't actually read the document I provided; it did a web search instead. So yeah, double-check everything, because you can't hold an AI accountable.
Working extensively with AI (both Gemini and ChatGPT, both paid), I can tell you both AIs are constantly wrong about technical concepts. You ask both the same question and get totally different answers. And most of the time these are clear yes/no questions - so there's little room for interpretation.
In essence it's like going to two different doctors and getting three different opinions.
As for ChatGPT 5.2 - at least this morning, I had to disable the thinking model because it kept thinking forever without giving an answer.
ChatGPT 6 is most definitely already being used internally. We should expect that within a few months.
What is even more interesting is that I can’t try Gemini 3 Pro without subscribing. I don’t need a proof of concept that AI works; I need proof that the current iteration can do the work I need done. If someone knows where I can sample 3 Pro, I would appreciate that information. My attempts at using Gemini over the past six months or so have not met my standards. Most depressing is that I actually pay monthly for ChatGPT. I’m not trying to mooch off some freebie.
It's interesting that they didn't make comparisons to Gemini 3 Pro or whatever the current Claude Opus is in the benchmarks they provide.
Is there an actual need for you to use LLM chatbots for this work, or are you volunteering for pain for some reason?
I’m sure I’m going to piss a lot of people off, but 5.2 is about the worst version I’ve used. I’m contemplating going back to 5.1. I don’t use it to draw pretty pictures; I use it to help me do deep research and make sense of things. This morning I asked it a simple question about US Marines being dispatched to Los Angeles. 5.2 told me I was wrong, that the Marines had not been sent to Los Angeles, and that the last time they had been sent there was in 1992 during a natural disaster. This, of course, was not even close to true. I have found all sorts of problems working with it. It has trouble dealing with metaphors in context, and it inserts quantification in discussions where no qualification was alluded to. I am really having to adjust my rules to get it to function in an acceptable manner. I’m using it for a couple of significant projects, and 5.2 is lacking.