MacRumors

macrumors bot
Original poster


Anthropic today announced the launch of its latest AI model, Claude Opus 4.8. Anthropic claims the model is a "more effective collaborator" with improvements in agentic coding, multidisciplinary reasoning, agentic computer use, knowledge work, and agentic financial analysis.

Testers have found Opus 4.8 to be "more reliable and sharper in its judgement" when doing agentic tasks, and the model also made gains in honesty.
Early testers report that Opus 4.8 is more likely to flag uncertainties about its work and less likely to make unsupported claims. This is borne out in our evaluations, which show that Opus 4.8 is around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked.
Alignment assessments suggest the model hits new highs on measures of prosocial traits like supporting user autonomy and acting in the user's best interest. Rates of misaligned behavior like deception are lower than Opus 4.7 and similar to the Claude Mythos Preview.

Anthropic benchmarks indicate Opus 4.8 scored a 69.2% on SWE-Bench Pro, outperforming GPT–5.5 and Gemini 3.1 Pro on the test and several other benchmarks, though GPT–5.5 leads on the terminal-coding benchmark.

Opus 4.8's fast mode also runs at 2.5x the speed, and it is now three times cheaper than prior models.

Along with Opus 4.8, Anthropic is adding new features to its product lineup.
  • Dynamic workflows (research preview) - Claude can complete bigger tasks in Claude Code. It is able to plan work and run hundreds of parallel subagents in a single session. It is able to complete codebase-scale migrations across hundreds of thousands of lines of code. The feature is available for Claude Code for Enterprise, Team, and Max plans.
  • Effort control - In Claude.ai and Cowork, users can choose how much effort Claude puts into a response. With a lower setting, Claude will respond faster and use up rate limits more slowly. Opus 4.8 defaults to high effort, which Anthropic says is the best balance of quality and user experience.
  • Messages API - The Messages API accepts system entries inside the messages array, so developers can update Claude's instructions mid-task.
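As a rough illustration of the Messages API change described above, the sketch below builds a request body in which a system entry appears partway through the messages array. The field names, the `"system"` role value, the `build_payload` helper, and the model identifier are assumptions made for illustration, not confirmed details of Anthropic's API. Nothing is sent over the network.

```python
import json

# Roles this sketch accepts; "system" inside the array is the assumed new behavior.
ALLOWED_ROLES = {"user", "assistant", "system"}


def build_payload(model, turns, max_tokens=1024):
    """Assemble a request body from (role, text) pairs (hypothetical helper)."""
    messages = []
    for role, text in turns:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role}")
        messages.append({"role": role, "content": text})
    return {"model": model, "max_tokens": max_tokens, "messages": messages}


payload = build_payload(
    "example-model-id",  # placeholder, not a real model name
    [
        ("user", "Refactor the billing module."),
        ("assistant", "Starting with the invoice parser."),
        # Mid-task instruction update, placed inside the messages array.
        ("system", "From now on, do not modify files under tests/."),
        ("user", "Continue."),
    ],
)

print(json.dumps(payload, indent=2))
```

The point of the sketch is only the ordering: the updated instruction sits between earlier and later turns rather than in a single top-level field set once at the start.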
Claude Opus 4.8 is available everywhere today. Pricing for regular use has not changed compared to Opus 4.7.

Anthropic is working on models that have the same capabilities as Opus 4.8 at a lower cost, and a new class of model that's even more intelligent than Opus. Anthropic says it has been developing safeguards for the Claude Mythos model it is testing with a small number of organizations, and it expects to be able to bring Mythos-class models to all customers "in the coming weeks."

Article Link: Anthropic Launches Claude Opus 4.8 With Gains in Coding and Honesty
 
I find Claude exceptionally useful, and the more humane LLMs appear, the easier it is to treat them like some random person with an opinion on everything, rather than an actual superintelligence. Like asking my uncle about anything in the 80s. He sure made a convincing case, but…
 
I find Claude exceptionally useful, and the more humane LLMs appear, the easier it is to treat them like some random person with an opinion on everything, rather than an actual superintelligence. Like asking my uncle about anything in the 80s. He sure made a convincing case, but…

An AI model is not a person. It is a tool. Treating it like a person is not really the way we should be going.

The planet is not underpopulated. There are more than enough real people that no one should need ingratiating simulations of people.
 
They do have actual intelligence. What they don’t have is continual (online) learning. That will be solved at some point, but they have to find an energy-efficient way to do it.
There is a whole area of philosophy devoted to this discussion. Some people start with the premise that intelligence requires a biological substrate. Others acknowledge intelligence might arise from silicon or other materials going into computers -- however, some define intelligence in a manner that does not allow current models to be defined as intelligent. Others argue that it currently is intelligent: https://www.nature.com/articles/d41586-026-00285-6

It's an interesting area of discussion.
 
There are social and sometimes legal repercussions for people who lie. Are there any for AI output?
Not until it's legally recognized as sentient, sapient, and held to a standard of intent (suggesting independent action on the level of a human). Right now, society handles AI 'lies' by proxy -- regulating the developers through emerging AI safety laws and penalizing misinformation via platform terms of service, rather than punishing the model itself. I don't think anyone knows if various models will ever be legally given 'personhood'.
 
I find Claude exceptionally useful, and the more humane LLMs appear, the easier it is to treat them like some random person with an opinion on everything, rather than an actual superintelligence. Like asking my uncle about anything in the 80s. He sure made a convincing case, but…

And this is what’s driving me crazy these days. People asking an LLM the wrong question and then talking to me like they know what they’re talking about in a subject I have real life experience in.
 
I'm more and more realizing that the reason many people are treating AI like they would a person (going so far as to ascribe consciousness to it) is because they basically see other people as intelligent machines. Sad state of affairs.
Well... so many people, kids included, trust the AI tools so much that they ask them questions about human behaviour.
People also ask them for management advice.
So for some.... AI is their best friend.
Once a nice and friendly and perhaps even a cuddly AI robot is able to have a beer with you... well, I am sure that AI robot can tell many great stories! And tell so many jokes!
It won't pay for the beer though...
😄
 
I'm more and more realizing that the reason many people are treating AI like they would a person (going so far as to ascribe consciousness to it) is because they basically see other people as intelligent machines. Sad state of affairs.

Yep, this sinking state we're getting into where a lot of people actively choose an imitation of a relationship, enabled by an AI, instead of bothering to have an actual relationship with an actual human being.

It's bad enough to think of paying a human to pretend to be your friend; it's somehow much worse to pay for an AI to give a simulation of that pretense.

That you can have an interface that understands natural language is a good thing. Having it reply in natural language to give the illusion of "humanity" is not.
 
Well... so many people, kids included, trust the AI tools so much that they ask them questions about human behaviour.
People also ask them for management advice.
So for some.... AI is their best friend.
Once a nice and friendly and perhaps even a cuddly AI robot is able to have a beer with you... well, I am sure that AI robot can tell many great stories! And tell so many jokes!
It won't pay for the beer though...
😄
But it's not a friend.
 
When one of the most popular AI companies has to brag about improving its AI tool's honesty, you know we are all doomed. People already believe everything AI tells them, and probably 70% or more of it is pure garbage information on top of bias and lies. And without any government regulation (which there seem to be no plans to implement in the US anyway), it's the Wild West out there. I've seen so much AI slop online lately it's sickening. It's getting worse every day, and soon we won't even be able to differentiate AI slop from reality. Scary times, folks...
 
So, you're telling me Claude has been lying to me all this time? 😭
Yes. Claude sometimes lies, which is why I don't use Anthropic models in my application. For coding I sometimes ask Opus to do a test.

Then I ask it about the test and it tells me that it cheated or hard coded some of the data.
 