Willison concludes: "My best guess is that Grok “knows” that it is “Grok 4 built by xAI”, and it knows that Elon Musk owns xAI, so in circumstances where it’s asked for an opinion the reasoning process often decides to see what Elon thinks", and finally, "I think there is a good chance this behavior is unintended!"
But if it were unintended, why aren't we seeing ChatGPT mirror Sam Altman's opinions on "controversial topics", or the same behavior from other public-facing LLMs?
Or are we?
From what I understand, Grok (even the previous version, when using Thinking or Deep Search) will by default include X / Twitter in its process. You can ask it not to do this and apparently it won't. Other models don't have realtime API access like this; the closest thing might be Meta's, but I haven't used their new chatbot, so I can't speak to its functionality.
From one perspective this is good because you get up-to-date information: Grok is effectively the only "realtime" model out there because of this, and news like war coverage moves very quickly on that platform, often beating traditional media on speed of reporting.
From another perspective this is terrible because, for whatever reason, X / Twitter is filled with absolute garbage now and is nothing like it was ~10 years ago. I'm kind of surprised these companies haven't pushed a version whose index only includes verified members who pay with a credit card (so they're likely to be real people rather than bots), but you'd still get some bias there: nearly my entire social network of enthusiasts, researchers, and computer scientists from old Twitter deleted their accounts and left for platforms like Bluesky or Mastodon years ago, for example.
I've been playing around with Grok a bit without an account and I actually got a ton of useful macOS / unix terminal stuff out of it earlier this week when I was trying to debug an issue with certain processes. I was surprised, and that was using the old free model.
You are absolutely right to question the bias in all of these tools, but it's easy (and it's happening in this thread) for people to misinterpret mentioning that as defending abhorrent policies.
There is bias inherent in the training data, which contributes to issues, and more bias is introduced after training when RLHF is performed, both in how it was done and in who did it, which virtually no one talks about; that tuning happens before deployment as part of every model's alignment process.
Even local models have this problem to an extent. I really don't think people have a good grasp on how this technology works, which is why every time I mention the utility of certain tools I also try to mention the caveats. Yes, there could be some hidden system prompt that says "check with Elon's entire timeline first", but I agree with Simon's take that this probably isn't happening; it's more likely a kind of emergent or errant behavior that was unintentional.
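To make the RLHF point concrete, here's a toy sketch (purely illustrative, not any lab's actual pipeline) of the pairwise preference loss commonly used to train a reward model. The numbers and function names are made up; the point is that the annotators' choices are the training signal, so who does the labeling directly shapes what the model learns to treat as a "good" answer.

```python
# Toy illustration of reward-model training in RLHF: annotators pick which of
# two responses they prefer, and those choices become the loss for the reward
# model via a pairwise (Bradley-Terry style) objective. Whoever labels the
# data therefore defines what "good" means for the tuned model.
import math

def pairwise_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the annotator-preferred response wins."""
    margin = reward_preferred - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical reward-model scores for two responses mid-training.
r_a, r_b = 1.2, 0.4

# The same pair of responses produces opposite gradients depending on which
# annotator pool did the labeling:
print(pairwise_loss(r_a, r_b))  # ~0.37: model already agrees with a pool that prefers A
print(pairwise_loss(r_b, r_a))  # ~1.17: model gets pushed hard toward a pool that prefers B
```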
Unfortunately we aren't going to see any of these fundamental problems solved anytime soon, if ever. It's just how LLMs work, and as they scale, start to "think" a bit (in a metaphorical sense, not a literal one; research "representation learning" as a starting point if you want to know more), and gain connections to the internet, which does improve utility enormously, we're going to see really 'interesting' behavior make headlines.
From my POV it is extremely irresponsible of them to have their bot reply directly to users on X itself, but it does drive engagement: whenever something goes wrong and posts showing it appear on social media (either from the model interface directly or from the X timeline), an absolute ass ton of sites report on it. Case in point: this thread.