Why? It's not too hard. I was looking for cheaper insurance with ChatGPT earlier; the catch is that I value two benefits that are rarely included. ChatGPT finished the search in three minutes and gave me a nice table of prices and benefits. Last time, the same search took me an entire day.

Did you check that all the links and prices were real? ChatGPT itself will tell you it isn't reliable; there's literally small text at the bottom of the window telling you to double-check the data.

Yeah, don't reply with 'YES it wAS aLL rEAL'. We know the results can swing wildly.
 
At this point I'm not sure how much involvement Musk has with any of the companies; I doubt he's been at Tesla for months. He appears to be full-time into this government thing, whatever it is. He's on Twitter for HOURS a day. I'm running one small business and I don't have time for that, and he's the CEO of what, three or four businesses?

Perhaps there's nine of him? ;)
 
His AIs will soon be able to give you the Social Security number, address, and email of any US citizen. You may also ask his AIs how to hack into the IRS, Medicare, and other US agencies, and, why not, also ask for the admin password of the Pentagon.

Glad to see everyone cheering for King Musk the First…
 
Except that is not true. I asked ChatGPT a question two or three weeks back about when someone first registered as a Democrat. The answer I received was wrong. When I pressed harder and asked why it gave me that date, it did not know. The discussion then basically came down to the equivalent of a 14-year-old who didn't do his homework: it had no answer for why it very confidently gave me the wrong answer. This wasn't the first time this happened, either.

None of these LLMs were created for some objective truth. They're out there to collect dollars from gullible investors.

Think about it. If you built a working model that could solve all of the world's problems, why would you announce it to the world? Why would you need investment dollars? If this technology were so awesome, why wouldn't the people who put it together use it to produce a cure for cancer? To create a vehicle battery pack that was better for the environment, could power a vehicle for hundreds of miles, and could be completely recharged in seconds? To solve world hunger?

This is all nothing more than a modern day carnival show.
Well put, and I agree. If these AI models were so great, they would seek out the objective truth and offer it. That's not the reality. However, this would eliminate the human emotional aspect of certain jobs like doctors (it would also eliminate sleep deprivation), psychiatrists, judges, etc. I can't wait for AI to replace some, and maybe all, of those jobs one day. As long as it's 100% accurate, of course.
 
I can’t wait for the AI bubble to burst

They kept "hard drives in the sky" inflated for about 15+ years. This, as the next big thing (whether it is or not), will probably last at least as long. Look for the prefixes to show up to keep extending the spin: Hybrid A.I., A.I. Cloud, A.I. Hybrid Cloud, etc. Where there's a dollar to be harvested, there will be fresh tech spin... and sometimes some actual innovation.
 
After rejecting Elon’s $97.4 billion bid to buy OpenAI, Sam said: “I wish he would just compete by building a better product.”

Well, Elon did just that and smashed it out of the park with Grok 3, which wipes the floor with rivals across AI benchmarks. It's the first model to score over 1400 on Chatbot Arena, and it has wowed everyone who has tested it. Looks like Sam's playing catch-up now.
 
Did you check that all the links and prices were real? ChatGPT itself will tell you it isn't reliable; there's literally small text at the bottom of the window telling you to double-check the data.

Yeah, don't reply with 'YES it wAS aLL rEAL'. We know the results can swing wildly.

Yes, of course I checked; it was all accurate and up to date. I find the results more and more reliable. In the beginning, a lot of answers were just fantasies and made-up stuff.

For complex searches, ChatGPT has evolved into a faster and more precise Google.
 
After rejecting Elon’s $97.4 billion bid to buy OpenAI, Sam said: “I wish he would just compete by building a better product.”

Well, Elon did just that and smashed it out of the park with Grok 3, which wipes the floor with rivals across AI benchmarks. It's the first model to score over 1400 on Chatbot Arena, and it has wowed everyone who has tested it. Looks like Sam's playing catch-up now.

Winning AI benchmarks is like being the skinniest kid at fat camp.
 
Why do we continue to give credit to this guy? If you want to celebrate the team, fine, but let's not pretend this dude is in there coding and developing the LLM. He's not designing rockets or building batteries or electric motors. He's a money guy who spews a bunch of vaporware promises to pump up the company's value.

We don't give Warren Buffett direct credit when all the companies he's invested in go up in value. Why are we giving this carnival barker so much credit?

He's like Steve Jobs, but with money, instead of a vision or talent.
 
After rejecting Elon’s $97.4 billion bid to buy OpenAI, Sam said: “I wish he would just compete by building a better product.”

Well, Elon did just that and smashed it out of the park with Grok 3, which wipes the floor with rivals across AI benchmarks. It's the first model to score over 1400 on Chatbot Arena, and it has wowed everyone who has tested it. Looks like Sam's playing catch-up now.

The rejection was only days ago, so I doubt Elon "did just that and smashed..." in response. Grok 3 had to have been in development for longer than the last few days, so "out of the park" would have been realized whether Sam rolled over and accepted the offer or not; any accepted deal wouldn't have been close to complete by today, either.
 
Winning AI benchmarks is like being the skinniest kid at fat camp.

Model benchmarks are quite silly anyway. Math benchmarks are fine, but most of the others are subjective. They aren't tangible benchmarks like Geekbench measuring CPU performance.

The AI guys even have a name for cheating at benchmarks: if engineers know the tests ahead of time, they can tweak the models to perform well on them. They call this "benchmaxxing."
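A toy sketch of why knowing the test ahead of time inflates scores (all the names and questions below are made up for illustration): a "model" that simply memorizes a leaked public benchmark aces it, yet collapses on held-out questions.

```python
# Toy illustration of benchmark overfitting ("benchmaxxing"):
# a model that memorizes a public test aces it but fails unseen questions.

public_benchmark = {"2+2": "4", "capital of France": "Paris"}
hidden_benchmark = {"3+5": "8", "capital of Japan": "Tokyo"}

def memorizing_model(question):
    # "Trained" only on the leaked public test set.
    return public_benchmark.get(question, "no idea")

def score(model, benchmark):
    # Fraction of questions answered correctly.
    return sum(model(q) == a for q, a in benchmark.items()) / len(benchmark)

print(score(memorizing_model, public_benchmark))  # 1.0
print(score(memorizing_model, hidden_benchmark))  # 0.0
```

The same gap between public and hidden test scores is why private, rotating test sets are considered more trustworthy than static published ones.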

One super important thing to always remember when it comes to generative AI.

Never trust any demo you see online. They are all carefully cut and edited. Use that stuff yourself and then you’ll understand where they are and what they can and cannot do.

Then you’ll always understand the lies in the media. Journalists are terrible at talking about tech and sometimes that is deliberate.
 
"maximally truth-seeking AI,"

It feels incredibly snake oily to even suggest something like this

I'm certainly no expert in these (perhaps someone will chime in?), but I thought it was fairly well known that an issue with LLMs is that they have no way of knowing, on their own or through their own methods of learning, what is factually correct.


This part, about analyzing "information from X" as a data source for seeking "maximum truth"...

"The new feature is designed to analyze information from the internet and X (Twitter) to provide comprehensive answers to user queries."


Not great

Generally speaking, anyone seeking to suss out factual information usually doesn't speak in terms like "maximal truth".

That verbiage is trying to simplify something that isn't simple or black and white, and that is far more nuanced and full of subjectivity and required context.
 
Wait until the AI is coding its own new models...

Ask any of the top-performing models to generate code for an Asteroids clone or a Pac-Man clone. Those games came out in 1979 and 1980 respectively, and shipped in just a few kilobytes of machine code.

The best models can barely get Snake right in one shot. They require a lot of hand-holding to generate a super basic version of Tetris. They utterly fail at Asteroids or Pac-Man.

So despite ingesting all the data on GitHub, where there are dozens of Asteroids and Pac-Man clones, and despite the massive amount of compute being used, the programming skills of the best models are still utterly sloppy on their own.

They produce good boilerplate code, because they have been shown tons of boilerplate code, but they cannot program applications by themselves. Even the current wave of "reasoning models" just gets stuck in a loop, asking dumb questions and failing, when tasked with creating applications.
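For scale, here's a toy sketch (my own, not output from any model) of the core Snake update logic being discussed; the complete game is roughly this plus input handling and drawing, which is why one-shot failures are telling.

```python
# Minimal grid-based Snake step. The snake is a list of (x, y) cells,
# head first. Returns (new_snake, ate_food, still_alive).
def step(snake, direction, food, width, height):
    dx, dy = direction
    head = (snake[0][0] + dx, snake[0][1] + dy)
    # Dead on hitting a wall or the snake's own body.
    if not (0 <= head[0] < width and 0 <= head[1] < height) or head in snake:
        return snake, False, False
    ate = head == food
    body = snake if ate else snake[:-1]  # tail grows only when eating
    return [head] + body, ate, True

# One tick: head at (1, 1) moving right onto food at (2, 1).
snake, ate, alive = step([(1, 1), (0, 1)], (1, 0), (2, 1), 10, 10)
print(snake, ate, alive)  # [(2, 1), (1, 1), (0, 1)] True True
```

The whole update is a dozen lines of list manipulation, which is what makes the hand-holding these models need so striking.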
 
Except that is not true. I asked ChatGPT a question two or three weeks back about when someone first registered as a Democrat. The answer I received was wrong. When I pressed harder and asked why it gave me that date, it did not know. The discussion then basically came down to the equivalent of a 14-year-old who didn't do his homework: it had no answer for why it very confidently gave me the wrong answer. This wasn't the first time this happened, either.

None of these LLMs were created for some objective truth. They're out there to collect dollars from gullible investors.

Think about it. If you built a working model that could solve all of the world's problems, why would you announce it to the world? Why would you need investment dollars? If this technology were so awesome, why wouldn't the people who put it together use it to produce a cure for cancer? To create a vehicle battery pack that was better for the environment, could power a vehicle for hundreds of miles, and could be completely recharged in seconds? To solve world hunger?

This is all nothing more than a modern day carnival show.
I didn’t say whether it was any good!

OpenAI has a model that it hopes will provide objective truth to such an extent that it will monetize itself.
If it's seen to be biased or deliberately misleading, it will lose its purpose and go out of business.

This new model isn't so interested in truth; it's interested in power. It doesn't have to make money; it's buying something: influence.

Its sister social media site didn't make money; independent sources value it at far below its purchase price. So the question becomes: "What did this otherwise smart man get for his money?"

Power
 
Ask any of the top-performing models to generate code for an Asteroids clone or a Pac-Man clone. Those games came out in 1979 and 1980 respectively, and shipped in just a few kilobytes of machine code.

The best models can barely get Snake right in one shot. They require a lot of hand-holding to generate a super basic version of Tetris. They utterly fail at Asteroids or Pac-Man.

So despite ingesting all the data on GitHub, where there are dozens of Asteroids and Pac-Man clones, and despite the massive amount of compute being used, the programming skills of the best models are still utterly sloppy on their own.

They produce good boilerplate code, because they have been shown tons of boilerplate code, but they cannot program applications by themselves. Even the current wave of "reasoning models" just gets stuck in a loop, asking dumb questions and failing, when tasked with creating applications.

Ahhhhhhhhhh, but identify the finest, most capable programmer in the world. Let's call him or her Casey. Wind back time to when Casey was a toddler and ask them to code Pac-Man or Asteroids. I would guess Casey might only try to chew on the keyboard. But give Casey time to grow their knowledge, and eventually Casey becomes the smartest, best programmer in the world, for whom such a challenge is trivial.

A.I. Casey can learn faster than H.I. (human intelligence) Casey. Thus A.I. Casey can grow up far faster than H.I. Casey. While current A.I. Casey might also try to eat the keyboard, skilled A.I. Casey is only "how fast can it learn to be a great programmer" away. That timetable will be measured by how quickly computer knowledge and skills can accumulate, not by the number of days and years a human must spend in various tiers of school.
 
DeepSeek has made all of this irrelevant. Open source, in full privacy, is where it's at. Mac users can run the 70-billion-parameter R1 distill model at 8-bit quantization with MLX on an M3 MacBook Pro (128 GB RAM), entirely offline. It's the best model I've ever used, hands down. With the next generation of the Mac Studio Ultra, we should be able to run the full R1 at 8 bits. Musk, who is supposedly so pro open source, is going to release his outdated version as open source at some future date. Hah! Big thanks, but no. His commitment to open source is zero. Speak against that guy with your wallet.
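For anyone wanting to try this, a rough sketch of the usual route on Apple Silicon via the mlx-lm package; the exact model repo name below is my assumption about the mlx-community naming scheme, so verify it on Hugging Face before running (the 8-bit 70B weights are a very large download).

```shell
# Install Apple's MLX LM tooling (Apple Silicon only).
pip install mlx-lm

# Download and run an 8-bit R1 distill locally, fully offline after the
# first fetch. The repo name is a guess -- check mlx-community's listings.
mlx_lm.generate \
  --model mlx-community/DeepSeek-R1-Distill-Llama-70B-8bit \
  --prompt "Explain quantization in one paragraph."
```

Once the weights are cached, no network connection is needed, which is the privacy point the post is making.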
 
This looks great. I can't wait to try it when it's not paywalled. I'm glad to see this technology marching forwards.
 