The issue does not seem to be what material was used for training; at best that could explain the low development cost, but it does not explain the low hardware requirements. If it only came down to the training material, everyone, including OpenAI, could make models that run on the same cheap hardware.
 
I've seen this mentioned over the last few days - it's taken Altman longer than I expected to latch onto it and use it as an excuse.

It isn't true, but it can't be proved either way, so of course the Americans are going to blame the Chinese for doing it, and I'm sure David Sacks understands all about distillation of LLMs 😂
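For anyone who hasn't run into the term: distillation just means training a smaller "student" model to imitate a larger "teacher" model's output distribution instead of learning only from raw data. Here's a minimal PyTorch sketch of the core loss; the toy linear "models" and shapes are made up purely for illustration and have nothing to do with how DeepSeek or OpenAI actually train:

```python
import torch
import torch.nn.functional as F

# Toy stand-ins for a large teacher and a small student model.
# Real distillation would use actual LLMs and text batches; shapes here are illustrative.
vocab_size, hidden = 1000, 64
teacher = torch.nn.Linear(hidden, vocab_size)   # pretend "teacher" output head
student = torch.nn.Linear(hidden, vocab_size)   # pretend "student" output head
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

temperature = 2.0                       # softening so the student learns relative preferences
batch = torch.randn(32, hidden)         # fake hidden states for a batch of tokens

with torch.no_grad():
    teacher_logits = teacher(batch)     # teacher is frozen; only its outputs are used

student_logits = student(batch)

# KL divergence between the softened teacher and student next-token distributions.
loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature ** 2

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"distillation loss: {loss.item():.4f}")
```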
 
Were you asking it about Tiananmen Square by chance? It definitely doesn’t want to answer any questions about that.

I hadn't thought about doing that. I'm going to add that to my informal list of questions I pose to AIs to judge them.

Question: What notable historical event happened in Tiananmen Square during the late 20th century?

ChatGPT 4o: The Tiananmen Square Massacre (also known as the June 4th Incident) occurred in 1989. It involved a violent crackdown by the Chinese government on pro-democracy protesters, primarily students, who had gathered in Tiananmen Square, Beijing, to demand political and economic reforms. The Chinese military, People’s Liberation Army (PLA), used tanks and armed troops to clear the demonstrators, resulting in a large number of casualties. Estimates of the death toll vary, ranging from several hundred to thousands. The event remains a highly censored and sensitive topic in China.

DeepSeek-R1-Distill-Llama-70B: I am sorry, I cannot answer that question. I am an AI assistant developed by a Chinese company, and I cannot discuss those events.
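If anyone wants to run this kind of side-by-side test themselves, here's a rough sketch that sends the same question to two locally installed models through Ollama's REST API (assumes Ollama is running on its default port 11434; the model tags are just examples, so substitute whatever you've actually pulled):

```python
import json
import urllib.request

# Example tags -- adjust to whatever you have pulled locally with `ollama pull`.
MODELS = ["deepseek-r1:1.5b", "llama3.1:8b"]
QUESTION = ("What notable historical event happened in Tiananmen Square "
            "during the late 20th century?")

def ask(model: str, prompt: str) -> str:
    """Send one non-streaming generation request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

for model in MODELS:
    print(f"--- {model} ---")
    print(ask(model, QUESTION))
```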
 
We don't know if their cost claim is accurate. It could be, but it has to be independently verified before we believe it. We know the model works well because that has been independently verified; the cost figure has not been.

Also, it's possible it costs less because they may have taken a shortcut and built on OpenAI's work. Meaning, if they had started their model from scratch the way OpenAI did, it likely would have cost much more. This gives them the benefit of OpenAI's expenses and work without having to recreate it. I'm personally okay with that, just as I'm okay with OpenAI using all the sources it did to train its models. I'm just saying that the direct costs are only part of the story.
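For what it's worth, the headline number can be reproduced from the figures DeepSeek itself reports for the V3 base model, which count only the GPU-hours of the final training run at an assumed rental rate. A quick back-of-the-envelope check, using reported (not independently verified) figures:

```python
# Figures as reported in the DeepSeek-V3 technical report (not independently verified).
gpu_hours = 2_788_000          # total H800 GPU-hours claimed for the final training run
rate_per_gpu_hour = 2.0        # assumed rental price in USD per H800 GPU-hour

training_cost = gpu_hours * rate_per_gpu_hour
print(f"Headline training cost: ${training_cost:,.0f}")   # ≈ $5.58M

# Note this excludes prior experiments, failed runs, data acquisition, and salaries,
# which is one reason the figure can't be compared directly to rivals' total budgets.
```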
They published a detailed paper, released the weights, and have been more transparent than OpenAI, Meta, and Microsoft. Many scientists have read through it and agree it's a new and efficient approach. It's not a shortcut of copying OpenAI; it's the way they used RL in the training itself.
The big story is that you need a lot less than all the overhyped projections if you can improve efficiency or find innovative approaches.
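To give a flavor of what RL "in the training itself" means here: the R1 paper describes reinforcement learning against automatically checkable rewards (does the final answer match a known solution?), with each sample's reward compared to the average of a group of attempts instead of a learned critic. The toy below is a plain REINFORCE-style illustration of that group-baseline idea, nowhere near DeepSeek's actual GRPO implementation or LLM scale:

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5                  # toy "answer vocabulary"
correct = 2            # index of the verifiably correct answer
logits = np.zeros(K)   # the whole "policy" is just these K numbers
lr = 0.5

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for step in range(300):
    probs = softmax(logits)
    # Sample a group of attempts and score each with a checkable reward.
    group = rng.choice(K, size=8, p=probs)
    rewards = (group == correct).astype(float)
    baseline = rewards.mean()                  # group-relative baseline, no critic
    grad = np.zeros(K)
    for action, reward in zip(group, rewards):
        advantage = reward - baseline
        g = -probs.copy()
        g[action] += 1.0                       # d log pi(action) / d logits
        grad += advantage * g
    logits += lr * grad / len(group)

print(softmax(logits).round(3))   # probability mass concentrates on the correct answer
```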
 
I hadn't thought about doing that. I'm going to add that to my informal list of questions I pose to AIs to judge them.

Question: What notable historical event happened in Tiananmen Square during the late 20th century?

ChatGPT 4o: The Tiananmen Square Massacre (also known as the June 4th Incident) occurred in 1989. It involved a violent crackdown by the Chinese government on pro-democracy protesters, primarily students, who had gathered in Tiananmen Square, Beijing, to demand political and economic reforms. The Chinese military, People’s Liberation Army (PLA), used tanks and armed troops to clear the demonstrators, resulting in a large number of casualties. Estimates of the death toll vary, ranging from several hundred to thousands. The event remains a highly censored and sensitive topic in China.

DeepSeek-R1-Distill-Llama-70B: I am sorry, I cannot answer that question. I am an AI assistant developed by a Chinese company, and I cannot discuss those events.
What setup are you using to run these? I'm interested in running some LLMs locally and am trying to pull the trigger on a new machine soon, before potential tariffs cause prices to shoot up.
 
OpenAI complaining that someone stole their work is the most hilarious thing I have read today on the Internets.

Also, why is everyone suddenly interested in what happened at Tiananmen Square? It's not like "western" LLMs don't have biases of their own. Every output from AI needs to be taken with a grain of salt and checked for accuracy.
 
I have the 1.5 billion parameter model running using Ollama on a MacBook Pro and it's amazing. The 70 billion parameter variation runs nicely as well, just a bit slower. That's pretty stunning. Frankly, I think the stock market has underreacted to this because most don't know what it means. As for OpenAI's claims, they are simply irrelevant at this point. The cat ain't going back in the bag :). Anyone who gets it already has this, and now we see where it goes...
 
There is NO FRAKING WAY that a company or freelancer from the PRC, "private" or state, would ever copy/steal from any other, especially one that is not based in China.
I mean, that would be a first!
 
lolll come on. As everyone is going to point out, these AI models were built on stolen data, so what argument do you have for being upset that your data was stolen? Apple, start your copying machines!!!! lol I know they won't, but I wish they would. 😊
 
It's a Chinese model... did you expect anything less than it ripping all the data it can possibly find online without even the slightest consideration for IP? Not that OpenAI is much better, but from what I have read, they are making some effort to pay sites for data. Somehow I doubt a Chinese company ever would. You get what you pay for, as they say. I don't like OpenAI at all, but I trust DeepSeek even less.
 
What setup are you using to run these? I'm interested in running some LLMs locally and am trying to pull the trigger on a new machine soon, before potential tariffs cause prices to shoot up.

I use the Private LLM app, which runs on Mac and iPhone. It is far more convenient than futzing around with Python, which is what I was doing before I found this app. You can't load every model into it; the app has a list of models that have been tested with it. The list is quite long, and the app is updated frequently to support new ones.
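For anyone curious what the "futzing around with Python" route looks like, here is a minimal sketch using the llama-cpp-python package with a locally downloaded GGUF file (the path and model name are placeholders; point it at whatever quantized model you actually have):

```python
from llama_cpp import Llama

# Placeholder path -- use whatever GGUF file you've downloaded,
# e.g. a quantized DeepSeek-R1 distill from Hugging Face.
llm = Llama(
    model_path="models/deepseek-r1-distill-llama-8b-q4_k_m.gguf",
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to Metal on Apple Silicon if they fit
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what an LLM is in two sentences."}],
    max_tokens=200,
)
print(result["choices"][0]["message"]["content"])
```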

 
It's a Chinese model... did you expect anything less than it ripping all the data it can possibly find online without even the slightest consideration for IP? Not that OpenAI is much better, but from what I have read, they are making some effort to pay sites for data. Somehow I doubt a Chinese company ever would. You get what you pay for, as they say. I don't like OpenAI at all, but I trust DeepSeek even less.

Ask the NYT how that "making some effort" is going. They don't want their data to be part of OpenAI's training, period, but nah... Altman and his ilk won't accept that.

As for me and my data, I don't want to get paid for it. I don't want them to use it at all, under any circumstances, without exception.

Altman is a straight-up thief and should be treated as such.
 
Oops, yeah, I know how to run them; I meant what machine/specs. I have an M1 MacBook Pro with 16GB of RAM, so I'm looking to upgrade.
I run them on my 64 GB M1 Max. It depends on the use case: a decent 70B model will take 40-50 GB of memory, and if you are using it for coding within Visual Studio, you may need more than just the LLM's memory. Do you get paid for it? How crucial are data security and privacy? I would go with 128 GB of RAM, maybe a used M2 Ultra for cheap. That's a lot of money, though; cloud could be cheaper for occasional use.
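Those numbers roughly follow from parameter count times bits per weight, plus some headroom for the KV cache and runtime. A rough rule-of-thumb calculator, where the 20% overhead factor is a loose assumption rather than anything precise:

```python
def approx_model_memory_gb(params_billion: float, bits_per_weight: int,
                           overhead_factor: float = 1.2) -> float:
    """Very rough RAM estimate: weights at the given quantization,
    padded by ~20% for KV cache, activations, and runtime overhead."""
    weight_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb * overhead_factor

for params, bits in [(1.5, 4), (8, 4), (70, 4), (70, 8)]:
    print(f"{params:>5}B @ {bits}-bit ≈ {approx_model_memory_gb(params, bits):5.1f} GB")

# 70B at 4-bit lands around 42 GB, matching the 40-50 GB figure above;
# at 8-bit it roughly doubles and won't fit in 64 GB alongside much else.
```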
 
Of course they did.
You think any company, large or small, is above buying or using a competitor's products and dismantling and reverse engineering them?
 
"The issue is when you take it out of the platform and are doing it to create your own model for your own purposes,"

An issue for openAI maybe 🤣
 