You do realize that all LLMs are trained on mostly the same massive piles of data scraped from the internet with very little regard for copyright, privacy or ethics, right? Your average spaghetti-recipe prompt is of absolutely no value whatsoever to DeepSeek or OpenAI, given the vastness of data that's freely available already. I'd be far more scared of Google, Microsoft and Meta, which actually have your private documents, messages and e-mails at their fingertips, just wondering whether anybody would really notice if they trained a model on that massive, untouched pool of data. 'Cause, boy, that would make a big difference.

By logging a user's data, it's possible to build a profile, even predict future behaviour and do precise targeting.
 
I saw a fascinating report on this on CNBC. They were able to achieve this with older hardware and 1/1000th of the budget we are spending on our closed-source (read: to be monetized at every turn) variants. Pretty soon our AI endeavors will resemble our healthcare system.
Many US LLMs are open source, the largest of which is Llama, from Meta (Facebook).
 
Just because evaluating one single prompt doesn't weigh all that much in terms of environmental impact doesn't mean that the whole operation of offering an inference API is ecologically fine. These models run on very power-hungry GPUs, many thousands of them, all running 24/7 at close to 100% utilization. It doesn't matter whether you run training code or inference code on them; if they are used, they are used. You can be pretty sure that a model such as GPT-4o has burned more energy during inference, integrated over its lifetime, than its training did.
Additionally, running LLMs at home in particular is actually an incredibly wasteful endeavor by comparison. The only reason inference at server-farm scale is so efficient is that it can run batched, meaning one instance of the model can process multiple prompts at once. At home, you'll likely only ever process a single prompt at a time, which is very inefficient energetically.

This can be said for running anything on a server, though: streaming platforms, search engines, cloud gaming (which is much worse).

There is zero indication that GPUs used for inference are running at 100% utilisation 24/7 at all.

Your local LLM example doesn't make any sense either. It's irrelevant whether you process one prompt or multiple; the compute scales. The LLM doesn't eat up energy just sat there in memory doing nothing, and it only uses cycles when it's used. You don't use more energy by only having a single prompt; you use what you use regardless. Playing a single game of Fortnite for 30 minutes is going to use more electricity at home than an entire day of using an LLM.
 
You need beefy hardware, but it's not that hard. People are running the 671B model on 32-core EPYC systems with 384 GB of RAM.

It must be extremely slow though. I'd only want to run LLMs in VRAM, to be honest; as soon as it spills into system RAM, the tokens per second crawl to a stop.
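For a rough sense of why system RAM crawls: single-stream token generation is typically memory-bandwidth-bound, so the ceiling on tokens per second is roughly bandwidth divided by the bytes of weights read per token. Here's a back-of-the-envelope sketch; the bandwidth and quantization figures are illustrative assumptions, not benchmarks:

```python
# Back-of-the-envelope, bandwidth-bound estimate of single-stream decode speed.
# Assumption: each generated token streams all active weights through once.

def tokens_per_second(active_params_b: float, bytes_per_param: float,
                      bandwidth_gb_s: float) -> float:
    bytes_per_token_gb = active_params_b * bytes_per_param
    return bandwidth_gb_s / bytes_per_token_gb

# Illustrative figures (assumptions):
# - ~4-bit quantization -> ~0.5 bytes per parameter
# - multi-channel EPYC system RAM: ~400 GB/s
# - R1 is a mixture-of-experts model, so only ~37B of its
#   671B parameters are active for any given token
print(tokens_per_second(671, 0.5, 400))  # dense worst case: ~1.2 tok/s
print(tokens_per_second(37, 0.5, 400))   # MoE active set:   ~21.6 tok/s
```

Real throughput lands below these ceilings, since expert routing scatters the reads, but it shows why memory bandwidth, not core count, is what you actually feel.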
 
How do you think they got all the info to train it for next to nothing?! 🤣

The same way everyone else who creates an LLM did - all the content on the open web.

An LLM is a read-only model; using it doesn't teach them anything. If you run DeepSeek as a local LLM, that's it - it's private, on your system only.
 
Try asking ChatGPT for vaccine side effects and include the test report; ChatGPT will initially lie and then let itself be corrected. No explanation of why it can't count in % or /.

Go on... what's "THE test report"?
 
What the flying ****.
Why is OpenAI asking us for all that money then?
$200 for Pro?
Yet this, like, "Tony Stark built this in a cave! WITH METAL SCRAPS!" AI is running laps around them?
I hope they feel the burn completely, goddamn grifters. All this money, yet they're moving at a snail's pace!
 
By logging a user's data, it's possible to build a profile, even predict future behaviour and do precise targeting.
And then what? You'd get more of the same targeted ads you are getting anyways. I'm not even trying to say that anybody can have my data and do as they please, but the prompts I send to some LLM are really nothing I'd consider particularly private. The private messages people send over Facebook Messenger on the other hand probably are. I don't trust either.


This can be said for running anything on a server, though: streaming platforms, search engines, cloud gaming (which is much worse).

There is zero indication that GPUs used for inference are running at 100% utilisation 24/7 at all.

Your local LLM example doesn't make any sense either. It's irrelevant whether you process one prompt or multiple; the compute scales. The LLM doesn't eat up energy just sat there in memory doing nothing, and it only uses cycles when it's used. You don't use more energy by only having a single prompt; you use what you use regardless. Playing a single game of Fortnite for 30 minutes is going to use more electricity at home than an entire day of using an LLM.
The goal is to get as close as possible to 100% utilization, 24/7. They NEED to; otherwise they're literally throwing money away. Everything that sits idle is rented out, and if nobody wants more, it's used for spot computing. Every second of idle compute is lost profit for cloud providers.

And I don't think you understood my point about processing efficiency. This is about energy expenditure per token, and that is by far worse for private, local compute.
For every token, the entire model data is moved from VRAM to the cores, where the actual compute is performed. That is an energy-intensive task that also takes a long time (usually this is the bottleneck for most GPUs). This happens regardless of whether you process one single prompt or 128 prompts at once. So the more prompts you process in parallel, the more efficient the computation gets per prompt, as you don't have to do this multiple times, but only once. It's probably not quite as simple in practice, but it gives you some idea.
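To put toy numbers on that amortization: suppose each decode step pays a fixed cost to stream the weights, plus a small per-prompt compute cost. Both figures below are made-up illustrations, not measurements:

```python
# Toy model of energy per generated token vs. batch size.
# Assumption: each decode step pays a fixed weight-streaming cost that is
# shared by the whole batch, plus a per-prompt compute cost.

WEIGHT_STREAM_J = 50.0  # assumed energy to stream the weights once (joules)
PER_PROMPT_J = 2.0      # assumed per-prompt compute energy per step (joules)

def energy_per_token(batch_size: int) -> float:
    step_energy = WEIGHT_STREAM_J + PER_PROMPT_J * batch_size
    return step_energy / batch_size  # each prompt yields one token per step

for batch in (1, 8, 32, 128):
    print(f"batch={batch:3d}: {energy_per_token(batch):5.2f} J/token")
# batch=  1: 52.00 J/token  <- local, single-prompt case
# batch=  8:  8.25 J/token
# batch= 32:  3.56 J/token
# batch=128:  2.39 J/token  <- approaches the compute floor
```

As the batch grows, the fixed streaming cost washes out and energy per token approaches the per-prompt compute floor; a lone prompt at home always pays the full cost.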
Also, the actual hardware used in data centers is far more efficient in terms of FLOPS/Watt inherently. Something like an H200 is literally designed for LLMs, whereas an RTX 4090 is designed for video games.

Let me also say, I'm not trying to say that you shouldn't run local LLMs. I do it, and it's super fun. I'm also sure it doesn't matter on the scale it is happening at this moment, but on a global scale, a centralized approach will always be far more efficient and ecological. It is something we'll have to eventually think about as AI's share of global energy consumption increases rapidly.
 
This is actually fantastic news for Apple. This would support the idea that if they put enough investment and manpower behind it, Apple can still catch up to, or even surpass, current AI models, and they can do it while spending less. Fingers crossed they actually execute.
This is actually a great point.
Apple probably is already there, and they are fine-tuning it so it can really destroy the competition when it's finally ready.
The rest of these grifters can go to hell
 
The DeepSeek LLMs are open source. One doesn't need to use their app.
But you can't really run the full un-quantized version on most hardware, especially current Mac hardware. While the 7B and even 32B distilled models can be run locally, I don't think the full R1 is at all possible to run locally with what most of us have, even at the higher end of things.
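As a rough fit check: weights need roughly parameter count times bytes per parameter, plus overhead for the KV cache and activations. A quick sketch, where the 20% overhead is an assumed round figure:

```python
# Rough memory-footprint estimate: can the model fit at all?
# Assumption: ~20% overhead for KV cache and activations on top of weights.

def memory_needed_gb(params_b: float, bytes_per_param: float,
                     overhead: float = 0.20) -> float:
    return params_b * bytes_per_param * (1 + overhead)

for name, params in (("7B", 7), ("32B", 32), ("671B", 671)):
    fp16 = memory_needed_gb(params, 2.0)  # un-quantized fp16/bf16 weights
    q4 = memory_needed_gb(params, 0.5)    # ~4-bit quantized weights
    print(f"{name:>5}: ~{fp16:6.0f} GB fp16 | ~{q4:5.0f} GB at 4-bit")
#   7B:  ~17 GB fp16 | ~4 GB 4-bit    -> fits on many Macs
#  32B:  ~77 GB fp16 | ~19 GB 4-bit   -> high-end unified memory
# 671B: ~1610 GB fp16 | ~403 GB 4-bit -> out of reach for nearly everyone
```

Which is why the distills are practical locally while the full R1 stays server territory even when quantized.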
 
What makes you think that price is the ACTUAL price? Do you believe everything that comes out of China?
Fair comment. Cost aside though, it appears to be competing with the US models, and it's open source. It's certainly got the markets spooked! There's definitely been a hype-led bubble building around AI for a while now, so this might just be puncturing some of that.
 
If I have to compromise on my personal information, like my email address and queries, I would rather trust any Western private company than the Chinese government.
A patriotic stance, loyal to your country's military/geostrategic allies - yet probably somewhat irrational.

I'm more likely, and more inclined, to live in, visit, or be financially exposed to the U.K. or the U.S. than China.
So are, no doubt, most people in this thread.

If anyone’s “out to get you”, it’s your country’s government and the companies that operate where you live. Not China.
 
This is what actual economic annihilation looks like. $400B wiped off the markets today. That’s like 10 aircraft carriers gone.

With more high-quality STEM graduates than any other country in the world, this should not be a surprise.

You have tech bros like Sam Altman driving around in a Koenigsegg pretending to innovate, gatekeeping, and creating a bubble. Someone's gonna figure it out. That day was today - actually back in December, when this was first released.
 
A sell-off of global technology shares has wiped up to $1 trillion off US stock markets, with the US chipmaker Nvidia falling more than 13%, and losing $465bn off its market value – the biggest such loss in US market history.
Yikes
 


A new China-based AI chatbot challenger called DeepSeek has reached the number one position on Apple's App Store free charts in multiple countries, including the US, raising questions about Silicon Valley's perceived leadership in artificial intelligence development.

[Image: DeepSeek AI app]

Released last week, the iOS app has garnered attention for its ability to match or exceed the performance of leading AI models like ChatGPT, while requiring only a fraction of the development costs, based on a research paper released on Monday.

DeepSeek has not raised money from outside funds or made significant moves to monetize its R1 model, which the company claims is on par with GPT-4o and Anthropic's Claude 3.5 Sonnet. The Chinese AI startup behind the model was founded by hedge fund manager Liang Wenfeng, who claims they used just 2,048 Nvidia H800s and $5.6 million to train R1 with 671 billion parameters, a fraction of what OpenAI and Google spent to train comparably sized models. For example, Microsoft and Meta alone have committed over $65 billion each this year largely to AI infrastructure. Just last week, OpenAI said it was creating a joint venture with Japan's SoftBank, dubbed Stargate, with plans to spend at least $100 billion on AI infrastructure in the US.

Investor Marc Andreessen is already calling DeepSeek "one of the most amazing and impressive breakthroughs" for its ability to show its work and reasoning as it addresses a user's written query or prompt. DeepSeek has also taken an open-source approach, allowing developers to freely inspect and build upon its technology.

What's particularly notable is that DeepSeek apparently achieved this breakthrough despite US export restrictions on advanced AI chips to China. The company's success suggests Chinese developers have found ways to create more efficient AI models with limited computing resources, potentially challenging the assumption that cutting-edge AI development requires massive computing infrastructure investments.

The emergence of DeepSeek has already sparked debate in Silicon Valley. While some view it as a concerning development for US technological leadership, others, like Y Combinator CEO Garry Tan, suggest it could benefit the entire AI industry by making model training more accessible and accelerating real-world AI applications.

The app's success has already impacted financial markets, with some AI-related stocks experiencing volatility as investors reconsider the necessity of extensive capital expenditure for AI development. Shares of Nvidia, for example, slid 10% in premarket trading on Monday on the news of DeepSeek's popularity.

Article Link: Budget AI Model DeepSeek Overtakes ChatGPT on App Store
Makes me wonder about all the billions companies are hoping to get from government $$.
 
You have tech bros like Sam Altman driving around in a Koenigsegg pretending to innovate, gatekeeping, and creating a bubble. Someone's gonna figure it out. That day was today - actually back in December, when this was first released.
Sam Altman and OpenAI actually did it first. So they did innovate. If there's no ChatGPT, there's no DeepSeek. Same as if there were no iPhone, there would be no Android as we know it today. What Chinese tech did was copy already existing technology, while taking advantage of cheaper resources and likely government subsidies.
 
Released last week, the iOS app has garnered attention for its ability to match or exceed the performance of leading AI models like ChatGPT, while requiring only a fraction of the development costs, based on a research paper released on Monday.

What's particularly notable is that DeepSeek apparently achieved this breakthrough despite US export restrictions on advanced AI chips to China. The company's success suggests Chinese developers have found ways to create more efficient AI models with limited computing resources, potentially challenging the assumption that cutting-edge AI development requires massive computing infrastructure investments.
This can't be right. I was told China doesn't "have access to modern AI" and that "every single AI offering they have spit out in the last five years have been fake" because "China doesn't have access to advance AI nodes."
 