You do realize that all LLMs are trained on mostly the same massive piles of data scraped from the internet with very little regard for copyright, privacy or ethics, right? Your average spaghetti-recipe prompt is of absolutely no value whatsoever to DeepSeek or OpenAI, given the vastness of data that's freely available already. I'd be far more scared of Google, Microsoft and Meta, which actually have your private documents, messages and e-mails at their fingertips, just wondering whether anybody would really notice if they trained a model on that massive, untouched pool of data. 'Cause, boy, that would make a big difference.

By logging a user's data, it's possible to build a profile, even predict future behaviour and do precise targeting.
 
I saw a fascinating report on this on CNBC. They were able to achieve this with older hardware and 1/1000th of the budget we are spending on our closed-source (read: to be monetized at every turn) variants. Pretty soon our AI endeavors will resemble our healthcare system.
Many US LLMs are open source, the largest of which is Llama, from Meta (Facebook).
 
Just because evaluating one single prompt doesn't weigh all that much in terms of environmental impact doesn't mean that the whole operation of offering an inference API is ecologically fine. These models run on very power-hungry GPUs, many thousands of them, all running 24/7 at close to 100% utilization. It doesn't matter whether you run training code or inference code on them; if they are used, they are used. You can be pretty sure that a model such as GPT-4o has burned more energy during inference, integrated over its lifetime, than its training did.
Additionally, running LLMs at home in particular is actually an incredibly wasteful endeavor by comparison. The only reason inference at server-farm scale is so efficient is that it can run batched, meaning one instance of the model can process multiple prompts at once. At home, you'll likely only ever process a single prompt at a time, which is very inefficient energetically.

This can be said for running anything on a server, though: streaming platforms, search engines, cloud gaming (which is much worse).

There is zero indication that GPUs used for inference are running at 100% utilisation 24/7 at all.

Your local LLM example doesn't make any sense either. It's irrelevant whether you process one prompt or multiple; the compute scales. The LLM doesn't eat up energy just sat there in memory doing nothing, and it only uses cycles when it's used. You don't use more energy by only having a single prompt; you use what you use regardless. Playing a single game of Fortnite for 30 minutes is going to use more electricity at home than an entire day of using an LLM.
 
You need beefy hardware, but it's not that hard. People are running the 671B model on 32-core EPYC systems with 384 GB of RAM.

It must be extremely slow though. I'd only want to run LLMs in VRAM, to be honest; as soon as it spills into system RAM, the tokens per second crawl to a stop.
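For a rough sense of why system RAM crawls: single-stream token generation is typically memory-bandwidth-bound, so the ceiling on tokens per second is roughly bandwidth divided by the bytes of weights read per token. Here's a back-of-the-envelope sketch; the bandwidth and quantization figures are illustrative assumptions, not benchmarks:

```python
# Back-of-the-envelope, bandwidth-bound estimate of single-stream decode speed.
# Assumption: each generated token streams all active weights through once.

def tokens_per_second(active_params_b: float, bytes_per_param: float,
                      bandwidth_gb_s: float) -> float:
    bytes_per_token_gb = active_params_b * bytes_per_param
    return bandwidth_gb_s / bytes_per_token_gb

# Illustrative figures (assumptions):
# - ~4-bit quantization -> ~0.5 bytes per parameter
# - multi-channel EPYC system RAM: ~400 GB/s
# - R1 is a mixture-of-experts model, so only ~37B of its
#   671B parameters are active for any given token
print(tokens_per_second(671, 0.5, 400))  # dense worst case: ~1.2 tok/s
print(tokens_per_second(37, 0.5, 400))   # MoE active set:   ~21.6 tok/s
```

Real throughput lands below these ceilings, since expert routing scatters the reads, but it shows why memory bandwidth, not core count, is what you actually feel.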
 
How do you think they got all the info to train it for next to nothing?! 🤣

The same way everyone else who creates an LLM did - all the content on the open web.

An LLM is a read-only model; using it doesn't teach them anything. If you run DeepSeek as a local LLM, that's it - it's private, on your system only.
 
Try asking ChatGPT for vaccine side effects and include the test report; ChatGPT will initially lie and then let itself be corrected. No explanation of why it can't count in % or /.

Go on... what's "THE test report"?
 
What the flying ****.
Why is OpenAI asking us for all that money then?
$200 for Pro?
Yet this, like, "Tony Stark built this in a cave! WITH METAL SCRAPS!" AI is running laps around them?
I hope they feel the burn completely, goddamn grifters. All this money, yet they're moving at a snail's pace!
 
By logging a user's data, it's possible to build a profile, even predict future behaviour and do precise targeting.
And then what? You'd get more of the same targeted ads you are getting anyways. I'm not even trying to say that anybody can have my data and do as they please, but the prompts I send to some LLM are really nothing I'd consider particularly private. The private messages people send over Facebook Messenger on the other hand probably are. I don't trust either.


This can be said for running anything on a server, though: streaming platforms, search engines, cloud gaming (which is much worse).

There is zero indication that GPUs used for inference are running at 100% utilisation 24/7 at all.

Your local LLM example doesn't make any sense either. It's irrelevant whether you process one prompt or multiple; the compute scales. The LLM doesn't eat up energy just sat there in memory doing nothing, and it only uses cycles when it's used. You don't use more energy by only having a single prompt; you use what you use regardless. Playing a single game of Fortnite for 30 minutes is going to use more electricity at home than an entire day of using an LLM.
The goal is to get as close as possible to 100% utilization, 24/7. They NEED to; otherwise they're literally throwing money away. Everything that sits idle is rented out, and if nobody wants more, it's used for spot computing. Every second of idle compute is lost profit for cloud providers.

And I don't think you understood my point about processing efficiency. This is about energy expenditure per token, and that is by far worse for private, local compute.
For every token, the entire model data is moved from VRAM to the cores, where the actual compute is performed. That is an energy-intensive task that also takes a long time (usually this is the bottleneck for most GPUs). This happens regardless of whether you process one single prompt or 128 prompts at once. So the more prompts you process in parallel, the more efficient the computation gets per prompt, as you don't have to do this multiple times, but only once. It's probably not quite as simple in practice, but it gives you some idea.
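To put toy numbers on that amortization: suppose each decode step pays a fixed cost to stream the weights, plus a small per-prompt compute cost. Both figures below are made-up illustrations, not measurements:

```python
# Toy model of energy per generated token vs. batch size.
# Assumption: each decode step pays a fixed weight-streaming cost that is
# shared by the whole batch, plus a per-prompt compute cost.

WEIGHT_STREAM_J = 50.0  # assumed energy to stream the weights once (joules)
PER_PROMPT_J = 2.0      # assumed per-prompt compute energy per step (joules)

def energy_per_token(batch_size: int) -> float:
    step_energy = WEIGHT_STREAM_J + PER_PROMPT_J * batch_size
    return step_energy / batch_size  # each prompt yields one token per step

for batch in (1, 8, 32, 128):
    print(f"batch={batch:3d}: {energy_per_token(batch):5.2f} J/token")
# batch=  1: 52.00 J/token  <- local, single-prompt case
# batch=  8:  8.25 J/token
# batch= 32:  3.56 J/token
# batch=128:  2.39 J/token  <- approaches the compute floor
```

As the batch grows, the fixed streaming cost washes out and energy per token approaches the per-prompt compute floor; a lone prompt at home always pays the full cost.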
Also, the actual hardware used in data centers is far more efficient in terms of FLOPS/Watt inherently. Something like an H200 is literally designed for LLMs, whereas an RTX 4090 is designed for video games.

Let me also say, I'm not trying to say that you shouldn't run local LLMs. I do it, and it's super fun. I'm also sure it doesn't matter on the scale it is happening at this moment, but on a global scale, a centralized approach will always be far more efficient and ecological. It is something we'll have to eventually think about as AI's share of global energy consumption increases rapidly.
 
This is actually fantastic news for Apple. This would support the idea that if they put enough investment and manpower behind it, Apple can still catch up to, or even surpass, current AI models, and they can do it while spending less. Fingers crossed they actually execute.
This is actually a great point.
Apple probably is already there, and they are fine-tuning it so it can really destroy the competition when it's finally ready.
The rest of these grifters can go to hell
 
The DeepSeek LLMs are open source. One doesn't need to use their app.
But you can't really run the full un-quantized version on most hardware, especially current Mac hardware. While the 7B and even 32B distilled models can be run locally, I don't think the full R1 is at all possible to run locally with what most of us have, even at the higher end of things.
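As a rough fit check: weights need roughly parameter count times bytes per parameter, plus overhead for the KV cache and activations. A quick sketch, where the 20% overhead is an assumed round figure:

```python
# Rough memory-footprint estimate: can the model fit at all?
# Assumption: ~20% overhead for KV cache and activations on top of weights.

def memory_needed_gb(params_b: float, bytes_per_param: float,
                     overhead: float = 0.20) -> float:
    return params_b * bytes_per_param * (1 + overhead)

for name, params in (("7B", 7), ("32B", 32), ("671B", 671)):
    fp16 = memory_needed_gb(params, 2.0)  # un-quantized fp16/bf16 weights
    q4 = memory_needed_gb(params, 0.5)    # ~4-bit quantized weights
    print(f"{name:>5}: ~{fp16:6.0f} GB fp16 | ~{q4:5.0f} GB at 4-bit")
#   7B:  ~17 GB fp16 | ~4 GB 4-bit    -> fits on many Macs
#  32B:  ~77 GB fp16 | ~19 GB 4-bit   -> high-end unified memory
# 671B: ~1610 GB fp16 | ~403 GB 4-bit -> out of reach for nearly everyone
```

Which is why the distills are practical locally while the full R1 stays server territory even when quantized.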
 
What makes you think that price is the ACTUAL price? Do you believe everything that comes out of China?
Fair comment. Cost aside though, it appears to be competing with the US models, and it's open source. It's certainly got the markets spooked! There's definitely been a hype-led bubble building around AI for a while now, so this might just be puncturing some of that.
 
If I have to compromise on my personal information, like my email address and queries, I would rather trust any Western private company than the Chinese government.
A patriotic stance, loyal to your country's military/geostrategic allies - yet probably somewhat irrational.

I'm more likely, and more inclined, to live in, visit, or be financially exposed to the U.K. or the U.S. than China.
So are, no doubt, most people in this thread.

If anyone’s “out to get you”, it’s your country’s government and the companies that operate where you live. Not China.
 
This is what actual economic annihilation looks like. $400B wiped off the markets today. That’s like 10 aircraft carriers gone.

With more high-quality STEM graduates than any other country in the world, this should not be a surprise.

You have tech bros like Sam Altman driving around in a Koenigsegg pretending to innovate, gatekeeping, and creating a bubble. Someone's gonna figure it out. That day was today - actually back in December, when this was first released.
 
A sell-off of global technology shares has wiped up to $1 trillion off US stock markets, with the US chipmaker Nvidia falling more than 13%, and losing $465bn off its market value – the biggest such loss in US market history.
Yikes
 


A new China-based AI chatbot challenger called DeepSeek has reached the number one position on Apple's App Store free charts in multiple countries, including the US, raising questions about Silicon Valley's perceived leadership in artificial intelligence development.

[Image: DeepSeek AI app]

Released last week, the iOS app has garnered attention for its ability to match or exceed the performance of leading AI models like ChatGPT, while requiring only a fraction of the development costs, based on a research paper released on Monday.

DeepSeek has not raised money from outside funds or made significant moves to monetize its R1 model, which the company claims is on par with GPT-4o and Anthropic's Claude 3.5 Sonnet. The Chinese AI startup behind the model was founded by hedge fund manager Liang Wenfeng, who claims they used just 2,048 Nvidia H800s and $5.6 million to train R1 with 671 billion parameters, a fraction of what OpenAI and Google spent to train comparably sized models. For example, Microsoft and Meta alone have committed over $65 billion each this year largely to AI infrastructure. Just last week, OpenAI said it was creating a joint venture with Japan's SoftBank, dubbed Stargate, with plans to spend at least $100 billion on AI infrastructure in the US.

Investor Marc Andreessen is already calling DeepSeek "one of the most amazing and impressive breakthroughs" for its ability to show its work and reasoning as it addresses a user's written query or prompt. DeepSeek has also taken an open-source approach, allowing developers to freely inspect and build upon its technology.

What's particularly notable is that DeepSeek apparently achieved this breakthrough despite US export restrictions on advanced AI chips to China. The company's success suggests Chinese developers have found ways to create more efficient AI models with limited computing resources, potentially challenging the assumption that cutting-edge AI development requires massive computing infrastructure investments.

The emergence of DeepSeek has already sparked debate in Silicon Valley. While some view it as a concerning development for US technological leadership, others, like Y Combinator CEO Garry Tan, suggest it could benefit the entire AI industry by making model training more accessible and accelerating real-world AI applications.

The app's success has already impacted financial markets, with some AI-related stocks experiencing volatility as investors reconsider the necessity of extensive capital expenditure for AI development. Shares of Nvidia, for example, slid 10% in premarket trading on Monday on the news of DeepSeek's popularity.

Article Link: Budget AI Model DeepSeek Overtakes ChatGPT on App Store
Makes me wonder about all the billions companies are hoping to get from government $$.
 
You have tech bros like Sam Altman driving around in a Koenigsegg pretending to innovate, gatekeeping, and creating a bubble. Someone's gonna figure it out. That day was today - actually back in December, when this was first released.
Sam Altman and OpenAI actually did it first. So they did innovate. If there's no ChatGPT, there's no DeepSeek. Same as if there were no iPhone, there would be no Android as we know it today. What Chinese tech did was copy already existing technology, while taking advantage of cheaper resources and likely government subsidies.
 
Released last week, the iOS app has garnered attention for its ability to match or exceed the performance of leading AI models like ChatGPT, while requiring only a fraction of the development costs, based on a research paper released on Monday.

What's particularly notable is that DeepSeek apparently achieved this breakthrough despite US export restrictions on advanced AI chips to China. The company's success suggests Chinese developers have found ways to create more efficient AI models with limited computing resources, potentially challenging the assumption that cutting-edge AI development requires massive computing infrastructure investments.
This can't be right. I was told China doesn't "have access to modern AI" and that "every single AI offering they have spit out in the last five years have been fake" because "China doesn't have access to advance AI nodes."
 