You can ask it about the Tiananmen Square massacre and it will answer. This alone makes it superior to DeepSeek
I’d never ask DeepSeek any political questions about China or any other country. Likewise for ChatGPT. A good old-fashioned web search is still far better for some things, especially as it makes clear the different biases and interests involved, compared to using an AI tool which doesn’t disclose its sources.

It should always go without saying: do your own research and form your own opinions. I’m glad DeepSeek is at least raising awareness of how biased AI can be. Maybe people will generalize this to all AI tools, as they should.
 
The drive towards $0.5T in datacenter investment is a plan to be the first with the capacity to host super-intelligent AI agents at a scale that will replace a large share of the workforce globally. While an extremely bold bet, the lead it could give the US over China may be pivotal, as it is possible that this "datacenter full of geniuses" leads to further acceleration that keeps rivals behind.

If I were the EU, I would be terrified. But the EU seems clueless.

China seems to understand this and is taking steps to mitigate it. But there are limits: DeepSeek was only able to build a ~$1B datacenter. Export controls are hurting them. I fear all this will lead to a hot war over Taiwan.
Necessity is the mother of invention: the restrictions put on China accelerated the need for efficient models that use less compute. Who’s to say China isn’t working on building its own chips? Sam Altman was talking about $7T; now I just hear crickets. The race has not even begun; we are in the mid-nineties of the internet boom. The Google of AI has not yet entered. We are still in the Netscape/AOL/Yahoo era of AI.
 
I don't think the case is so clear-cut for OpenAI. Their operating costs are immense; anyone using GPT costs them money. Only about 5% of users pay for subscriptions, and even the $200/month Pro plan is not profitable. The company runs an annual loss of $5 billion, and it was on the verge of bankruptcy in late 2024. Big user numbers can impress investors for a while, but eventually they will want to see them translate into paying customers and profits, and OpenAI does not seem to know how to get there.

The investors who saved OpenAI do not necessarily believe in a future for OpenAI: Nvidia wants to keep the AI hype going in order to sell GPUs, and OpenAI collapsing would damage that hype. Microsoft has effectively owned OpenAI for years, and OpenAI dying would make them look bad. Meanwhile, they struggle to get any profit from this investment, as users do not want to pay for the OpenAI-based Copilot shoehorned into Windows and Office.
I agree about businesses and their models being vague. Investors have created this paradigm, and it can work or not, but it can be very hard to predict. Amazon lost a lot of money. WeWork lost a lot of money. One of them is a lot more profitable now than the other :p
 
I don't agree. Any user of a platform makes it more valuable. If people are against a product enough to publicly call it out, unprompted, then I think they should boycott it to back up their words. They are providing value to the platform, even at a small scale, just as commercial entities are, just as personal users with paid subscriptions are, etc...

Basically, I think that if personal users are going to open their mouths and complain without anyone asking for their opinion (an internet forum), they need to put their money and actions where their open mouths are and not use these services.

One personal user boycotting these systems might not make massive waves towards the "fair" deal you want. However, many "one" users might, and it certainly can't hurt. Also, everyone "doing their part" certainly doesn't detract from the conversation either. It's not a zero-sum equation; everything can contribute and help, maybe even in a 1+1+1=5 type scenario?

But I also hope you are enjoying your weekend! :)
I get your point, but the way I see it, AI is so ubiquitous now that you’d have to point the finger at basically everyone. I don’t use AI, because I have my own company and get to make that choice, but my wife has to use it for work; it is already a required tool for a lot of people, and most creative job listings mention AI use now. The time for effective boycotts has passed, and the only hope now is to pressure lawmakers to pass regulation and update copyright laws to explicitly define AI training as commercial use.

I mean I wish boycotting would work, but I don’t see it happening anymore. On the other hand, personal users rarely pay anything, so they also cost the AI company server time. I see that as a silver lining.

But anyway, I’m not telling you what to do and I like your optimism. I hope you prove me wrong!
 
DeepSeek isn't open source either. The binary blob which contains the trained model is free (of charge), but that's about it. The source code for training it isn't available.
you don't understand what open source is.


Definition of Open Source: open source software is software that can be freely used, modified, and shared. The key here is the availability and freedom to use the software itself, not necessarily the obligation to release the tools or processes used to create it.

DeepSeek-R1 is released under the MIT License, which aligns with OSI's definition of open source. This license allows users to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software.
 
I don't oppose open source, but I don't think open source would work for these kinds of highly expensive projects. They take a lot of money to run and a lot of man-hours to develop. Even if they let you run the software locally, someone still needs some way of funding the developers to continue development. Open source has its limitations, especially for these kinds of highly expensive projects.

If they can continue to develop the software without a known source of funding, and their servers are still running for free, something is really wrong.

I mean... when was the last time you worked in your profession for free for a prolonged period, nine hours a day, five days a week, with that job as your only source of income?
 
So, I've been playing with o3-mini. Check out what it can do:



Also, here's what is coming in the next month or so


Automated research with citations, etc.

Stuff that would take a human researcher hours can be done in minutes.
 
you don't understand what open source is.


Definition of Open Source: open source software is software that can be freely used, modified, and shared. The key here is the availability and freedom to use the software itself, not necessarily the obligation to release the tools or processes used to create it.

DeepSeek-R1 is released under the MIT License, which aligns with OSI's definition of open source. This license allows users to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software.

Model weights != source. The confusion originates from the fact that models are not "programmed".

In a practical sense, open-weight models like R1, which also have their inference code open sourced, are open for anyone to use and build on.

Even if the training algorithm is not open source, that might not matter as much as the research paper explaining how the training is done. That research can be used by others to build their own models. In any case, training a model needs a lot of compute and data, neither of which is available to the general public.
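
To make that concrete, here is a minimal sketch in Python of what "open weights plus open inference code" means in practice, assuming the Hugging Face transformers library and the public DeepSeek-R1 distill repo named below (the weights are a large download, so treat this as illustrative):

# Minimal sketch: run an open-weight model with independently maintained,
# open-source inference code (assumes: pip install transformers torch accelerate).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # public repo of open weights
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Explain the difference between open weights and open source."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Note what is NOT here: no training code, no training data, no way to
# regenerate the weights from scratch. The blob is usable, not reproducible.

What you can still do is fine-tune, distill, or merge from this starting point, which is the kind of modification the MIT license permits.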
 
you don't understand what open source is.


Definition of Open Source: open source software is software that can be freely used, modified, and shared. The key here is the availability and freedom to use the software itself, not necessarily the obligation to release the tools or processes used to create it.

DeepSeek-R1 is released under the MIT License, which aligns with OSI's definition of open source. This license allows users to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software.
Except you can't modify anything, can you? You have a binary blob which represents the trained model. It's like releasing a compiled .exe under an MIT license. You still can't change it in any way or see its source.
 
So, I've been playing with o3-mini. Check out what it can do:



Also, here's what is coming in the next month or so


That’s fine, but I don’t watch YouTube videos if the YouTuber has ridiculous thumbnails.
Automated research with citations, etc.

All language models produce hallucinations and fake citations that need to be triple checked. That issue isn’t going away any time soon.

Stuff that would take a human researcher hours can be done in minutes.

Just because something is done fast doesn’t mean it’s done well. That’s like assuming that fast fashion is the same as high-quality clothing, or that fast junk food is the same as healthy homemade-style cooking.

High-quality anything requires time and skill. Sure, machines can shave off some of that time, but the more machinery you use, the more quality is lost. At that point you are getting highly processed synthetic slop.

A mentally healthy person says ‘researcher’ and not ‘human researcher’. We do not refer to our own species in the third person. AI guys tend to believe they are extraterrestrials or above everyone else, so they refer to working people as ‘those pesky humans’.
 
Except you can't modify anything, can you? You have a binary blob which represents the trained model. It's like releasing a compiled .exe under an MIT license. You still can't change it in any way or see its source.

They can do distills and merges. They don’t have all the code and datasets used to make DeepSeek V3, R1, or R1-Zero.

All the local DeepSeek models that people are running with less than 96GB of VRAM are not even DeepSeek; they are Qwen or Llama distills. The architecture reported inside Ollama will tell them it is not DeepSeek. People read some bad reporting or silly internet comments and assume they have a ChatGPT-level model they can now run on their home computer. They can’t.
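
One way to check this for yourself is to read the declared architecture from the model config, without downloading any weights. A sketch, assuming the transformers library and the public Hugging Face repo names; the values in the comments are what the configs declared at the time of writing, so verify them yourself:

# Sketch: inspect the declared architecture of the "R1" models.
# AutoConfig fetches only the small config.json, not the weights.
from transformers import AutoConfig

distill = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
print(distill.model_type)  # prints "qwen2": a Qwen model tuned on R1 outputs

full = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-R1", trust_remote_code=True)
print(full.model_type)     # prints "deepseek_v3": the actual 671B MoE model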
 
I’d never ask DeepSeek any political questions about China or any other country. Likewise for ChatGPT. A good old-fashioned web search is still far better for some things

Web searches should always be the primary way to look up information online, with Wikipedia as the top result.

However, if all these companies are pushing their language models as information look-up tools, they should not have any political, social, or historical censorship in them at all. That’s a slippery slope, because some companies have the ultimate goal of the ‘AI’ replacing the search engine.
 
All language models produce hallucinations and fake citations that need to be triple checked. That issue isn’t going away any time soon.

So do humans.

Just because something is done fast doesn’t mean it’s done well.

Various PhD scientists confirm that the results they're getting out of Deep Research (which are fully cited, so you can validate them) are extremely good.

The point of Deep Research is that it isn't "AI-generated slop". The language model (which understands language) can process web search information much more quickly than a human. Your sources are still going to be the internet, but what would take a human hours to do is processed in minutes.

This is not the AI generating its own content from its internal model. This is the AI pulling fully cited public sources together and summarising.
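
In other words, the pattern is retrieval plus summarisation, not free-form generation. Here is a minimal sketch of that pattern, assuming the openai Python client and a hypothetical list of already-fetched sources; this is not OpenAI's actual Deep Research pipeline, just the shape of the idea:

# Sketch of retrieve-then-summarise with forced citations. NOT OpenAI's
# Deep Research implementation; the sources are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

sources = [  # as if returned by a prior web-search step
    {"id": 1, "url": "https://example.org/a", "text": "fetched page text"},
    {"id": 2, "url": "https://example.org/b", "text": "fetched page text"},
]
context = "\n\n".join(f"[{s['id']}] {s['url']}\n{s['text']}" for s in sources)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": (
            "Summarise ONLY from the numbered sources below. Cite every "
            "claim as [n]. If the sources do not cover it, say so.")},
        {"role": "user", "content": context},
    ],
)
print(response.choices[0].message.content)

Because every claim carries an [n] pointing at a listed URL, the output can be checked against its sources, which is what separates this from the model answering purely from memory.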

You can stick your head in the sand and be left behind, or you can keep up.

Good luck.
 
So do humans.


You mean ‘so do people’?

When a language model fabricates URLs, it is because these models are like slot machines that produce random filler content, or what are called ‘hallucinations’.

Professional researchers don’t insert fake URLs into scientific papers and professional documents. If you do that it’s on you. Don’t drag the vast majority of people into your personal lack of standards and skills.
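
Fabricated URLs are also cheap to catch mechanically. A first-pass sketch, assuming the requests library and hypothetical example URLs; a resolving link only proves the page exists, not that it supports the claim:

# Sketch: flag cited URLs that do not resolve. A live response is only a
# first-pass filter; a real check would also verify titles and content.
import requests

citations = [  # hypothetical URLs for illustration
    "https://example.org/real-paper",
    "https://example.org/hallucinated-42",
]

for url in citations:
    try:
        status = requests.head(url, allow_redirects=True, timeout=10).status_code
    except requests.RequestException:
        status = None
    ok = status is not None and status < 400
    print(f"{'OK ' if ok else 'BAD'} {url} (status={status})")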

You can stick your head in the sand and be left behind, or you can keep up.

Good luck.

There you go, making the cult-like assumption that someone who says something that seems negative about AI must be an unbeliever and ignorant.

Go look in the mirror and see the person you have become thanks to chatbots.
 
If that's what you think.... lol.

I'd much rather my science be pulled from peer reviewed journals, etc.

What do you think those links at the bottom of a Wikipedia page are?

Ask a chatbot if you haven’t got the ability to answer that yourself, since you seem to believe chatbots are so much better than ‘humans’.
 
you’ve clearly not got a clue about what deep research does.

Sure. That’s a great way to bottle it and run when you’ve been caught looking like you don’t know the subject you act like you’re an expert on.

Next time you compare a ‘human’ to a chatbot, please tell us: when a ‘human’ makes a mistake, does this happen?
ERROR DUMP:

llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_model_load_from_file: failed to load model
17:14:52-135613 ERROR    Failed to load the model.
Traceback (most recent call last):
  File "C:\AI\text-generation-webui-main\modules\ui_model_menu.py", line 214, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\text-generation-webui-main\modules\models.py", line 90, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\text-generation-webui-main\modules\models.py", line 280, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\text-generation-webui-main\modules\llamacpp_model.py", line 111, in from_pretrained
    result.model = Llama(**params)
                   ^^^^^^^^^^^^^^^
  File "C:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\llama_cpp_cuda_tensorcores\llama.py", line 369, in __init__
    internals.LlamaModel(
  File "C:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\llama_cpp_cuda_tensorcores\_internals.py", line 56, in __init__
    raise ValueError(f"Failed to load model from file: {path_model}")
ValueError: Failed to load model from file: models\Deepseek-R1-Qwen-32b-Q5_K_M_GGUF\DeepSeek-R1-Distill-Qwen-32B-Q5_K_M.gguf

Exception ignored in: <function LlamaCppModel.__del__ at 0x000002363D489120>
Traceback (most recent call last):
  File "C:\AI\text-generation-webui-main\modules\llamacpp_model.py", line 62, in __del__
    del self.model
        ^^^^^^^^^^
AttributeError: 'LlamaCppModel' object has no attribute 'model'
 
Web searches should always be the primary way to look up information online, with Wikipedia as the top result.

However, if all these companies are pushing their language models as information look-up tools, they should not have any political, social, or historical censorship in them at all. That’s a slippery slope, because some companies have the ultimate goal of the ‘AI’ replacing the search engine.
There’s going to be bias, and this is completely unavoidable, because the data sources the AI is trained on contain bias. Disclosing the sources would be a good step for these AI companies, but they don’t want to admit how much data they stole, so they’ll never do that.
 