Why DeepSeek’s new AI model thinks it’s ChatGPT
Earlier this week, DeepSeek, a well-funded Chinese AI lab, released an “open” AI model that beats many rivals on popular benchmarks. The model, DeepSeek V3, is large but powerful, handling text-based tasks like coding and writing essays with ease.
It also seems to think it’s ChatGPT.
Posts on X — and TechCrunch’s own tests — show that DeepSeek V3 identifies itself as ChatGPT, OpenAI’s AI-powered chatbot platform. Asked to elaborate, DeepSeek V3 insists that it is a version of OpenAI’s GPT-4 model released in 2023.
This actually reproduces as of today. In 5 out of 8 generations, DeepSeekV3 claims to be ChatGPT (v4), while it claims to be DeepSeekV3 only 3 times.
Gives you a rough idea of some of their training data distribution. https://t.co/Zk1KUppBQM pic.twitter.com/ptIByn0lcv
– Lucas Beyer (bl16) (@giffmana) December 27, 2024
The delusion runs deep. If you ask DeepSeek V3 a question about DeepSeek’s API, it will give you instructions on how to use OpenAI’s API. DeepSeek V3 even tells some of the same jokes as GPT-4 — right down to the punchlines.
So what’s going on?
Models like ChatGPT and DeepSeek V3 are statistical systems. Trained on billions of examples, they learn patterns in those examples in order to make predictions — such as how “to whom” in an email typically precedes “it may concern.”
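To make that concrete, here is a minimal, purely illustrative sketch of the idea: a toy bigram counter (nothing like the transformer networks these labs actually train) that learns which word tends to follow which from example text.

```python
from collections import Counter, defaultdict

# Tiny corpus standing in for "billions of examples."
corpus = (
    "to whom it may concern "
    "to whom it may concern "
    "to whom this letter is addressed"
).split()

# Count which word follows which (a bigram model, the simplest
# kind of statistical language model).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent continuation seen in training."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("whom"))  # -> "it", because that is the dominant pattern
```

A real language model does the same thing at vastly larger scale, predicting the next token from everything it has seen before — including, potentially, other chatbots’ answers.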
DeepSeek hasn’t revealed much about the source of DeepSeek V3’s training data. But there’s no shortage of public datasets containing text generated by GPT-4 via ChatGPT. If DeepSeek V3 was trained on these, the model might have memorized some of GPT-4’s outputs and is now regurgitating them verbatim.
“It’s clear that the model is seeing raw ChatGPT responses at some point, but it’s not clear where that is,” Mike Cook, a research fellow at King’s College London who specializes in artificial intelligence, told TechCrunch. “It could be ‘accidental’… but unfortunately, we have seen cases where people have trained their models directly on the outputs of other models to try to leverage their knowledge.”
Cook noted that the practice of training models on outputs from rival AI systems can be “very bad” for model quality, because it can lead to hallucinations and misleading answers like those above. “Like taking a photocopy of a photocopy, we lose more and more information and connection to reality,” Cook said.
This may also be against the terms of service of those systems.
OpenAI’s terms of service prohibit users of its products, including ChatGPT customers, from using outputs to develop models that compete with OpenAI’s own.
OpenAI and DeepSeek didn’t immediately respond to requests for comment. But OpenAI CEO Sam Altman posted what appeared to be a dig at DeepSeek and other competitors on X on Friday.
“It is (relatively) easy to copy something that you know works,” Altman wrote. “It is extremely hard to do something new, risky, and difficult when you don’t know if it will work.”
DeepSeek V3 is far from the first model to misidentify itself. Google’s Gemini and others sometimes claim to be competing models. For example, when prompted in Mandarin, Gemini says that it’s Wenxinyiyan, the chatbot from Chinese company Baidu.
That’s because the web, where AI companies source the bulk of their training data, is becoming littered with AI slop. Content farms are using AI to create clickbait. Bots are flooding Reddit and X. By one estimate, 90% of the web could be AI-generated by 2026.
This “contamination,” so to speak, has made it quite difficult to thoroughly filter AI-generated output from training datasets.
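As a rough illustration of why, here is a sketch of the kind of crude heuristic filter one might write; the phrase list and sample documents are invented for the example, and real data pipelines rely on provenance metadata and trained classifiers rather than string matching.

```python
# A crude, illustrative filter: drop training documents that contain
# telltale chatbot boilerplate. Much AI-generated text carries no such
# tell, which is exactly why thorough filtering is so hard.
TELLTALE_PHRASES = [
    "as an ai language model",
    "i am chatgpt",
    "i was developed by openai",
]

def looks_ai_generated(document: str) -> bool:
    text = document.lower()
    return any(phrase in text for phrase in TELLTALE_PHRASES)

documents = [
    "As an AI language model, I cannot browse the internet.",
    "Recipe: combine flour, water, and yeast, then knead.",
]

clean = [doc for doc in documents if not looks_ai_generated(doc)]
print(clean)  # only the recipe survives
```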
To be clear, it’s certainly possible that DeepSeek trained DeepSeek V3 directly on ChatGPT-generated text. Google was once accused of doing the same, after all.
Heidy Khlaaf, chief AI scientist at the nonprofit AI Now Institute, said the cost savings from “distilling” an existing model’s knowledge can be attractive to developers, regardless of the risks.
“Even with internet data now brimming with AI outputs, other models that would accidentally train on ChatGPT or GPT-4 outputs would not necessarily demonstrate outputs reminiscent of OpenAI’s customized messages,” Khlaaf said. “If it is the case that DeepSeek carried out distillation partially using OpenAI models, it would not be surprising.”
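For context, distillation at its simplest means harvesting a stronger “teacher” model’s answers and using them as training data for a new model. The sketch below is a hypothetical illustration; query_teacher is a made-up stand-in for calls to an existing model’s API, not anything DeepSeek or OpenAI has published.

```python
import json

def query_teacher(prompt: str) -> str:
    """Hypothetical stand-in for calling an existing 'teacher' model's API."""
    return f"(teacher's answer to: {prompt})"

prompts = [
    "Explain what an API key is.",
    "Write a haiku about the ocean.",
]

# Distillation, at its simplest: collect the teacher's outputs and save
# prompt/response pairs as supervised fine-tuning data for a new model.
with open("distillation_data.jsonl", "w") as f:
    for prompt in prompts:
        record = {"prompt": prompt, "response": query_teacher(prompt)}
        f.write(json.dumps(record) + "\n")
```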
More likely, however, is that lots of ChatGPT/GPT-4 data made its way into the DeepSeek V3 training set. That means the model can’t be trusted to identify itself, for one. But more concerning is the possibility that DeepSeek V3, by uncritically absorbing and reproducing GPT-4’s outputs, could end up amplifying some of that model’s biases and flaws.