DeepSeek’s New AI Model Sparks Shock, Awe, and Questions From US Competitors

Jan 28, 2025 6:15 AM

Some worry the Chinese startup’s impressive tech indicates the US is losing its lead in AI, but it may really be a sign that a new approach to building models is gaining traction.

Photo-Illustration: Wired Staff; Lam Yik/Getty Images

A powerful new open-source artificial intelligence model created by Chinese startup DeepSeek has shaken Silicon Valley over the past few days. Packed with cutting-edge capabilities and developed on a seemingly tiny budget, DeepSeek’s R1 is prompting talk of an impending upheaval in the tech industry.

To some people, DeepSeek’s rise signals that the US has lost its edge in AI. But a number of experts, including executives at companies that build and customize some of the world’s most powerful frontier AI models, say it's a sign of a different kind of technological transition underway.

Instead of trying to create larger and larger models that require increasingly exorbitant amounts of computing resources, AI companies are now focusing more on developing advanced capabilities, like reasoning. That has created an opening for smaller, innovative startups such as DeepSeek that haven’t received billions of dollars in outside investment. “It’s a paradigm shift towards reasoning, and that will be much more democratized,” says Ali Ghodsi, CEO of Databricks, a company that specializes in building and hosting custom AI models.

“It’s been clear for some time now that innovating and creating greater efficiencies—rather than just throwing unlimited compute at the problem—will spur the next round of technology breakthroughs,” says Nick Frosst, a cofounder of Cohere, a startup that builds frontier AI models. “This is a clarifying moment when people are realizing what's long been obvious.”

Thousands of developers and AI enthusiasts flocked to DeepSeek’s website and its official app in recent days to try out the company’s latest model and shared examples of its sophisticated capabilities on social media. Shares in US tech firms, including the chipmaker Nvidia, fell in response on Monday as investors began to question the vast sums being poured into AI development.

DeepSeek’s technology was developed by a relatively small research lab in China that sprang out of one of the country’s best-performing quantitative hedge funds. A research paper posted online last December claims that its earlier DeepSeek-V3 large language model cost only $5.6 million to build, a fraction of the amount its competitors needed for similar projects. OpenAI has previously said that some of its models cost upwards of $100 million each. The latest models from OpenAI as well as Google, Anthropic, and Meta likely cost considerably more.

The performance and efficiency of DeepSeek’s models have already prompted talk of cost cutting at some big tech firms. One engineer at Meta, who asked not to be named because they were not authorized to speak publicly, says the tech giant will most likely try to examine DeepSeek’s techniques to find ways to reduce its own expenditure on AI. “We believe open source models are driving a significant shift in the industry, and that’s going to bring the benefits of AI to everyone faster,” a spokesperson for Meta said in a statement. “We want the US to continue to be the leader in open source AI, not China, which is why Meta is developing open source AI with our Llama models which have been downloaded over 800 million times.”

The true price of developing DeepSeek’s new models remains unknown, however, since one figure quoted in a single research paper may not capture the full picture of its costs. “I don't believe it's $6 million, but even if it's $60 million, it's a game changer,” says Umesh Padval, managing director of Thomvest Ventures, a company that has invested in Cohere and other AI firms. “It will put pressure on the profitability of companies which are focused on consumer AI.”

Shortly after DeepSeek revealed the details of its latest model, Ghodsi of Databricks says customers began asking whether they could use it as well as DeepSeek’s underlying techniques to cut costs at their own organizations. He adds that one approach employed by DeepSeek’s engineers, known as distillation, which involves using the output from one large language model to train another model, is relatively cheap and straightforward.
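To make the idea of distillation concrete, here is a toy sketch in Python. It is not DeepSeek's code or method: the "teacher" is a hypothetical fixed function standing in for a large model, the "student" is a two-parameter logistic model, and all names and numbers are illustrative assumptions. Distilling a large language model typically means training on text the teacher generates, but the principle is the same: the student learns from the teacher's outputs rather than from hand-labeled data.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def teacher(x: float) -> float:
    """Hypothetical stand-in for a large model: returns P(label = 1 | x)."""
    return sigmoid(3.0 * x - 1.0)


def student(x: float, w: float, b: float) -> float:
    """Small model with two trainable parameters (w, b)."""
    return sigmoid(w * x + b)


def train_student(inputs, soft_labels, epochs=2000, lr=0.5):
    """Fit the student to the teacher's probabilities.

    Uses plain gradient descent on the cross-entropy between the
    teacher's output (soft label) and the student's prediction.
    """
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in zip(inputs, soft_labels):
            error = student(x, w, b) - target
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
# Unlabeled inputs: no human labels are needed, only teacher outputs.
inputs = [random.uniform(-2.0, 2.0) for _ in range(200)]
soft_labels = [teacher(x) for x in inputs]

w, b = train_student(inputs, soft_labels)

# Compare the student with the teacher on inputs it was not trained on.
for x in (-1.5, 0.0, 0.5, 1.5):
    print(f"x={x:+.1f}  teacher={teacher(x):.3f}  student={student(x, w, b):.3f}")
```

The only training signal here is the teacher's output, which is why the approach is cheap once a capable teacher already exists.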

Padval says that the existence of models like DeepSeek's will ultimately benefit companies looking to spend less on AI, but he says that many firms may have reservations about relying on a Chinese model for sensitive tasks. So far, at least one prominent AI firm, Perplexity, has publicly announced it's using DeepSeek’s R1 model, but it says it is being hosted “completely independent of China.”

Amjad Masad, the CEO of Replit, a startup that provides AI coding tools, told WIRED that he thinks DeepSeek’s latest models are impressive. While he still finds that Anthropic’s Sonnet model is better at many computer engineering tasks, he has found that R1 is especially good at turning text commands into code that can be executed on a computer. “We’re exploring using it especially for agent reasoning,” he adds.

DeepSeek’s latest two offerings—DeepSeek R1 and DeepSeek R1-Zero—are capable of the same kind of simulated reasoning as the most advanced systems from OpenAI and Google. They all work by breaking problems into constituent parts in order to tackle them more effectively, a process that requires a considerable amount of additional training to ensure that the AI reliably reaches the correct answer.

A paper posted by DeepSeek researchers last week outlines the approach the company used to create its R1 models, which it claims perform on some benchmarks about as well as OpenAI’s groundbreaking reasoning model known as o1. The tactics DeepSeek used include a more automated method for learning how to problem-solve correctly as well as a strategy for transferring skills from larger models to smaller ones.

One of the hottest topics of speculation about DeepSeek is the hardware it might have used. The question is especially noteworthy because the US government has introduced a series of export controls and other trade restrictions over the last few years aimed at limiting China’s ability to acquire and manufacture cutting-edge chips that are needed for building advanced AI.

In a research paper from August 2024, DeepSeek indicated that it has access to a cluster of 10,000 Nvidia A100 chips, which were placed under US restrictions announced in October 2022. In a separate paper from June of that year, DeepSeek stated that an earlier model it created called DeepSeek-V2 was developed using clusters of Nvidia H800 computer chips, a less capable component developed by Nvidia to comply with US export controls.

A source at one AI company that trains large AI models, who asked to be anonymous to protect their professional relationships, estimates that DeepSeek likely used around 50,000 Nvidia chips to build its technology.

Nvidia declined to comment directly on which of its chips DeepSeek may have relied on. “DeepSeek is an excellent AI advancement,” a spokesman for Nvidia said in a statement, adding that the startup's reasoning approach “requires significant numbers of Nvidia GPUs and high-performance networking.”

However DeepSeek’s models were built, they appear to show that a less closed approach to developing AI is gaining momentum. In December, Clem Delangue, the CEO of HuggingFace, a platform that hosts artificial intelligence models, predicted that a Chinese company would take the lead in AI because of the speed of innovation happening in open source models, which China has largely embraced. “This went faster than I thought,” he says.


Will Knight is a senior writer for WIRED, covering artificial intelligence. He writes the AI Lab newsletter, a weekly dispatch from beyond the cutting edge of AI. He was previously a senior editor at MIT Technology Review, where he wrote about fundamental advances in AI and China’s AI …

*****
Credit belongs to : www.wired.com
