Chinese startup DeepSeek has launched its latest AI models, claiming they match or exceed the capabilities of leading U.S. models at a significantly lower cost. This development threatens to disrupt the global technology landscape.
The company gained attention in the AI community after stating in a paper last month that training its DeepSeek-V3 model required less than $6 million in computing power using Nvidia H800 chips.
DeepSeek’s AI Assistant, powered by DeepSeek-V3, has surpassed ChatGPT to become the top-rated free application on Apple’s App Store in the United States.
This success has raised questions about the rationale behind the large AI investments pledged by some U.S. tech companies, and shares of several major tech firms, including Nvidia, have fallen as a result.
DeepSeek is causing a stir for several reasons.
What’s behind the buzz surrounding DeepSeek?
The launch of OpenAI’s ChatGPT in late 2022 prompted a rush among Chinese tech companies to develop their own AI chatbots.
However, the initial offerings, such as Baidu’s chatbot, disappointed many in China, highlighting a perceived gap in AI capabilities between U.S. and Chinese firms.
DeepSeek’s models have changed this narrative. The startup claims that its DeepSeek-V3 and DeepSeek-R1 models are comparable to the most advanced offerings from OpenAI and Meta.
DeepSeek also says its models are cheaper to use. DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI’s latest model, depending on the task, according to a post on DeepSeek’s official WeChat account.
Despite this, some industry figures have expressed skepticism about DeepSeek’s achievements.
Scale AI CEO Alexandr Wang claimed in a CNBC interview, without providing evidence, that DeepSeek possesses 50,000 Nvidia H100 chips. He suggested the company would not disclose this because holding the chips would run afoul of U.S. export controls that prohibit such advanced AI chips from being sold to Chinese companies. DeepSeek has not yet responded to this allegation.
Bernstein analysts noted that the total training costs for DeepSeek’s V3 model remain unknown and are likely much higher than the $5.58 million the startup reported for computing power. They also mentioned that the training costs for the R1 model have not been disclosed.
Who is behind DeepSeek?
DeepSeek is based in Hangzhou and is controlled by Liang Wenfeng, who co-founded the quantitative hedge fund High-Flyer.
In March 2023, Liang’s fund announced on its WeChat account that it was “starting again,” shifting focus from trading to establishing a “new and independent research group” aimed at exploring the essence of Artificial General Intelligence (AGI). DeepSeek was founded later that year.
OpenAI defines AGI as autonomous systems that can outperform humans in most economically valuable tasks.
It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer shares an office building with DeepSeek and holds patents related to chip clusters used for training AI models, according to Chinese corporate records.
High-Flyer’s AI unit stated on its WeChat account in July 2022 that it owns and operates a cluster of 10,000 Nvidia A100 chips.
What is Beijing’s perspective on DeepSeek?
Beijing has taken notice of DeepSeek’s success. On January 20, the same day DeepSeek-R1 was released, founder Liang attended a closed-door symposium for business leaders and experts hosted by Chinese Premier Li Qiang, according to state news agency Xinhua.
Liang’s participation may indicate that DeepSeek’s achievements align with Beijing’s goal of overcoming U.S. export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year included Baidu CEO Robin Li.