Thu, 30 Jan 2025 11:31:35 GMT
Dado Ruvic | Reuters

Chinese artificial intelligence firm DeepSeek rocked markets this week with claims that its new AI model outperforms OpenAI's and was built at a fraction of the cost. In particular, DeepSeek's claim that its large language model cost just $5.6 million to train has raised concerns over the vast sums tech giants are currently spending on the computing infrastructure required to train and run advanced AI workloads. Not everyone is convinced by DeepSeek's claims, however. CNBC asked industry experts for their views on DeepSeek, and how it actually compares to OpenAI, creator of ChatGPT, the viral chatbot that sparked the AI revolution.
**What is DeepSeek?**
Last week, DeepSeek released R1, its new reasoning model that rivals OpenAI's o1. A reasoning model is a large language model that breaks a prompt into smaller pieces and considers multiple approaches before generating a response, with the aim of processing complex problems in a way similar to humans. DeepSeek was founded in 2023 by Liang Wenfeng, co-founder of AI-focused quantitative hedge fund High-Flyer. The company focuses on research into large language models and is pursuing artificial general intelligence, or AGI, a concept that loosely refers to the idea of AI matching or surpassing human intelligence across a broad range of tasks.
Much of the technology behind R1 isn't new. What is notable, however, is that DeepSeek is the first to deploy it in a high-performing AI model, reportedly with a significant reduction in power requirements. "The takeaway is that there are many possibilities to develop this industry. The high-end chip/capital-intensive way is one technological approach," said Xiaomeng Lu, director of Eurasia Group's geo-technology practice. "But DeepSeek proves we are still in the early stage of AI development, and the path established by OpenAI may not be the only route to highly capable AI."
**How is DeepSeek different from OpenAI?**
Two of DeepSeek's systems have captured the AI community's attention: V3, the large language model that underpins its products, and R1, its reasoning model. Both models are open source, meaning their underlying code is free and publicly available for other developers to customize and redistribute. DeepSeek's models are much smaller than many other large language models. V3 has a total of 671 billion parameters, the variables a model learns during training. And while OpenAI doesn't disclose parameter counts, experts estimate its latest models have at least a trillion.
In terms of performance, DeepSeek says its R1 model achieves performance comparable to OpenAI's o1 on reasoning tasks, citing benchmarks including AIME 2024, Codeforces, GPQA Diamond, MATH-500, MMLU and SWE-bench Verified. In a technical report, the company said its V3 model cost just $5.6 million to train, far less than the billions that well-known Western AI labs such as OpenAI and Anthropic have spent to train and run their foundational AI models. It isn't yet clear how much DeepSeek costs to run, however. If the training figure is accurate, it implies the model was developed at a fraction of the cost of rival models from OpenAI, Anthropic, Google and others.
Daniel Newman, CEO of tech insight firm The Futurum Group, said these developments suggest a "major breakthrough," although he expressed some skepticism about the exact figures. "I believe DeepSeek's breakthrough points to an important inflection point for scaling laws and is real demand," he said. "That said, there are still many questions and uncertainties around the full picture of costs as it pertains to the development of DeepSeek."

Meanwhile, Paul Triolo, senior VP for China and technology policy lead at advisory firm DGA Group, noted it was difficult to draw a direct comparison between DeepSeek's model cost and that of major U.S. developers. "The 5.6 million figure for DeepSeek V3 was just for one training run, and the company stressed that this did not represent the overall cost of R&D to develop the model," he said. "The overall cost then was likely significantly higher, but still lower than the amount spent by major US AI companies."

DeepSeek wasn't immediately available for comment when contacted by CNBC.

**Comparing DeepSeek, OpenAI on price**

DeepSeek and OpenAI both disclose pricing for their models' computations on their websites. DeepSeek says R1 costs 55 cents per 1 million tokens of input, "tokens" referring to each individual unit of text processed by the model, and $2.19 per 1 million tokens of output. In comparison, OpenAI's pricing page for o1 shows the firm charges $15 per 1 million input tokens and $60 per 1 million output tokens. For GPT-4o mini, OpenAI's smaller, low-cost language model, the firm charges 15 cents per 1 million input tokens.

**Skepticism over chips**

DeepSeek's reveal of R1 has already led to heated public debate over the veracity of its claim, not least because its models were built despite export controls from the U.S. restricting the use of advanced AI chips to China.
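To put the per-token prices above in perspective, the arithmetic can be sketched in a few lines of Python. The workload sizes below are hypothetical, chosen only to illustrate how the published rates compound:

```python
def api_cost(input_tokens: int, output_tokens: int,
             in_price_per_m: float, out_price_per_m: float) -> float:
    """Return the USD cost of a workload given prices per 1 million tokens."""
    return (input_tokens / 1_000_000) * in_price_per_m \
         + (output_tokens / 1_000_000) * out_price_per_m

# Published prices, USD per 1M tokens (input, output), as quoted in the article.
DEEPSEEK_R1 = (0.55, 2.19)
OPENAI_O1 = (15.00, 60.00)

# Hypothetical workload: 2M input tokens, 1M output tokens.
r1_cost = api_cost(2_000_000, 1_000_000, *DEEPSEEK_R1)  # 2*0.55 + 1*2.19 = 3.29
o1_cost = api_cost(2_000_000, 1_000_000, *OPENAI_O1)    # 2*15 + 1*60 = 90.00
print(f"R1: ${r1_cost:.2f}, o1: ${o1_cost:.2f}")
```

On this sample workload, the published rates imply a cost gap of well over an order of magnitude, which is the comparison underlying the article's pricing discussion.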
DeepSeek claims it had its breakthrough using mature Nvidia chips, including H800 and A100 chips, which are less advanced than the chipmaker's cutting-edge H100s, which can't be exported to China. However, in comments to CNBC last week, Scale AI CEO Alexandr Wang said he believed DeepSeek used the banned chips, a claim that DeepSeek denies. Nvidia has since come out and said that the GPUs DeepSeek used were fully export-compliant.

**The real deal or not?**

Industry experts seem to broadly agree that what DeepSeek has achieved is impressive, although some have urged skepticism over some of the Chinese company's claims. "DeepSeek is legitimately impressive, but the level of hysteria is an indictment of so many," U.S. entrepreneur Palmer Luckey, who founded Oculus and Anduril, wrote on X. "The $5M number is bogus. It is pushed by a Chinese hedge fund to slow investment in American AI startups, service their own shorts against American titans like Nvidia, and hide sanction evasion."

Seena Rejal, chief commercial officer of NetMind, a London-headquartered startup that offers access to DeepSeek's AI models via a distributed GPU network, said he saw no reason not to believe DeepSeek. "Even if it's off by a certain factor, it still is coming in as greatly efficient," Rejal told CNBC in a phone interview earlier this week.
"The logic of what they've explained is very sensible."

However, some have claimed DeepSeek's technology might not have been built from scratch. "DeepSeek makes the same mistakes o1 makes, a strong indication the technology was ripped off," billionaire investor Vinod Khosla said on X, without giving more details. It's a claim that OpenAI itself has alluded to, telling CNBC in a statement Wednesday that it is reviewing reports DeepSeek may have "inappropriately" used output data from its models to develop its own AI model, a method referred to as "distillation." "We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here," an OpenAI spokesperson told CNBC.

**Commoditization of AI**

However the scrutiny surrounding DeepSeek shakes out, AI scientists broadly agree it marks a positive step for the industry. Yann LeCun, chief AI scientist at Meta, said DeepSeek's success represented a victory for open-source AI models, not necessarily a win for China. In the U.S., Meta has released a popular open-source AI model of its own, called Llama. "To people who see the performance of DeepSeek and think: 'China is surpassing the US in AI,' you are reading this wrong. The correct reading is: 'Open source models are surpassing proprietary ones,'" he wrote in a post on LinkedIn.
"DeepSeek has profited from open research and open source (e.g. PyTorch and Llama from Meta). They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source."
– CNBC's Katrina Bishop and Hayden Field contributed to this report.
Original article: https://www.cnbc.com/2025/01/30/chinas-deepseek-has-some-big-ai-claims-not-all-experts-are-convinced-.html