Re: [News] OpenAI: We have evidence that DeepSeek misappropriated our models

Author: Lushen (wind joker!!!)   2025-01-30 08:59:21
OpenAI's Chief Research Officer, Mark Chen, posted a series of tweets in the
early hours of 2025/01/29 commenting on the DeepSeek R1 paper:
https://i.imgur.com/A73X07x.png
https://i.imgur.com/rjDczVH.png
Congrats to DeepSeek on producing an o1-level reasoning model! Their research
paper demonstrates that they’ve independently found some of the core ideas
that we did on our way to o1.
However, I think the external response has been somewhat overblown,
especially in narratives around cost. One implication of having two paradigms
(pre-training and reasoning) is that we can optimize for a capability over
two axes instead of one, which leads to lower costs.
But it also means we have two axes along which we can scale, and we intend to
push compute aggressively into both!
As research in distillation matures, we're also seeing that pushing on cost
and pushing on capabilities are increasingly decoupled. The ability to serve
at lower cost (especially at higher latency) doesn't imply the ability to
produce better capabilities.
We will continue to improve our ability to serve models at lower cost, but we
remain optimistic in our research roadmap, and will remain focused in
executing on it. We're excited to ship better models to you this quarter and
over the year!
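
A side note on the distillation Chen mentions: below is a minimal sketch of
knowledge distillation, assuming a PyTorch setup. This is the generic recipe
(Hinton et al., 2015), not OpenAI's or DeepSeek's actual method; the
temperature value and the teacher/student names are illustrative assumptions.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions with a temperature, then train the student
    # to match the teacher's soft targets via KL divergence.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradients keep comparable magnitude across temperatures.
    return F.kl_div(student_log_probs, soft_targets,
                    reduction="batchmean") * temperature ** 2

# Usage: the large teacher is frozen; only the small, cheap-to-serve student
# is updated, which is one reason serving cost and frontier capability can
# decouple, as the tweet argues.
# teacher_logits = teacher(input_ids).logits.detach()
# student_logits = student(input_ids).logits
# loss = distillation_loss(student_logits, teacher_logits)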
Author: tsubasawolfy (悠久の翼)   2025-01-30 09:06:00
GPT's mini series could be even cheaper, couldn't it?
Author: MyPetTankDie   2025-01-30 09:27:00
Greater efficiency is a good thing
Author: redbeanbread (尋找)   2025-01-30 09:28:00
They're panicking; there won't be enough 5090s to go around, right?
Author: dongdong0405 (聿水)   2025-01-30 09:47:00
There are Chinese people at OpenAI too, so China wins again
Author: herculus6502 (金麟豈是池中物)   2025-01-30 09:54:00
First, you have to be able to brew the liquor in the first place
