Latest in AI

Showing:grpoDevelopersClear ×

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

Hugging Face 發表 TRL v1.0：專為後訓練（Post-Training）打造的開源庫，邁向 API 穩定與高效對齊新里程碑★ 85
Hugging Face Blog119 days agoRelease
Hugging Face has officially announced the release of TRL (Transformer Reinforcement Learning) v1.0. This is a major milestone, marking TRL's transformation…
「DeepSeek 時刻」一週年：開源 AI 的典範轉移與變革回顧★ 85
Hugging Face Blog189 days agoCommentary
The DeepSeek-V3 and R1 models released in January 2025 have been hailed as the "DeepSeek Moment" in the AI world. This upheaval not only shattered the myth…
🐯 Liger GRPO 攜手 TRL：大幅降低 DeepSeek-R1 式強化學習訓練顯存與加速★ 82
Hugging Face Blog429 days agoNew Tool
Since the explosive rise of DeepSeek-R1, GRPO (Group Relative Policy Optimization) has become the most widely discussed reinforcement learning (RL) technique…
Hugging Face 發布 Open R1 第四次更新：開源推理模型訓練的最新進展與最佳化★ 85
Hugging Face Blog488 days agoRelease
Hugging Face's Open R1 project aims to fully open-source and replicate the training pipeline of DeepSeek-R1's reasoning model. In the latest fourth update…
Open R1 第三次更新：Hugging Face 釋出開源推理模型與 GRPO 訓練優化細節★ 85
Hugging Face Blog503 days agoRelease
Since its launch, Hugging Face's Open R1 project has been dedicated to replicating the reasoning capabilities of DeepSeek-R1 in a fully open-source manner. In…
Open R1 更新第二彈：Hugging Face 複製 DeepSeek-R1 的最新進展與強化學習實踐★ 85
Hugging Face Blog533 days agoRelease
Hugging Face has officially published the second technical update (Update #2) for the Open R1 project, which aims to replicate DeepSeek-R1's reasoning model…
Hugging Face 發布 Open-R1 首個更新：開源重現 DeepSeek-R1 的進展與挑戰★ 85
Hugging Face Blog541 days agoRelease
### Background and the Goals of the Open-R1 Project Since the release of DeepSeek-R1, its powerful reasoning capability and remarkably low training cost have…
Open-R1：Hugging Face 推出完全開源的 DeepSeek-R1 重現計劃★ 90
Hugging Face Blog546 days agoRelease
### Project Background: Recreating the Open-Source Miracle of DeepSeek-R1 The emergence of DeepSeek-R1 sent shockwaves through the global AI community…