Latest in AI

Showing: rlhf, Students


Nathan Lambert's Latest Updates: ATOM Report, Post-Training Course, New Book, and Ongoing AI Research ★ 70
Interconnects (Nathan L.) · 104 days ago · Release
Nathan Lambert, a prominent AI expert, former Alignment Scientist at Hugging Face, and founder of the popular newsletter Interconnects, recently wrote about…
Open-R1: Hugging Face Launches a Fully Open-Source Reproduction of DeepSeek-R1 ★ 90
Hugging Face Blog · 546 days ago · Release
Project Background: Recreating the Open-Source Miracle of DeepSeek-R1. The emergence of DeepSeek-R1 sent shockwaves through the global AI community…
Illustrating Reinforcement Learning from Human Feedback (RLHF): The Key Alignment Technique Behind ChatGPT ★ 85
Hugging Face Blog · 1,327 days ago · Tutorial
The release of ChatGPT in late 2022 triggered an explosion in generative AI, and the most critical technology behind it is Reinforcement Learning from Human…
Proximal Policy Optimization (PPO) Explained in Plain Terms: Hugging Face Deep Reinforcement Learning Tutorial ★ 70
Hugging Face Blog · 1,453 days ago · Tutorial
Proximal Policy Optimization (PPO) is a deep reinforcement learning (DRL) algorithm proposed by OpenAI in 2017. Due to its ease of implementation, training…
Hugging Face Deep Reinforcement Learning Tutorial: An Introduction to Q-Learning Basics (Part 1)
Hugging Face Blog · 1,532 days ago · Tutorial
This classic tutorial from Hugging Face is the first part of its "Deep Reinforcement Learning Course," designed to give readers a solid foundation in…
Hugging Face Deep Reinforcement Learning (Deep RL): Introductory Guide and Core Concepts Explained ★ 75
Hugging Face Blog · 1,546 days ago · Tutorial
This article is the introductory first chapter of the official Hugging Face "Deep Reinforcement Learning Course." With the widespread adoption of RLHF…