Hugging Face Blog, Mar 31, 2026, 12:00 AM

Hugging Face Releases TRL v1.0: An Open-Source Library Built for Post-Training, a New Milestone in API Stability and Efficient Alignment

Original: TRL v1.0: Post-Training Library Built to Move with the Field

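TRL, Hugging Face's popular Transformer reinforcement learning library, has officially reached v1.0. This release establishes a stable API design and focuses the library's positioning on the post-training ecosystem. TRL v1.0 integrates mainstream alignment techniques including supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO), which surged in popularity thanks to DeepSeek, with the aim of giving developers a standardized tool that can keep pace with the rapidly changing AI field.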

Hugging Face has officially announced the release of TRL (Transformer Reinforcement Learning) v1.0. This is a major milestone, marking TRL's transformation from an experimental research tool into a production-ready, API-stable core open-source library for post-training.
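
For readers unfamiliar with the "group relative" part of GRPO named in the summary, here is a minimal sketch of the idea: several completions are sampled for the same prompt, and each completion's reward is scored relative to the rest of its group. This is an illustration only, not TRL's implementation. The function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for this example; real implementations may normalize differently.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own group (hypothetical helper).

    `rewards` holds the rewards of several completions sampled for the
    same prompt. Each one is centered on the group mean and scaled by the
    group's population standard deviation; `eps` avoids division by zero
    when every completion received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt: two judged correct (1.0), two not (0.0).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average get a positive advantage and the others a negative one, so no separately trained value model is needed to provide a baseline.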

open-source trl transformers #post-training #rlhf #dpo #grpo #fine-tuning #alignment

Summaries are AI-generated; the original article is authoritative.