Latest in AI

Showing: llm-as-a-judge, Researchers

Hugging Face and Atla Launch "Judge Arena": A New Benchmark for Evaluating LLMs as Judges ★ 80
Hugging Face Blog · 616 days ago · Release
As large language models (LLMs) have rapidly advanced, traditional static benchmarks (such as MMLU) have increasingly faced saturation and gaming problems. As…
Expert Support Case Study: Strengthening Digital Green's RAG Agricultural Q&A Application with LLM-as-a-Judge Evaluation ★ 75
Hugging Face Blog · 638 days ago · Tutorial
This case study provides a detailed account of how non-profit organization Digital Green, with support from Hugging Face's Expert Support team, optimized its…
LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning? ★ 75
Hugging Face Blog · 733 days ago · Paper
Background and Challenges: Document Visual Question Answering (DocVQA) is an important application of multimodal AI, requiring models to simultaneously…
How to Build a Dedicated Argilla 2.0 Chatbot with distilabel ★ 75
Hugging Face Blog · 742 days ago · Tutorial
In the AI field, quickly building a chatbot that can accurately answer questions about a specific domain or newly released software has always been a major…
Can Foundation Models Label Data Like Humans? Hugging Face Examines the Feasibility of AI Labeling and RLHF ★ 75
Hugging Face Blog · 1,142 days ago · Commentary
In the development of large language models (LLMs), RLHF (Reinforcement Learning from Human Feedback) is the critical step for aligning models with human…