Latest in AI

Showing:reasoningDevelopersClear ×

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

讓大型模型展開辯論：首屆多語言 LLM 辯論賽★ 75
Hugging Face Blog615 days agoRelease
This article from the Hugging Face blog introduces "The First Multilingual LLM Debate Competition." As large language models (LLMs) have rapidly advanced…
NuminaMath 如何贏得首屆 AIMO 進步獎（AI 數學奧林匹亞）並宣佈完整開源★ 80
Hugging Face Blog747 days agoRelease
### Background and Achievement The AI Mathematical Olympiad (AIMO) Progress Prize aims to advance AI models capable of solving Olympiad-level mathematical…
Hugging Face 推出 Open Chain of Thought (CoT) 排行榜：專注評估開源模型的推理與思考鏈能力★ 75
Hugging Face Blog826 days agoRelease
Hugging Face has announced the launch of the new "Open Chain of Thought (CoT) Leaderboard," a public platform specifically designed to evaluate and compare the…
Hugging Face 推出 ConTextual 排行榜：評估多模態模型在富含文本場景中的圖文聯合推理能力★ 75
Hugging Face Blog875 days agoRelease
Hugging Face has announced the launch of a new multimodal benchmark and leaderboard called "ConTextual," aimed at addressing the shortcomings of existing…
Hugging Face 推出 NPHardEval 排行榜：透過計算複雜度與動態更新揭示大型語言模型的推理能力★ 75
Hugging Face Blog907 days agoRelease
Hugging Face has announced the launch of the new **NPHardEval** leaderboard — a benchmark specifically designed to evaluate the reasoning capabilities of large…

← PreviousPage 3