Latest in AI

Showing: throughput · Developers


Hands-On Test of Xiaomi’s Fastest 1T Model: 1,000+ Tokens/s and 7s Vibe Coding
量子位 QbitAI · 47 days ago · Benchmark
QbitAI’s title describes a hands-on evaluation of Xiaomi’s fastest 1T large model. The headline claim is throughput above 1,000 tokens per second. The title also frames the model around coding productivity, stating that a Vibe Coding task was delivered in seven seconds. No article body is available to verify the methodology, task scope, model name, pricing, or benchmark conditions.
Unlocking Asynchronous Mechanisms in Continuous Batching ★ 75
Hugging Face Blog · 75 days ago · Release
As the demand for deploying large language models (LLMs) in production environments surges, how to improve inference efficiency and reduce costs has become a…
Keep the Tokens Flowing: Lessons from 16 Open-Source Reinforcement Learning (RL) Libraries ★ 85
Hugging Face Blog · 140 days ago · Commentary
With the success of reasoning models such as DeepSeek-R1, reinforcement learning (RL/RLHF) has become a critical technique for improving the alignment and…
Benchmarking Text Generation Inference (TGI): How to Quantify and Optimize LLM Inference Performance ★ 75
Hugging Face Blog · 790 days ago · Tutorial
This official Hugging Face blog post takes an in-depth look at how to benchmark Text Generation Inference (TGI), Hugging Face's open-source LLM inference and…