Latest in AI

Exploring 2-bit QAT: Can Ultra-Compressed Large Models Outperform 4-bit Models Half Their Size?
r/LocalLLaMA top day · 50 days ago · Commentary
A popular Reddit thread on r/LocalLLaMA discusses the potential of 2-bit Quantization-Aware Training (QAT) for large MoE models (120B to 400B parameters). Current QAT efforts focus on 4-bit, so users speculate whether a 2-bit QAT model could fit into consumer hardware (64GB/128GB RAM) and still outperform a 4-bit model half its size. The approach is proposed as a practical alternative to training ternary (1.58-bit) LLMs from scratch.
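A rough back-of-the-envelope check of the memory argument in the thread, as a minimal Python sketch. It assumes weight storage is simply parameters × bits / 8 and ignores quantization scales, KV cache, activations, and runtime overhead. The helper name `weight_gb` is hypothetical and does not come from the thread.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits.
    Scales, zero-points, KV cache, and activations are ignored.
    """
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Model sizes mentioned in the thread summary (120B and 400B parameters).
    for params in (120, 400):
        print(f"{params}B @ 2-bit: ~{weight_gb(params, 2):.0f} GB of weights")

    # A 2-bit model and a 4-bit model with half the parameters
    # occupy the same weight memory under this simplification.
    print(weight_gb(240, 2) == weight_gb(120, 4))
```

Under these simplified assumptions, a 120B model at 2 bits needs about 30 GB for weights and a 400B model about 100 GB, which is why the 64GB and 128GB tiers come up. Real footprints would be somewhat higher once quantization metadata and runtime buffers are included.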
Thinking Machines Launches Native Interaction Model TML-Interaction-Small 276B-A12B: Breaking Real-Time Voice SOTA and Retiring Traditional VAD ★ 85
Latent Space · 77 days ago · Release
According to AINews, the AI research team Thinking Machines (affectionately nicknamed "Team Thinky" by the community) has recently unveiled a new native…
Architectural Choices in China's Open-Source AI Ecosystem: The Next Step Beyond DeepSeek ★ 85
Hugging Face Blog · 182 days ago · Commentary
This blog post from Hugging Face reviews the full year of technical evolution since the "DeepSeek Moment" at the start of 2025 — the release of DeepSeek-V3 and…
One Year After the "DeepSeek Moment": A Look Back at the Paradigm Shift and Transformation in Open-Source AI ★ 85
Hugging Face Blog · 189 days ago · Commentary
The DeepSeek-V3 and R1 models released in January 2025 have been hailed as the "DeepSeek Moment" in the AI world. This upheaval not only shattered the myth…
Welcome to the Falcon 3 Open-Source Model Family! TII Introduces New Lightweight and MoE Model Architectures ★ 80
Hugging Face Blog · 588 days ago · Release
The Technology Innovation Institute (TII) of Abu Dhabi has officially launched the new Falcon 3 open-source model family on Hugging Face. This marks a major…
2023: The Breakout Year for Open-Source Large Language Models (Open LLMs) ★ 75
Hugging Face Blog · 953 days ago · Commentary
Looking back on 2023, the most notable trend in the AI landscape was the explosive growth of open-source large language models (Open LLMs). In this annual…
Welcome Mixtral: A State-of-the-Art Open-Source Mixture of Experts (MoE) Model Arrives on Hugging Face ★ 90
Hugging Face Blog · 960 days ago · Release
French AI startup Mistral AI officially released its highly anticipated open-source Mixture of Experts (MoE) model — Mixtral 8x7B. The model caused a sensation…