Latest in AI

GLM-5.2 Takes the Top Spot Among Text-Only Open-Weights LLMs ★ 72
Simon Willison's Weblog · 40 days ago · Release
Z.ai has released GLM-5.2, a 753B-parameter MIT-licensed open-weights model with a 1-million-token context window. Independent benchmark site Artificial Analysis ranks it first among open-weights models on its Intelligence Index v4.1, ahead of MiniMax-M3, DeepSeek V4 Pro, and Kimi K2.6. It also places second on Code Arena's WebDev leaderboard, behind only Claude Fable 5, despite being text-only, and is available on OpenRouter at $1.40/$4.40 per million input/output tokens.
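For a rough sense of what the quoted OpenRouter pricing means per request, here is a minimal sketch. It assumes the two list prices above apply flat to every token; the function name and the example workload are hypothetical, and real billing may differ (caching, provider routing, and so on).

```python
# Prices quoted in the summary above (USD per million tokens on OpenRouter).
INPUT_PRICE_PER_M = 1.40
OUTPUT_PRICE_PER_M = 4.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the quoted list prices.

    Assumption: flat per-token pricing with no discounts or extra fees.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 200k-token prompt with a 4k-token reply.
    print(f"${estimate_cost_usd(200_000, 4_000):.4f}")
```

Under these assumptions, input tokens dominate the bill for long-context prompts even though output tokens cost roughly three times as much each.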
GLM-5.2 Claims Top Open-Weights Spot on Artificial Analysis Intelligence Index
Hacker News (AI keywords) · 41 days ago · Benchmark
GLM-5.2, the latest open-weights model from Zhipu AI, has claimed the top position on the Artificial Analysis Intelligence Index among all openly available models. This marks a notable shift in the open-weights leaderboard, which tracks quality, speed, and price across dozens of frontier and community models. The result signals continued momentum from Chinese AI labs producing competitive open-weights alternatives to proprietary frontier systems.
Avataar AI Launches Low-Cost Varya Video Model for India
TechCrunch AI · 46 days ago · Release
Avataar AI has launched Varya, a video generation model built from Alibaba’s open Wan 2.2 model and distilled for faster, cheaper output. The company says Varya can generate 5-second 720p clips on an NVIDIA H200 in 45 seconds, versus 1,230 seconds for Wan 2.2. Avataar plans to release the model and training data through India’s AI Kosh portal while offering hosted access at about $0.005 per second.
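The figures in this summary can be turned into a speedup factor and a per-clip price with simple arithmetic. The sketch below uses only the numbers quoted above; it assumes the hosted price is charged per second of generated video, which the summary does not state explicitly, and the function names are illustrative.

```python
# Figures quoted in the summary above (5-second 720p clip on an NVIDIA H200).
VARYA_SECONDS = 45
WAN_2_2_SECONDS = 1_230
CLIP_SECONDS = 5
PRICE_PER_SECOND_USD = 0.005  # "about $0.005 per second" (assumed: per second of output video)


def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new model is than the baseline."""
    return baseline_seconds / new_seconds


def clip_cost_usd(clip_seconds: float, price_per_second: float) -> float:
    """Return the hosted cost of one clip under per-output-second pricing."""
    return clip_seconds * price_per_second


if __name__ == "__main__":
    print(f"speedup: {speedup(WAN_2_2_SECONDS, VARYA_SECONDS):.1f}x")
    print(f"cost per clip: ${clip_cost_usd(CLIP_SECONDS, PRICE_PER_SECOND_USD):.3f}")
```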
NVIDIA Releases NVFP4-Quantized DiffusionGemma 26B A4B IT on Hugging Face
r/LocalLLaMA top day · 47 days ago · Release
NVIDIA has released DiffusionGemma 26B A4B IT NVFP4 on Hugging Face, a quantized version of Google DeepMind's open-weights multimodal model. Built on a Mixture-of-Experts architecture with 25.2B total but only 3.8B active parameters, it generates text in parallel 256-token blocks using discrete diffusion, exceeding 1,100 tokens per second on H100 hardware. The model supports a 256K-token context, text/image/video inputs, native function calling, reasoning mode, and 35+ languages.
DiffusionGemma: Google Launches High-Speed Open-Weight Gemma Diffusion Model ★ 76
Simon Willison's Weblog · 47 days ago · Release
Simon Willison highlights Google’s new DiffusionGemma, an Apache 2 licensed open-weight Gemma model. He connects it to last year’s brief Gemini Diffusion preview, which he measured at 857 tokens per second. NVIDIA is currently hosting the model for free on its NIM cloud API, where Willison generated 2,409 tokens in 4.4 seconds, implying at least 500 tokens per second.
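The "at least 500 tokens per second" figure follows from dividing the token count by the elapsed time. A minimal sketch of that arithmetic is below; it assumes the 4.4 seconds is end-to-end wall-clock time, in which case the result is a lower bound on raw generation speed because network and queueing overhead are included. The function name is illustrative.

```python
def tokens_per_second(tokens: int, elapsed_seconds: float) -> float:
    """Return average throughput over the whole request."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tokens / elapsed_seconds


if __name__ == "__main__":
    # Figures from the summary above: 2,409 tokens in 4.4 seconds.
    rate = tokens_per_second(2_409, 4.4)
    print(f"{rate:.1f} tokens/s")
```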
DiffusionGemma: 4x Faster Text Generation
r/LocalLLaMA top day · 47 days ago · Release
Google has announced DiffusionGemma, a text-generation model that applies diffusion-based techniques to the Gemma architecture, claiming speeds four times faster than standard autoregressive generation. Unlike conventional language models that predict tokens one at a time, diffusion-based methods generate text through iterative denoising, enabling parallel output. The release, published on Google's official blog, drew immediate attention from the local-LLM community for its potential inference-efficiency gains.
MooreThreads Releases MusaCoder-27B Code LLM on Hugging Face
r/LocalLLaMA top day · 48 days ago · Release
MooreThreads, a Chinese GPU semiconductor company best known for its MUSA compute platform, has released MusaCoder-27B on Hugging Face alongside a technical paper on arXiv. The 27B-parameter model is positioned as a code-generation LLM, extending MooreThreads' ambitions beyond hardware into the AI model layer. Its public availability on Hugging Face signals an open-weights approach, making it accessible to local-inference practitioners and researchers evaluating alternatives to Western-origin coding models.
Anthropic Is Accused of Nerfing Fable for Other LLM Development
r/LocalLLaMA top day · 48 days ago · Commentary
An r/LocalLLaMA post claims Anthropic may be intentionally limiting Fable when users ask it to help build other LLMs. The source is a short Reddit post with screenshot context, not a formal benchmark or verified disclosure. Discussion centers on trust in hosted closed models, unclear safety boundaries, and why local or open-weight LLMs may be necessary for serious AI development work.
Releasing Apodex-1.0 Smol Models (0.8B, 2B, 4B Open-Weights) Optimized for Agentic Verification + AgentHarness Evals
r/LocalLLaMA top day · 48 days ago · Release
Apodex 1.0 launches with open-weight models at 0.8B, 2B, and 4B, trained not for general generation but for specialized sub-agent roles—fact-checking external claims and verifying tool call outputs before passing results to a main controller. The design targets long-horizon agent workflows where routing small tasks to lightweight models avoids wasteful use of 70B+ models at every step. AgentHarness, an open-source evaluation framework for local multi-step agent pipelines, is released alongside the weights.
Omi Med STT v1: Open-Weight Medical ASR Fine-Tuned from Parakeet 0.6B ★ 72
r/LocalLLaMA top day · 49 days ago · Release
Omi Health’s founder says he fine-tuned NVIDIA Parakeet TDT 0.6B v2 for clinical speech and released Omi Med STT v1 under CC-BY-4.0. The runtime supports Mac, Windows, and Linux, auto-selecting MLX, NeMo, or GGUF/parakeet.cpp backends. In the author’s held-out medical benchmark, it reports 2.37% medical-WER and 145× realtime on local A10 compute.
Was BitNet a dead end? What happened to ternary LLMs?
r/LocalLLaMA top day · 49 days ago · Commentary
An r/LocalLLaMA user questions whether BitNet and ternary LLMs were a dead end after earlier promise around efficient low-bit models. The post notes that the largest ternary model appears to remain around 2B parameters. It asks why frontier open-weight AI labs are not visibly pursuing the approach, but provides no technical evidence or definitive answer.
Cohere's Commitment to Open Science and Collaborative AI Research
Cohere Blog · 50 days ago · Commentary
Cohere's Open Science initiative, primarily driven by its non-profit research lab Cohere For AI (C4AI), focuses on democratizing AI research. By releasing open-weights models like Aya and fostering global research collaborations, Cohere aims to bridge the gap in multilingual AI representation. This approach highlights their commitment to community-driven, accessible AI development.
Magistral ★ 78
Mistral AI News · 50 days ago · Release
Mistral AI announced Magistral, its first reasoning model family, with Magistral Small as a 24B open-weight Apache 2.0 model and Magistral Medium for enterprise use. The company emphasizes traceable multilingual reasoning, professional-domain use cases, and faster reasoning in Le Chat through Think mode and Flash Answers. Magistral Small is available on Hugging Face, while Magistral Medium is available in Le Chat preview and via La Plateforme API.
Introducing Mistral 3 ★ 84
Mistral AI News · 50 days ago · Release
Mistral AI introduced Mistral 3, a new open model family under Apache 2.0. It includes Mistral Large 3, a 675B-parameter sparse MoE with 41B active parameters, plus Ministral 3 models at 3B, 8B, and 14B. The release targets frontier open-weight use, multimodal and multilingual workflows, enterprise customization, and efficient local or edge deployments.
Introducing Devstral 2 and Mistral Vibe CLI ★ 76
Mistral AI News · 50 days ago · New Tool
Mistral introduced Devstral 2, a 123B coding model, and Devstral Small 2, a 24B variant for lighter deployment. The company reports 72.2% and 68.0% on SWE-bench Verified, respectively, with permissive open-source licensing. It also launched Mistral Vibe CLI, an open-source terminal agent for codebase exploration, multi-file edits, command execution, and IDE integration.
Leanstral: Open-Source Foundation for Trustworthy Vibe-Coding ★ 76
Mistral AI News · 50 days ago · Release
Mistral AI introduced Leanstral, an open-source code agent designed for Lean 4 and formal proof engineering. The model is available through Apache 2.0 weights, Mistral Vibe, and a Labs API endpoint. Mistral positions it as a cost-efficient alternative for verified coding workflows, with FLTEval benchmarks comparing it against Claude family models and large open-source competitors.
Introducing Mistral Small 4 ★ 76
Mistral AI News · 50 days ago · Release
Mistral AI introduced Mistral Small 4 as the next major release in the Mistral Small family. It combines reasoning, multimodal, and agentic coding capabilities into one open model with configurable reasoning effort. The model uses a MoE architecture, supports a 256k context window and text-image inputs, and is available through Mistral API, AI Studio, Hugging Face, NVIDIA NIM, and common inference stacks.
Voxtral TTS: Open-Weights, Low-Latency Text-to-Speech from Mistral AI ★ 78
Mistral AI News · 50 days ago · Release
Mistral AI introduced Voxtral TTS, its first text-to-speech model, focused on realistic multilingual voice generation. The 4B-parameter model supports nine languages, quick voice adaptation from short references, and low-latency streaming for voice agents. Mistral says human evaluations show stronger naturalness than ElevenLabs Flash v2.5, with API access, Studio testing, Le Chat access, and open weights on Hugging Face.
Remote agents in Vibe, powered by Mistral Medium 3.5 ★ 78
Mistral AI News · 50 days ago · New Tool
Mistral Medium 3.5 is a 128B dense model in public preview, combining instruction-following, reasoning, and coding with a 256k context window. It becomes the default model for Le Chat and Mistral Vibe. Vibe now supports remote coding agents that run asynchronously in the cloud, while Le Chat adds Work mode for longer multi-step tasks across connected tools.
Introducing Mistral 3 ★ 78
Mistral AI News · 50 days ago · Release
Mistral AI introduced Mistral 3, a new open model family including Mistral Large 3 and Ministral 3 models at 3B, 8B, and 14B sizes. Large 3 is a 675B-parameter sparse MoE model with 41B active parameters, while Ministral 3 targets local and edge use cases. The models are released under Apache 2.0 and are available through Mistral AI Studio, Hugging Face, Amazon Bedrock, and other platforms.
Introducing Mistral Small 4 ★ 78
Mistral AI News · 50 days ago · Release
Mistral Small 4 is the next major release in the Mistral Small family, unifying Magistral-style reasoning, Pixtral-style multimodality, and Devstral-style coding agents. It uses a MoE architecture with 119B total parameters, 6B active parameters per token, a 256k context window, and configurable reasoning effort. The model is available via Mistral API, AI Studio, Hugging Face, open-source serving stacks, and NVIDIA deployment options.
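To put the "119B total, 6B active" figures in perspective, the share of parameters used per token can be computed directly. This is a small illustrative sketch using only the parameter counts quoted in this feed; the function name is hypothetical, and the ratio is a rough proxy for per-token compute rather than a measured speed or memory figure.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the fraction of a MoE model's parameters active per token."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active_params_b / total_params_b


if __name__ == "__main__":
    # Parameter counts (in billions) as quoted in the summaries in this feed.
    models = {
        "Mistral Small 4": (119, 6),
        "Mistral Large 3": (675, 41),
    }
    for name, (total_b, active_b) in models.items():
        print(f"{name}: {active_fraction(total_b, active_b):.1%} active per token")
```

All of the weights still have to be stored and loaded, so the active fraction says little about memory requirements.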
Remote agents in Vibe. Powered by Mistral Medium 3.5. ★ 76
Mistral AI News · 50 days ago · Release
Mistral Medium 3.5 is a 128B dense flagship model with a 256k context window, combining instruction-following, reasoning, and coding. It becomes the default model for Le Chat and Mistral Vibe, enabling cloud-based remote coding agents launched from the CLI or chat. The release also adds Le Chat Work mode for multi-step, cross-tool workflows with visible actions and approval gates for sensitive operations.
Magenta RealTime 2: An Open, Locally Runnable Real-Time Music Model ★ 74
Hacker News (AI keywords) · 53 days ago · Release
Magenta RealTime 2 is an open-weights live music model designed for interactive performance rather than offline prompt-to-song generation. It supports real-time control through MIDI, audio, and text, and can run as standalone apps, DAW plugins, or embedded music software. Google Magenta also released a Python library, C++ MLX inference engine, models, and example applications for musicians and developers.
Google Announces Gemma 4: A Frontier Multimodal Open Model Designed for On-Device Use ★ 85
Hugging Face Blog · 117 days ago · Release
Google and Hugging Face have jointly announced a new generation of open-weight models — "Gemma 4." This model represents a major breakthrough in on-device AI…
Latest Open Models Roundup (#19): Qwen 3.5, GLM 5, MiniMax 2.5, the Latest Frontier Breakthroughs from Chinese Labs ★ 82
Interconnects (Nathan L.) · 146 days ago · Release
This is Issue #19 of the "Latest Open Artifacts" column by well-known AI industry analyst Nathan Lambert, opening with "Welcome to the year of the horse!" It…
Google DeepMind Launches MedGemma: The Most Powerful Open-Source Multimodal Model for Medical AI Development ★ 82
Google DeepMind Blog · 275 days ago · Release
Google DeepMind officially announced the launch of a new "MedGemma" multimodal model within its open-source medical model series. This model represents the…
Google Launches Gemma 3n: A New Guide Built for the Developer Community ★ 70
Google DeepMind Blog · 275 days ago · Release
Google DeepMind officially launched Gemma 3n along with its developer guide. The Gemma series, as Google's open-weights model family, has long been a favorite…
IBM Granite 4.0 Models Now Officially Available on the Replicate Cloud Platform
Replicate Blog · 299 days ago · Release
Cloud AI model hosting platform Replicate has announced official support for IBM's latest Granite 4.0 model family. This means developers and enterprise users…
Introducing the Palmyra-mini Series: Powerful, Lightweight New Models with Reasoning Capabilities! ★ 72
Hugging Face Blog · 319 days ago · Release
Writer, a leading provider of enterprise AI solutions, has officially announced the launch of its new "Palmyra-mini" model series on the Hugging Face platform…
Google Announces Gemma 3n Preview: A Powerful, Efficient, Mobile-First On-Device Multimodal AI Model ★ 78
Google DeepMind Blog · 434 days ago · Release
Google DeepMind has officially released a preview of its new open model "Gemma 3n." This is a cutting-edge open model purpose-built for mobile devices and…

