Latest in AI

Banning Open Source AI Would Be A Mistake
Interconnects (Nathan L.) · 38 days ago · Opinion
In a collaborative op-ed written for a broad, non-technical readership, Interconnects author Nathan Lambert and Kevin Xu of Interconnected argue that banning open-source AI would be a policy error. The piece enters an active regulatory debate over whether unrestricted release of AI model weights poses unacceptable risks. By targeting a general audience, the authors seek to shape public opinion before legislative momentum solidifies.
Kaiming He's All-Undergrad Team Achieves Text-to-Image With Only 258M Parameters
量子位 QbitAI · 39 days ago · Paper
A new research paper from Kaiming He's lab — notable for having an all-undergraduate team — demonstrates that high-quality text-to-image generation can be achieved with just 258 million parameters. This challenges the prevailing assumption that competitive image synthesis requires multi-billion-parameter models. The work signals a push toward leaner, more accessible generative vision architectures.
GLM-5.2 Passes Vibe Check; Z.ai Forecasts Open Fable by December
Latent Space · 39 days ago · Benchmark
Zhipu AI's GLM-5.2 has passed broad informal community vibe checks, drawing favorable comparisons to GPT-class models and signaling a meaningful quality leap for open-weights AI. Z.ai, the company behind GLM, is additionally forecasting release of an open frontier-tier model — dubbed Open Fable — by December 2026. Together, these developments suggest open models are genuinely competing at the frontier rather than perpetually trailing closed proprietary systems.
The Professor of Outputmaxxing — Anjney Midha, AMP
Latent Space · 39 days ago · Business
Latent Space interviews Anjney Midha, a prominent AI investor who has led funding rounds at Anthropic, Mistral, Black Forest Labs, and Periodic Labs. Midha shares his personal journey from humble beginnings in Singapore to becoming a key figure in AI venture capital. The conversation also surfaces what the podcast bills as "the AMP secret master plan," offering a rare look at the thesis behind his current venture.
Leanstral: Open-Source Foundation for Trustworthy Vibe-Coding
Mistral AI News · 40 days ago · Paper
Mistral AI has introduced Leanstral, an open-source research project aimed at bringing formal trustworthiness to vibe-coding — the increasingly popular practice of generating software through natural-language AI prompts with minimal manual oversight. The initiative frames itself as a foundational layer, suggesting it is designed to underpin other tools or workflows rather than serve as a standalone end-user product. By releasing it as open-source, Mistral directly addresses one of vibe-coding's sharpest criticisms: that speed and accessibility come at the cost of correctness and verifiability.
Engineering: Heaps Do Lie — Debugging a Memory Leak in vLLM
Mistral AI News · 40 days ago · Tutorial
Mathis Felardos, a Mistral AI engineer, shares a technical deep-dive into tracking down a memory leak in vLLM, the widely adopted open-source LLM inference server. The investigation exposed a core frustration in systems debugging: heap profiling tools can actively mislead engineers rather than illuminate the true source of memory growth. The post offers practical engineering insight for teams operating LLM serving infrastructure in production.
France Advances Europe's AI Future With NVIDIA Technologies
NVIDIA Blog · 40 days ago · Business
A year after France unveiled its national AI ambitions at NVIDIA GTC Paris during VivaTech, the infrastructure is moving from blueprint to reality. AI factories, national compute capacity, open frontier models, and industrial platforms are coming online. AI agents are now running in production, and French startups are actively deploying applications across the ecosystem.
Beyond LoRA: Can You Beat the Most Popular Fine-Tuning Technique?
Hugging Face Blog · 40 days ago · Benchmark
Hugging Face's PEFT team benchmarks alternatives to LoRA — the dominant parameter-efficient fine-tuning method — asking whether newer techniques can match or surpass it in practice. The post evaluates candidates such as DoRA, LoRA+, AdaLoRA, and IA³ across task performance, memory footprint, and training speed within the unified PEFT library framework. Rather than declaring a single winner, the piece delivers a practical guide for choosing the right technique based on model size, task type, and resource constraints.
Is It Agentic Enough? Benchmarking Open Models on Your Own Tooling
Hugging Face Blog · 40 days ago · Benchmark
Hugging Face published a guide examining whether open-weight models are sufficiently capable for agentic workflows when tested against custom tooling rather than standardized benchmarks. The piece challenges practitioners to move beyond generic leaderboard scores and assess agent performance in the context of their own use cases. It positions open models as viable candidates for production agentic pipelines, provided evaluation is grounded in realistic tool-use scenarios.
GLM-5.2 Takes the Top Spot Among Text-Only Open-Weights LLMs ★ 72
Simon Willison's Weblog · 40 days ago · Release
Z.ai has released GLM-5.2, a 753B-parameter MIT-licensed open-weights model with a 1-million-token context window. Independent benchmark site Artificial Analysis ranks it first among open-weights models on their Intelligence Index v4.1, ahead of MiniMax-M3, DeepSeek V4 Pro, and Kimi K2.6. It also places second on Code Arena's WebDev leaderboard behind only Claude Fable 5, despite being text-only, and is available on OpenRouter at $1.40/$4.40 per million input/output tokens.
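To make the quoted OpenRouter pricing concrete, here is a minimal sketch that estimates the cost of a single request. Only the two per-million-token rates come from the summary above; the function name, constant names, and the example token counts are hypothetical and chosen purely for illustration.

```python
# Rates quoted in the summary above (USD per million tokens).
INPUT_USD_PER_MTOK = 1.40
OUTPUT_USD_PER_MTOK = 4.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Illustrative request: 200,000 input tokens and 20,000 output tokens.
cost = estimate_cost_usd(200_000, 20_000)
print(f"${cost:.3f}")  # prints $0.368
```

Under these assumptions, a long-context request that uses a fifth of the 1-million-token window costs well under a dollar. Actual billing may differ by provider and over time.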
MolmoMotion: Language-Guided 3D Motion Forecasting
Hugging Face Blog · 40 days ago · Paper
Allen Institute for AI has released MolmoMotion, a new model that adds language-guided 3D motion forecasting to the open-source Molmo family. By conditioning spatial trajectory predictions on natural language, the system enables more flexible, human-interpretable motion anticipation. The work targets applications in robotics, video understanding, and embodied AI where predicting movement in 3D space is safety-critical or operationally essential.
From Hugging Face Hub to Robot Hardware with Strands Agents and LeRobot
Hugging Face Blog · 41 days ago · Tutorial
A Hugging Face blog post co-authored with Amazon demonstrates how to take AI models from the Hugging Face Hub all the way to running on physical robots. The integration combines Amazon's open-source Strands Agents agentic framework with Hugging Face's LeRobot robotics library to create an end-to-end pipeline. The result is a practical path for developers to deploy Hub-trained policies and models onto real robot hardware using agent-based orchestration.
Zhipu's Open-Source GLM-5.2 Claims Top AI Coding Rank, Second Only to Fable-5
量子位 QbitAI · 41 days ago · Benchmark
Zhipu AI has released GLM-5.2, an open-source large language model that has claimed the top position in AI coding benchmarks among all models except Anthropic's Fable-5. The result marks a significant milestone for the open-source community, showing that the gap between proprietary frontier models and open-source alternatives in code generation continues to shrink. For developers seeking capable, self-hostable coding models, GLM-5.2 now represents the strongest open-source option available.
GLM-5.2: Built for Long-Horizon Tasks
Hugging Face Blog · 41 days ago · Release
Zhipu AI has published GLM-5.2 on Hugging Face, framing the release around strong performance on long-horizon tasks — problems requiring sustained reasoning and planning across many dependent steps. The model continues the GLM lineage, one of China's most prominent open-source large-language-model families. By centering the announcement on long-horizon capability, Zhipu AI signals a strategic shift toward agentic and autonomous AI workflows rather than single-turn benchmark performance.
GLM-5.2: World's Top Open Frontend Coding Model + IndexShare Speculative Decoding
Latent Space · 41 days ago · Release
GLM-5.2 has claimed the leading position worldwide among open models on frontend coding benchmarks, marking a significant milestone for the open-source AI ecosystem. The release is accompanied by IndexShare, a new method targeting speculative decoding to improve inference throughput and reduce serving latency. Together, the two developments advance both capability and deployment efficiency for teams building with open models.
Agentic Resource Discovery: Let Agents Search
Hugging Face Blog · 41 days ago · New Tool
Hugging Face has introduced Agentic Resource Discovery, a capability enabling AI agents to search for and retrieve models, datasets, and other resources from the Hub dynamically. The feature targets a core friction point in agentic pipeline design, where resources are typically hardcoded by developers ahead of time. By enabling runtime resource lookup, it pushes Hugging Face Hub from a static asset store toward an active participant in agent architectures.
Ask HN: Has Anyone Replaced Claude/GPT with a Local Model for Daily Coding?
Hacker News (AI keywords) · 42 days ago · Commentary
A Hacker News community thread poses the question of whether developers have successfully migrated their daily coding workflows away from commercial frontier models like Claude and GPT to locally-run alternatives. The post invites practitioners to share real-world experience with self-hosted or locally deployed language models as coding assistants. It surfaces a growing tension between cost, privacy, and latency offered by local models versus the raw capability of cloud-hosted frontier systems.
Noiz AI, HKUST & Tsinghua Open-Source Audio Generation Model: 4 Steps, 0.24s on One GPU
量子位 QbitAI · 43 days ago · Paper
Noiz AI has partnered with Hong Kong University of Science and Technology (HKUST) and Tsinghua University to open-source a large audio generation model. The model's standout claim is efficiency: just four sampling steps to produce audio, with inference completing in 0.24 seconds on a single GPU. The open-source release brings research-grade, low-latency audio synthesis within reach of developers and researchers globally.
Rio de Janeiro's 'Homegrown' LLM Appears to Be a Merge of an Existing Model
Hacker News (AI keywords) · 43 days ago · Incident
Rio de Janeiro's publicly promoted 'homegrown' large language model has come under scrutiny after investigators identified it as a merge of an already-existing model rather than an original creation. The finding, surfaced via a GitHub issue, raises questions about transparency in government-backed AI initiatives. If confirmed, the case highlights broader risks of model provenance misrepresentation as institutions race to claim local AI credentials.
The $1,500-Trained HRM Model Backed by HuggingFace CEO and Bengio's Team
量子位 QbitAI · 44 days ago · Paper
A newly surfaced HRM model trained at the strikingly low cost of $1,500 has gone viral in AI circles after drawing strong recommendations from HuggingFace CEO Clem Delangue and backing from a team affiliated with Turing Award laureate Yoshua Bengio. The story underscores a growing industry fascination with cost-efficient AI training. Its rapid spread signals that the community sees it as evidence that meaningful model development no longer requires million-dollar compute budgets.
AINews: Fable and Mythos Access Suspended Over Cybersecurity Risk ★ 76
Latent Space · 45 days ago · Incident
Anthropic’s Claude Fable 5 and Mythos 5 were abruptly suspended after a US export-control directive tied to a possible jailbreak and national cybersecurity risk. The roundup frames the event as a new “model sovereignty” warning for teams relying on closed frontier APIs. It also covers Kimi-K2.7-Code, MiniMax M3, DeepSWE replacing SWE-Bench Pro, agent-inference benchmarks, sandboxing, and Gemini-SQL2.
Open Source AI Must Win
Hacker News (AI keywords) · 45 days ago · Opinion
With no article body provided, the only supported reading is that this is an opinion piece advocating for open source AI. The title frames open source AI not merely as one option among many, but as something that “must win.” It likely targets readers interested in AI governance, developer ecosystems, model access, and competition, but no specific claims or evidence are available.
olmo-eval: An Evaluation Workbench for the Model Development Loop
Hugging Face Blog · 45 days ago · New Tool
The Hugging Face Blog post announces olmo-eval, described as an evaluation workbench for the model development loop. Based on the title alone, the project appears focused on helping teams evaluate models during iterative development rather than only after release. No article body was provided, so specific features, supported benchmarks, integrations, metrics, or usage details cannot be confirmed.
Open Reproduction of DeepSeek-R1
Hacker News (AI keywords) · 46 days ago · Release
The linked item is a GitHub project titled “Open Reproduction of DeepSeek-R1,” with no article body provided. From the title alone, it appears to be an effort to recreate or document DeepSeek-R1 in an open manner. The main relevance is for researchers and ML engineers interested in reproducible reasoning-model training, evaluation, and open-source alternatives.
Silia: A Tiny Transformer Architecture for Sub-10M Parameter Models
r/LocalLLaMA top day · 47 days ago · Paper
A student from India shared their first paper on r/LocalLLaMA, proposing Silia, a Transformer architecture for extremely small models. The idea is to merge attention-style dynamic mixing with SwiGLU-like nonlinear transformation, aiming to save parameters in models under roughly 10M parameters. The author frames the work as an early, small-scale exploration, limited by old hardware and restricted access to larger compute.
NVIDIA Releases NVFP4-Quantized DiffusionGemma 26B A4B IT on Hugging Face
r/LocalLLaMA top day · 47 days ago · Release
NVIDIA has released DiffusionGemma 26B A4B IT NVFP4 on Hugging Face, a quantized version of Google DeepMind's open-weights multimodal model. Built on a Mixture-of-Experts architecture with 25.2B total but only 3.8B active parameters, it generates text in parallel 256-token blocks using discrete diffusion, exceeding 1,100 tokens per second on H100 hardware. The model supports a 256K-token context, text/image/video inputs, native function calling, reasoning mode, and 35+ languages.
[AINews] Open Models, Model Labs vs Agent Labs, and the Untrainable ★ 72
Latent Space · 47 days ago · Commentary
This AINews issue uses Sarah Guo’s essay as a lens for current AI industry debates: where open models matter, how agent labs differ from model labs, and what cannot be trained away. It also recaps discourse around Anthropic Fable/Mythos, Fable 5’s capabilities, Google’s DiffusionGemma, and maturing agent infrastructure. The central takeaway is that durable value may lie in integration, customer translation, maintenance, and intent rather than model scores alone.
Offline CPU Voice Loop for Ollama and LM Studio Agents
r/LocalLLaMA top day · 47 days ago · New Tool
An r/LocalLLaMA post introduces an offline voice loop for talking to local models through Ollama, LM Studio, or vLLM. The stack uses Silero VAD, Parakeet TDT 0.6B v3 STT, and Supertonic TTS 3, all running on CPU so GPU memory stays available for the LLM. The author reports measured CPU-only benchmarks, agent integrations, cross-platform installers, and an MIT-licensed GitHub release.
AMD Highlights Unified Memory Architecture for Future AI Systems
r/LocalLLaMA top day · 47 days ago · Hardware
A Reddit post in r/LocalLLaMA links to coverage of AMD discussing unified memory architecture and its role in future product roadmaps. The post says AMD believes UMA could help shape next-generation architectures and notes Ryzen AI MAX 400 series systems, also referred to by the community as Gorgon Halo. It frames the topic as part of an ongoing LocalLLaMA discussion about whether unified-memory x86 systems could matter for local AI workloads.
qwen3.6-27b Users Report Repeated Tool Call Loops
r/LocalLLaMA top day · 47 days ago · Incident
A Reddit user on r/LocalLLaMA says qwen3.6-27b can fall into repeated tool-call loops during use. They report spending two days adjusting parameters such as temperature and top-k without resolving the issue. The post is a troubleshooting question rather than a confirmed bug report, asking whether other local model users have seen similar behavior.
