Ars Technica AI | Jun 10, 2026, 7:29 PM | Ryan Whitwam

Google DeepMind Releases DiffusionGemma: Open Source Model with 4x Local AI Execution Speed Improvement

Original: Google DeepMind releases DiffusionGemma, a model that runs local AI 4x faster

Google DeepMind launches DiffusionGemma, applying diffusion architecture to text generation for 4x faster local inference.

Google DeepMind has released DiffusionGemma, an open-source model that brings diffusion-based generation to text tasks. Unlike autoregressive LLMs, which generate one token at a time, diffusion models can produce many output tokens in parallel, which cuts latency. The result is reportedly a 4x speed improvement for local AI inference, which would make on-device deployment considerably more practical.

DiffusionGemma, released in June 2026, is the latest member of the Gemma open-source model family. Its core feature is applying the diffusion model architecture, already widely used in image generation, to text generation tasks.
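
To make the contrast between one-token-at-a-time decoding and parallel generation concrete, here is a toy sketch. It is not DiffusionGemma's algorithm and calls no real model: the function names, the "_" mask token, and the step count are all hypothetical, and each "model call" is simulated by copying from a known target sentence. It only counts how many sequential passes each decoding style would need. The counts are illustrative and are not meant to reproduce the reported 4x figure.

```python
import random

MASK = "_"  # hypothetical placeholder for a not-yet-generated position


def autoregressive_decode(target):
    """Simulate left-to-right decoding: one sequential call per token."""
    out, calls = [], 0
    for tok in target:
        out.append(tok)  # stand-in for one forward pass predicting the next token
        calls += 1
    return out, calls


def diffusion_style_decode(target, steps, seed=0):
    """Simulate parallel decoding: start fully masked, fill several positions per call."""
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    masked = list(range(len(target)))
    rng.shuffle(masked)  # positions are revealed in arbitrary order, not left to right
    per_step = -(-len(target) // steps)  # ceiling division
    calls = 0
    while masked:
        batch, masked = masked[:per_step], masked[per_step:]
        for i in batch:
            seq[i] = target[i]  # all positions in the batch come from the same call
        calls += 1
    return seq, calls


tokens = "the quick brown fox jumps over the lazy dog today".split()
ar_out, ar_calls = autoregressive_decode(tokens)
df_out, df_calls = diffusion_style_decode(tokens, steps=3)
print(f"autoregressive calls: {ar_calls}, diffusion-style calls: {df_calls}")
```

In this toy, both decoders recover the same 10-token sentence, but the sequential one needs 10 calls and the parallel one needs 3. A real diffusion pass may cost more than a single autoregressive step, so the actual speedup depends on the model and hardware.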

Summaries are AI-generated; the original article is authoritative.