DiffusionGemma: Google Launches High-Speed Open-Weight Gemma Diffusion Model
Simon Willison's Weblog · 14 hours ago · Release
Simon Willison highlights Google’s new DiffusionGemma, an open-weight Gemma model released under the Apache 2 license. He connects it to last year’s brief Gemini Diffusion preview, which he measured at 857 tokens per second. NVIDIA is currently hosting the model for free on its NIM cloud API, where Willison generated 2,409 tokens in 4.4 seconds, implying a throughput of at least 500 tokens per second (2,409 / 4.4 ≈ 547).
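
The throughput figure is simple division, and a minimal sketch makes the arithmetic easy to check. The function name `tokens_per_second` is hypothetical and used only for illustration. The inputs are the token count and wall-clock time quoted above, and the sketch assumes the 4.4 seconds covers the whole request.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return average throughput for a single generation run."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Figures quoted in the summary: 2,409 tokens generated in 4.4 seconds.
rate = tokens_per_second(2409, 4.4)
print(f"{rate:.1f} tokens/s")  # 547.5 tokens/s

# Compare against the 857 tokens/s measured for the Gemini Diffusion preview.
preview_rate = 857
print(f"{rate / preview_rate:.0%} of the preview's measured speed")
```

Because the measured time presumably includes more than pure generation, the result is better read as a lower bound, which is consistent with the "at least 500" phrasing.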