Latest in AI

Showing: text-generation, Researchers

Google DeepMind Releases DiffusionGemma: Open Source Model with 4x Local AI Execution Speed Improvement
Ars Technica AI · 47 days ago · Release
Google DeepMind has released DiffusionGemma, an open-source model that brings diffusion-based generation to text tasks. Unlike autoregressive LLMs that generate one token at a time, diffusion models can produce outputs in parallel, dramatically cutting latency. The result is reportedly a 4x speed improvement for local AI inference, making on-device deployment significantly more practical.
NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI
NVIDIA Blog · 47 days ago · Release
Google DeepMind released DiffusionGemma, an experimental open model built for fast text generation. NVIDIA says it optimized the model for GeForce RTX GPUs, RTX PRO platforms, and DGX Spark systems. Instead of generating text one word at a time, DiffusionGemma produces multiple words in parallel to reduce latency for single-user workloads.
DiffusionGemma: 4x Faster Text Generation
r/LocalLLaMA top day · 47 days ago · Release
Google has announced DiffusionGemma, a text-generation model that applies diffusion-based techniques to the Gemma architecture, claiming speeds four times faster than standard autoregressive generation. Unlike conventional language models that predict tokens one at a time, diffusion-based methods generate text through iterative denoising, enabling parallel output. The release, published on Google's official blog, drew immediate attention from the local-LLM community for its potential inference-efficiency gains.
DiffusionGemma: The Developer Guide — Google Developers Blog
r/LocalLLaMA top day · 47 days ago · Tutorial
Google has released a comprehensive developer guide for DiffusionGemma, a text-generation model that uses masked diffusion rather than autoregressive next-token prediction. Unlike standard Gemma models, DiffusionGemma iteratively denoises a fully masked sequence to produce output, enabling a fundamentally different generation paradigm. The guide targets developers looking to integrate or experiment with diffusion-based LLMs using Google's tooling.
Toward Light-Speed Text Generation: NVIDIA Nemotron-Labs Introduces Diffusion Language Models ★ 75
Hugging Face Blog · 66 days ago · Release
Traditional large language models (such as GPT, Claude, and others) all use an "autoregressive" mechanism — that is, they must predict the next token based on…
Running a Text-Generation Pipeline on Intel® Gaudi® 2 AI Accelerators
Hugging Face Blog · 880 days ago · Release
With the explosive growth of large language models (LLMs), the demand for high-performance, cost-effective AI hardware has increased significantly. Intel Gaudi…
Generating Human-Level Text with Contrastive Search in Transformers 🤗 ★ 70
Hugging Face Blog · 1,358 days ago · Release
In the field of natural language generation (NLG), enabling language models to produce coherent and natural long-form text has long been a major challenge…
Faster Text Generation with TensorFlow and XLA
Hugging Face Blog · 1,462 days ago · Tutorial
This Hugging Face technical blog post takes an in-depth look at how to use TensorFlow's XLA (Accelerated Linear Algebra) compiler to dramatically speed up the…
Guiding Text Generation with Constrained Beam Search in 🤗 Transformers ★ 70
Hugging Face Blog · 1,600 days ago · Tutorial
In natural language generation (NLG) tasks, precisely controlling a model's output has always been a major challenge. Traditional decoding strategies like…
How to Generate Text: Using Different Decoding Methods for Language Generation with Transformers ★ 85
Hugging Face Blog · 2,340 days ago · Tutorial
This classic Hugging Face technical blog post takes an in-depth look at how to select and tune different "decoding methods" when performing…