Latest in AI

Kaiming He's All-Undergrad Team Achieves Text-to-Image With Only 258M Parameters
量子位 QbitAI · 39 days ago · Paper
A new research paper from Kaiming He's lab — notable for having an all-undergraduate team — demonstrates that high-quality text-to-image generation can be achieved with just 258 million parameters. This challenges the prevailing assumption that competitive image synthesis requires multi-billion-parameter models. The work signals a push toward leaner, more accessible generative vision architectures.
HiDream-O1-Image-1.5 Ranks #1 in China, #2 Globally in Text-to-Image Benchmarks, Surpassing Google and NVIDIA
量子位 QbitAI · 47 days ago · Benchmark
HiDream-O1-Image-1.5, a Chinese text-to-image model, has reached the top of domestic leaderboards and secured second place globally in the latest benchmark standings. The model reportedly outperforms image-generation offerings from Google and NVIDIA. The result marks a significant milestone for Chinese generative image research on the world stage.
PRX Part 3: Training a Text-to-Image Model in 24 Hours! ★ 75
Hugging Face Blog · 146 days ago · Tutorial
Photoroom, the well-known AI image editing tool, recently published Part 3 of its technical blog series on Hugging Face about its in-house image generation…
Text-to-Image Model Training Design: Practical Lessons from Photoroom's Ablation Experiments ★ 75
Hugging Face Blog · 175 days ago · Tutorial
Photoroom, the well-known image editing platform, recently published a series of technical blog posts about their in-house text-to-image model, PRX. In Part 2…
Diffusers Officially Supports FLUX-2: The Next-Generation Open-Source Image Generation Model Arrives ★ 85
Hugging Face Blog · 245 days ago · Release
The Hugging Face official blog has announced that the popular diffusion model library `diffusers` now officially supports FLUX-2, the next-generation…
Google DeepMind Unveils Gemini 3 Pro Image Model "Nano Banana Pro": Ushering in the Next Generation of Visual Generation and Building ★ 78
Google DeepMind Blog · 250 days ago · Release
Google DeepMind has unveiled a new model called "Nano Banana Pro," which is also the Pro-tier image model of the Gemini 3 generation (Gemini 3 Pro Image…
Hugging Face Community Releases an Open Preference Dataset for Text-to-Image Generation ★ 75
Hugging Face Blog · 596 days ago · Release
Introduction: An Important Piece of the Open-Source Image Generation Puzzle. As text-to-image (T2I) technology advances rapidly, ensuring that AI-generated…
Diffusers Officially Supports Stable Diffusion 3.5 Large: A More Powerful Open-Source Image Generation Model and Optimization Guide ★ 85
Hugging Face Blog · 644 days ago · Release
Stability AI officially launched the Stable Diffusion 3.5 (SD3.5) model series in late October 2024, and Hugging Face's Diffusers team simultaneously announced…
Black Forest Labs Launches New Open-Source Image Generation Model FLUX.1: Now Runnable via the Replicate API ★ 90
Replicate Blog · 726 days ago · Release
Black Forest Labs — a new AI team founded by the original creators of Stable Diffusion (including core developer Robin Rombach and others) — has officially…
Replicate Intelligence #6: Google Gemma 2 Launches, LLM Leaderboard Updates, and Practical Stable Diffusion 3 Tips ★ 75
Replicate Blog · 760 days ago · Release
This issue of Replicate Intelligence summarizes three major updates from the recent open-source AI landscape: 1. **Google Gemma 2 officially launches**…
Diffusers Officially Supports Stable Diffusion 3: More Powerful Image Generation and Memory Optimizations ★ 80
Hugging Face Blog · 776 days ago · Release
Hugging Face's official blog announced that its diffusers library now officially supports Stable Diffusion 3 (SD3), the latest release from Stability AI. SD3…
Welcome aMUSEd: An Efficient, Lightweight Text-to-Image Model
Hugging Face Blog · 936 days ago · Release
The Hugging Face official blog formally introduced a brand-new open-source text-to-image model called "aMUSEd." This model is based on a reproduction and…
Fine-Tuning Stable Diffusion Models with DDPO via TRL ★ 75
Hugging Face Blog · 1,033 days ago · Release
Hugging Face published a blog post introducing how to use the DDPO (Denoising Diffusion Policy Optimization) algorithm within the TRL (Transformer…
Introducing Würstchen: An Ultra-Fast, Efficient Diffusion Model for Image Generation ★ 75
Hugging Face Blog · 1,049 days ago · Release
Hugging Face, in collaboration with the research community, has introduced a new text-to-image diffusion model called "Würstchen." The model's standout feature…
From Brush to Words: A Brief History of Text-to-Image AI
Replicate Blog · 1,071 days ago · Commentary
On the occasion of the first anniversary of Stable Diffusion and Replicate's launch of Stable Diffusion XL (SDXL) fine-tuning services, this article provides…
Hugging Face Ethics and Society Newsletter #4: Bias in Text-to-Image Models
Hugging Face Blog · 1,128 days ago · Commentary
The Hugging Face Ethics and Society team has published the fourth edition of its newsletter, this time focusing on the problem of "bias" in text-to-image (T2I)…
Running DeepFloyd IF with 🧨 diffusers on Free-Tier Google Colab
Hugging Face Blog · 1,189 days ago · Tutorial
Core Background and Challenges: DeepFloyd IF is an advanced text-to-image model released by DeepFloyd, a research lab under Stability AI. Unlike the…
VQ-Diffusion: Text-to-Image Generation Based on Discrete Diffusion Models
Hugging Face Blog · 1,336 days ago · Release
In late 2022, while continuous-space diffusion models represented by Stable Diffusion were stealing the spotlight, diffusion models operating in discrete space…
Stability AI Releases Japanese Stable Diffusion: A Text-to-Image Model Optimized for Japanese
Hugging Face Blog · 1,392 days ago · Release
In October 2022, Stability AI officially released "Japanese Stable Diffusion," a model specifically designed for the Japanese market and culture, hosted on the…
Exploring Text-to-Image Models: Creating Images Easily with the Replicate API
Replicate Blog · 1,471 days ago · Tutorial
This blog post from Replicate provides a clear and accessible introduction to running text-to-image models using Replicate's cloud API service. It serves as an…
