Kaiming He's All-Undergrad Team Achieves Text-to-Image With Only 258M Parameters
Original: 全员本科生!何恺明组新作:文生图,258M参数就够了
An all-undergraduate team from Kaiming He's group proposes a compact 258M-parameter text-to-image model.
A new research paper from Kaiming He's lab — notable for having an all-undergraduate team — demonstrates that high-quality text-to-image generation can be achieved with just 258 million parameters. This challenges the prevailing assumption that competitive image synthesis requires multi-billion-parameter models. The work signals a push toward leaner, more accessible generative vision architectures.
A new paper from the research group led by Kaiming He — the computer vision luminary best known for co-inventing ResNet and currently based at MIT CSAIL — proposes a text-to-image generation system that operates with only 258 million parameters. The headline figure is striking: leading open-source and commercial text-to-image models typically run in the range of one to several billion parameters, making a sub-300M system genuinely unusual. The article's title also draws attention to the composition of the team: all members are undergraduates, an uncommon distinction for work appearing at this level of visibility in the AI research community.
Source: 量子位 (QbitAI)
Summaries are AI-generated; the original article is authoritative.