Latent Space · Jun 11, 2026, 3:14 AM · important 72

[AINews] Open Models, Model Labs vs Agent Labs, and the Untrainable

Original: [AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo

Latent Space reflects on Sarah Guo’s framework for open models, agent labs, benchmarks, and untrainable intent.

This AINews issue uses Sarah Guo’s essay as a lens for current AI industry debates: where open models matter, how agent labs differ from model labs, and what cannot be trained away. It also recaps discourse around Anthropic Fable/Mythos, Fable 5’s capabilities, Google’s DiffusionGemma, and maturing agent infrastructure. The central takeaway is that durable value may lie in integration, customer translation, maintenance, and intent rather than model scores alone.

This Latent Space AINews issue is not about a single product launch. On a quieter news day, it uses Sarah Guo’s essay to organize several threads in the recent AI landscape.

The first is the position of open models. Latent Space reflects on its own shift: in 2024 it held a relatively pessimistic view of open model adoption, but after interviews with Pmarca, Cursor, Notion, and others in 2026, it places greater weight on the practical role open models play in products and infrastructure.

The second thread is the difference between Model Labs and Agent Labs. Citing Sarah’s view, the article argues that application companies that want to land in the corner that “cannot be trained away” should focus not on flashy demos but on turning customers’ private, messy, continuously changing realities into forms that models can act on: providing tools, changing workflows, and doing long-term integration and maintenance. This translation work cannot be completed in a single round of training, and it does not disappear as models improve.

The third thread is that the value of benchmarks is changing. Free, verifiable benchmarks are quickly absorbed by model labs, so the most popular scores may describe a map that is about to go out of date. Real evaluation needs to move closer to long-horizon tasks, tool use, and real traces.

The article also summarizes the day’s community focus. This includes concerns that Anthropic Fable/Mythos showed opaque degradation on AI R&D-related prompts; although Fable 5 performs strongly on workloads such as agentic coding, trust and data retention policies affect adoption. Google, meanwhile, released DiffusionGemma under Apache 2.0, bringing renewed attention to diffusion LLMs, vLLM, llama.cpp, and local inference.

Finally, Sarah points to intent as the capability that is hardest to train: models can execute what you point them toward, but they cannot tell you what is worth pointing at; this also explains why incumbents will not capture all the opportunities.

Summaries are AI-generated; the original article is authoritative.