Latest in AI

The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray
Latent Space · 5d ago · Commentary
Latent Space interviews Cognition's Walden Yan and OpenInspect's Cole Murray on the rise of async coding agents. The discussion centers on Devin-related workflows, including 80% Devin commits, spec-to-PR development, full VMs, agent memory, and PMs shipping code. The key theme is not a model release, but a shift toward agents that can work asynchronously inside more complete software delivery loops.
ITBench-AA: Frontier Models Score Below 50% on Enterprise IT Tasks ★ 72
Hugging Face Blog · 6d ago · Benchmark
Artificial Analysis and IBM present ITBench-AA, described in the title as the first benchmark for agentic enterprise IT tasks. The headline result is that frontier models score below 50%, suggesting current systems still struggle with enterprise-grade agent workflows. The original article text is unavailable here, so task design, evaluated models, scoring methodology, and rankings cannot be confirmed.
Some ideas for what comes next, May 2026
Interconnects (Nathan L.) · 7d ago · Commentary
Nathan Lambert argues that 2026 AI progress is becoming higher-stakes, with model capabilities, work patterns, economics, and real-world risks all escalating. He says open models still lack a true Claude Code and Opus 4.5-style agent moment, and Gemini has no clear competitor to Claude Code or Codex yet. The essay also tracks Mythos, American open-model momentum, frontier-lab competition, and mounting intervention from governments and other power structures.
Import AI 440: Red queen AI; AI regulating AI; o-ring automation ★ 75
Import AI (Jack Clark) · 141d ago · Opinion
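In the latest issue of his newsletter, AI policy expert Jack Clark lays out three core ideas. The first is "Red Queen AI": offense, defense, and evolution in AI are locked in a race where everyone must keep running just to stay in place. The second is "AI regulating AI": as AI output outpaces what humans can keep up with, compliance and oversight will have to rely on AI-driven automation. The third is "O-ring automation": in a highly automated workflow, the single weakest link determines whether the whole system succeeds or fails.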