Latent Space interviews Cognition's Walden Yan and OpenInspect's Cole Murray on the rise of async coding agents. The discussion centers on Devin-related workflows, including 80% Devin commits, spec-to-PR development, full VMs, agent memory, and PMs shipping code. The key theme is not a model release, but a shift toward agents that can work asynchronously inside more complete software delivery loops.
Sesame, a conversational AI startup from Oculus founders, has launched a new iOS app for the public. The app brings its AI agents to users with a focus on more natural back-and-forth interactions. Based on the available summary, the product is positioned less like a traditional chatbot and more like talking to a person.
Artificial Analysis and IBM present ITBench-AA, described in the title as the first benchmark for agentic enterprise IT tasks. The headline result is that frontier models score below 50%, suggesting current systems still struggle with enterprise-grade agent workflows. The original article text is unavailable here, so task design, evaluated models, scoring methodology, and rankings cannot be confirmed.
Robinhood says traders can create a separate account for an AI agent and fund it with a chosen amount of money. The agent will then be able to buy and sell stocks across the market. The move pushes AI agents beyond advice or research into direct financial action, with real gains and losses possible.
Nathan Lambert argues that 2026 AI progress is becoming higher-stakes, with model capabilities, work patterns, economics, and real-world risks all escalating. He says open models still lack a true Claude Code and Opus 4.5-style agent moment, and Gemini has no clear competitor to Claude Code or Codex yet. The essay also tracks Mythos, American open-model momentum, frontier-lab competition, and mounting intervention from governments and other power structures.
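Prominent AI policy expert Jack Clark makes three core points in the latest issue of his newsletter. The first is "Red Queen AI": AI offense, defense, and evolution are locked in a race where everyone must keep running just to stay in place. The second is "AI regulating AI": as the pace of AI output exceeds human limits, compliance and oversight will have to be automated with AI. The third is "O-ring automation": in a highly automated workflow, the single weakest link determines whether the whole system succeeds or fails.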
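This article looks at how Vercel Workflow can improve the development and delivery of AI analytics applications. AI tasks are often long-running and involve many steps, so traditional serverless functions tend to hit timeouts. Vercel Workflow offers a multi-step, stateful architecture with automatic retries, letting developers chain LLM API calls with data-processing pipelines and substantially improving the execution efficiency and stability of AI analysis.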
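To make the "multi-step, stateful, automatic retries" idea concrete, here is a minimal Python sketch of the general pattern. It is not Vercel Workflow's API. Every name in it (`run_workflow`, `fetch_rows`, `summarize`, `format_report`, the in-memory `checkpoint` dict) is a hypothetical stand-in, and the "LLM call" is simulated by a step that fails twice before succeeding.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Step = Tuple[str, Callable[[State], State]]


def run_workflow(steps: List[Step], state: State, checkpoint: Dict[str, State],
                 max_attempts: int = 3, base_delay: float = 0.0) -> State:
    """Run steps in order, retrying failures and skipping checkpointed steps.

    `checkpoint` is an in-memory dict here (an assumption for illustration);
    a durable workflow system would persist it outside the process.
    """
    for name, fn in steps:
        if name in checkpoint:
            # Step already completed in an earlier run: reuse its saved state.
            state = checkpoint[name]
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                # Pass a shallow copy so a failed attempt cannot corrupt state.
                state = fn(dict(state))
                break
            except Exception:
                if attempt == max_attempts:
                    raise
                # Exponential backoff between attempts (zero by default).
                time.sleep(base_delay * 2 ** (attempt - 1))
        checkpoint[name] = state
    return state


# Hypothetical steps for an "AI analysis" pipeline.
calls = {"summarize": 0}


def fetch_rows(state: State) -> State:
    state["rows"] = [3, 5, 8]
    return state


def summarize(state: State) -> State:
    # Stand-in for an LLM API call that times out on its first two attempts.
    calls["summarize"] += 1
    if calls["summarize"] < 3:
        raise TimeoutError("simulated transient LLM failure")
    state["summary"] = f"{len(state['rows'])} rows, total {sum(state['rows'])}"
    return state


def format_report(state: State) -> State:
    state["report"] = "Report: " + state["summary"]
    return state


STEPS: List[Step] = [
    ("fetch_rows", fetch_rows),
    ("summarize", summarize),
    ("format_report", format_report),
]

checkpoint: Dict[str, State] = {}
result = run_workflow(STEPS, {}, checkpoint)
print(result["report"])  # Report: 3 rows, total 16
```

Running `run_workflow` a second time with the same `checkpoint` skips all three steps and returns the saved result without calling the simulated LLM step again, which is the property that lets a long analysis resume after a timeout instead of starting over.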