Anthropic explains how process sandboxes, VMs, filesystem boundaries, and egress controls limit what Claude agents can access. Claude.ai uses gVisor; local Claude Code uses Seatbelt on macOS and Bubblewrap on Linux; Cowork runs in a full VM. Simon Willison highlights the documentation quality, notes a previously missed file-exfiltration path, and plans to revisit Anthropic's open-source srt tool.
Simon Willison revisited pydantic-monty, a sandboxed subset of Python implemented in Rust. He asked Claude Code to inspect the most recent release, following his earlier exploration a few months ago. The key finding is that limits for execution duration, memory, allocations, and recursion depth all appear to behave as advertised.
Simon Willison 釋出 `datasette-agent-sprites 0.1a0` 測試版外掛。該外掛專為 Datasette Agent 設計,使其能夠在 Fly.io 推出的 Fly Sprites 輕量級沙盒環境中安全地執行任意指令。這為 AI 代理(Agent)提供了一個隔離且安全的程式碼執行空間,避免損害主機系統。
隨著 AI Agent(代理)逐漸具備自主執行工具與呼叫 API 的能力,傳統的安全防護已不敷使用。Vercel 提出在代理式架構中建立「安全邊界」的關鍵指引,強調必須實施執行期沙盒化(Sandboxing)、嚴格的最小權限原則(Least Privilege),以及在關鍵決策中引入「人工確認(Human-in-the-loop)」機制,以防止提示詞注入與越權操作。
Hugging Face 正式發表 Gaia2 基準測試與 ARE (Agent Run Environment) 框架。Gaia2 延續前代精神,設計了更複雜、防污染且貼近真實世界的多模態任務;而 ARE 則提供安全沙盒化的執行環境,解決了 Agent 測試中重現性低與安全風險的痛點。這套組合將大幅降低社群研究與評估 AI Agent 的門檻。