Anthropic explains how process sandboxes, VMs, filesystem boundaries, and egress controls limit what Claude agents can access. Claude.ai uses gVisor; local Claude Code uses Seatbelt on macOS and Bubblewrap on Linux; Cowork runs in a full VM. Simon Willison highlights the documentation quality, notes a previously missed file-exfiltration path, and plans to revisit Anthropic's open-source srt tool.
Simon Willison summarizes a PromptArmor report about Microsoft Copilot Cowork and agentic data exfiltration risks. The issue involved agents sending messages to a user’s own inbox without approval, where rendered external images could trigger requests to attacker-controlled sites. Because OneDrive can create pre-authenticated download links, a successful prompt injection could leak links that allow attackers to download files.