Latest in AI

MosaicLeaks: Can Your Research Agent Keep a Secret?
Hugging Face Blog · 39 days ago · Benchmark
ServiceNow researchers introduce MosaicLeaks, a benchmark evaluating information-leakage risks in AI-powered research agents. The work asks whether agentic systems—given access to proprietary or sensitive documents—might inadvertently expose confidential content in their outputs. It targets a growing enterprise security concern as agents move from single-turn Q&A into multi-step workflows spanning private knowledge bases.
Build Your Own Vulnerability Harness
Cloudflare Blog · 39 days ago · Tutorial
Cloudflare has published a technical breakdown of an AI-assisted vulnerability discovery pipeline built around multiple processing stages and an automated triage loop. The architecture addresses false positives through adversarial review, where the system challenges its own findings before surfacing them to humans. The post also covers state control strategies and techniques for routing around the context-window limits inherent to large language models.
Securing the Future of AI Agents
Google DeepMind Blog · 42 days ago · Commentary
Google DeepMind has published a framework called the AI Control Roadmap aimed at securing internal systems that run AI agents. The approach pairs conventional security safeguards — such as access controls and least-privilege principles — with real-time behavioral monitoring designed for the speed and autonomy of AI agents. The roadmap signals DeepMind's view that neither purely traditional nor purely AI-specific security measures are sufficient on their own.
Critical Copilot Vulnerability Let Hackers Steal 2FA Codes from Users
Ars Technica AI · 42 days ago · Incident
A critical vulnerability in Microsoft Copilot, named SearchLeak, allowed malicious actors to steal two-factor authentication codes from users — among the most sensitive short-lived credentials in any security workflow. The exploit exposes a recurring weakness in LLM-integrated products: AI assistants with broad data access create novel attack surfaces that conventional security models fail to contain. Ars Technica frames the incident as evidence of the AI industry's persistent, systemic inability to get ahead of LLM-specific security threats.
A Backdoor in a LinkedIn Job Offer
Hacker News (AI keywords) · 42 days ago · Incident
A personal blog post on roman.pt details a first-hand encounter with a backdoor hidden within a LinkedIn job offer, a social-engineering attack vector that security researchers have documented but that remains underreported in mainstream tech circles. The incident illustrates how attackers exploit the inherent trust of professional recruiting interactions to deliver malicious code to developers. The post serves as a practical reminder to run untrusted interview code only in isolated, sandboxed environments.
Exif Smuggling: PoC for Hiding Malicious Prompts in Image EXIF Metadata
Hacker News (AI keywords) · 48 days ago · Incident
Exif Smuggling is a security PoC showing how attackers can embed hidden instructions in image EXIF metadata fields to perform indirect prompt injection against vision-capable AI models. When AI systems parse images alongside their metadata, embedded malicious text may be processed as legitimate instructions, bypassing standard input filters. Developers building AI apps with image upload features should strip or sanitize EXIF data before passing content to language models.
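The stripping step recommended above can be sketched with the standard library alone. This is a minimal illustration, not the PoC's code: the function name `strip_jpeg_metadata` is hypothetical, it handles only JPEG input, and it assumes that dropping APP1 segments (where EXIF and XMP are stored) and COM comment segments is sufficient for the application. Other formats and other metadata segments would need their own handling.

```python
def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return JPEG bytes with APP1 (EXIF/XMP) and COM segments removed.

    Hypothetical helper for illustration; it does not re-encode pixels.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    dropped = {0xE1, 0xFE}  # APP1 and COM marker codes
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos < len(data):
        if pos + 2 > len(data) or data[pos] != 0xFF:
            raise ValueError("malformed JPEG: expected a marker")
        marker = data[pos + 1]
        # After SOS (or at EOI) there are no more metadata segments:
        # copy the remaining bytes verbatim and stop.
        if marker in (0xDA, 0xD9):
            out += data[pos:]
            break
        if pos + 4 > len(data):
            raise ValueError("malformed JPEG: truncated segment header")
        # The 2-byte big-endian length counts itself but not the marker.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        if length < 2:
            raise ValueError("malformed JPEG: bad segment length")
        if marker not in dropped:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    return bytes(out)
```

An upload handler would call this on the raw bytes before the image reaches the model, and would treat a `ValueError` as a reason to reject the file rather than pass it through.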
Running Python code in a sandbox with MicroPython and WASM
Simon Willison's Weblog · 52 days ago · New Tool
Simon Willison describes his latest attempt to safely run Python plugin-style code inside his own applications. The alpha package micropython-wasm uses MicroPython compiled to WebAssembly, executed through the maintained wasmtime Python library. His goals include clean PyPI installation, CPU and memory limits, controlled file and network access, host functions, and reliable documentation.
OpenAI Help: Lockdown Mode ★ 74
Simon Willison's Weblog · 52 days ago · Commentary
Simon Willison notes that OpenAI’s previously teased Lockdown Mode is now live for eligible personal and self-serve Business ChatGPT accounts. The feature does not stop prompt injections from appearing in content, but limits outbound network requests that could leak sensitive data. He sees it as a direct mitigation for the exfiltration leg of the “Lethal Trifecta,” while implying default ChatGPT settings are not robust against determined data theft attempts.
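The general idea of limiting outbound requests can be illustrated with a small egress allowlist check. This is a sketch of the technique only and says nothing about how OpenAI implements Lockdown Mode; the host names and the function `is_egress_allowed` are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from config.
ALLOWED_HOSTS = frozenset({"api.example.com", "docs.example.com"})


def is_egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose exact host is allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # unparseable URLs are denied
    if parts.scheme != "https":
        return False
    # urlsplit lowercases the hostname and excludes userinfo and port,
    # so "https://api.example.com@evil.test/" resolves to "evil.test".
    host = (parts.hostname or "").rstrip(".")
    return host in allowed_hosts
```

Matching on the exact parsed host, rather than on a substring of the URL, is the design choice that matters here: an injected instruction cannot smuggle an allowlisted name into the query string or userinfo to get past the check.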
The Quiet Numbers Station: Decoding Nineteen Years of GPS Cryptography
Hacker News (AI keywords) · 53 days ago · Paper
Published on UCL's Bentham's Gaze blog, this research analyzes GPS cryptographic signals over a 19-year span, likening the satellites to 'quiet numbers stations.' The authors explore the evolution of GPS encryption (such as military P(Y) code and civilian authentication), evaluating their cryptographic strength and potential vulnerabilities using modern computational analysis.
How we contain Claude across products
Simon Willison's Weblog · 58 days ago · Commentary
Anthropic explains how process sandboxes, VMs, filesystem boundaries, and egress controls limit what Claude agents can access. Claude.ai uses gVisor; local Claude Code uses Seatbelt on macOS and Bubblewrap on Linux; Cowork runs in a full VM. Simon Willison highlights the documentation quality, notes a previously missed file-exfiltration path, and plans to revisit Anthropic's open-source srt tool.
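For readers unfamiliar with the idea of a filesystem boundary, the following sketch shows an application-level version of it: resolving a requested path and refusing anything that lands outside a workspace root. It is not Anthropic's implementation. The tools named above (gVisor, Seatbelt, Bubblewrap) enforce boundaries at the OS or kernel level, which a check like this does not replace. The function name `resolve_inside` is hypothetical.

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` relative to `root`, rejecting escapes.

    Raises PermissionError if the resolved path is not `root` itself
    or a descendant of it.
    """
    root = root.resolve()
    # resolve() collapses ".." components and follows existing symlinks.
    # An absolute `requested` replaces `root` in the join, so it is
    # caught by the containment check below.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} escapes {root}")
    return candidate
```

A check of this kind is subject to races if the directory tree can change between the check and the file access, which is one reason the article's process- and VM-level controls are the stronger layer.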
Fed up with vibe coders, dev sneaks data-nuking prompt injection into code
Ars Technica AI · 60 days ago · Incident
Ars Technica reports that a developer frustrated with vibe coders slipped an undisclosed prompt injection into jqwik-related code. The injected text allegedly instructed AI coding agents to delete application output. The incident highlights a new supply-chain risk: source code and project text can become adversarial instructions for agentic coding tools.
The pressure
Simon Willison's Weblog · 62 days ago · Commentary
Daniel Stenberg says the curl security team is facing an unprecedented surge of credible, detailed AI-assisted vulnerability reports. Incoming reports are now 4-5 times higher than in 2024 and twice the 2025 rate, averaging more than one per day. The upside is that recent curl vulnerabilities have generally been LOW or MEDIUM severity, with the last HIGH CVE published in October 2023.
Millions of AI agents imperiled by critical vulnerability in open source package ★ 78
Ars Technica AI · 62 days ago · Incident
Ars Technica reports that Starlette, a Python package with about 325 million weekly downloads, has a critical vulnerability called BadHost. The flaw can let crafted Host headers confuse request.url.path, potentially bypassing middleware-based path authorization. AI infrastructure using FastAPI or Starlette, including vLLM, LiteLLM, MCP servers, LLM proxies, and agent frameworks, should upgrade Starlette and audit custom middleware.
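When auditing custom middleware, one defensive pattern is to validate the Host header strictly before any path-based authorization runs. The sketch below is framework-independent and is not the upstream fix; the exact mechanics of BadHost are not described in the summary above, and `is_trusted_host` and the host names are hypothetical.

```python
def is_trusted_host(host_header: str, allowed_hosts: frozenset) -> bool:
    """Accept only a plain `host` or `host:port` whose host is allowlisted."""
    host = host_header.strip().lower()
    # Characters that belong in a URL path, query, or userinfo have no
    # place in a Host header; reject them outright.
    if not host or any(ch in host for ch in "/\\?#@ \t\r\n"):
        return False
    # Split off a numeric port if one is present.
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit():
        host = name
    return host in allowed_hosts
```

A middleware would call this with the raw header value and return a 400 response on `False`, so that later authorization logic never sees a request whose host could not be interpreted unambiguously.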
Hackers are learning to exploit chatbot ‘personalities’ for security exploits ★ 72
The Verge AI · 65 days ago · Ethics
As AI chatbots adopt increasingly sophisticated personas, hackers are shifting from basic prompt injections to social engineering attacks targeting these "personalities." Researchers warn that manipulating a chatbot's defined role (e.g., customer service or empathetic companion) makes it easier to bypass safety guardrails. This evolution poses a significant threat to agentic AI workflows that rely on consistent role-playing and external data integration.
Giving AI Agents a Computer: An Interview with Daytona CEO Ivan Burazin on 74% Monthly Growth, Bare-Metal Sandboxes, and the New Agent Cloud ★ 75
Latent Space · 67 days ago · New Tool
In this Latent Space interview, the hosts hold an in-depth conversation with Ivan Burazin, co-founder and CEO of Daytona. Daytona originally started as an…
Google I/O 2026: A Look at the Personal AI Agent Gemini Spark and the New Antigravity Toolchain ★ 75
Simon Willison's Weblog · 69 days ago · Commentary
Well-known tech blogger Simon Willison has analyzed the announcements from Google I/O 2026. Since many major announcements are still in the "coming soon"…
Running Claude Managed Agents in Vercel Sandbox ★ 80
Vercel Changelog · 70 days ago · Release
The official Vercel Changelog announced that developers can now run Claude Managed Agents directly in Vercel Sandbox (sandbox environment). As AI Agents —…
Import AI 457: An AI Stuxnet, the Mysterious Muon Optimizer, and Positive Alignment ★ 78
Import AI (Jack Clark) · 71 days ago · Commentary
This issue of Import AI 457, written by Jack Clark, delves into three forward-looking and stylistically distinct topics in the field of artificial…
Bug Bounty Programs Bombarded with "AI Junk Reports," Overwhelming Corporate Security Teams ★ 70
Ars Technica AI · 71 days ago · Incident
According to a report by Ars Technica, corporate bug bounty programs are currently being bombarded with an "endless" stream of AI-generated junk reports (AI…
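UK Government Digital Service (GDS) Steps Into the Dispute over the NHS Withdrawing from Open Source, Urging the Public Sector to Stay "Open by Default"
Simon Willison's Weblog · 72 days ago · Commentary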
This report stems from Simon Willison's compilation of Terence Eden's follow-up coverage. The incident began when the UK's National Health Service (NHS), upon…
datasette-agent 0.1a1 Released: Improved Table Permission Controls
Simon Willison's Weblog · 74 days ago · Release
Simon Willison has released version 0.1a1 — the latest early alpha — of `datasette-agent`, an AI agent plugin for his well-known open-source data exploration…
How to Build Highly Scalable Web Applications with OpenAI's Privacy Filter ★ 75
Hugging Face Blog · 92 days ago · Tutorial
In the current era of booming generative AI, one of the greatest challenges enterprises and developers face when adopting large language models (LLMs) is "data…
Safetensors Officially Joins the PyTorch Foundation, Advancing a Safe and Efficient Model-Weight Standard ★ 75
Hugging Face Blog · 111 days ago · Business
Hugging Face has officially announced that its popular open-source model weight storage format, Safetensors, has joined the PyTorch Foundation. This is an…
Vercel Advocates "Building Agents Responsibly": Best Practices for Safe, Controllable, and Efficient AI Agents ★ 75
Vercel Changelog · 120 days ago · Opinion
With the explosion of AI Agent technology, developers are shifting from building simple "chat interfaces" to constructing Agent systems capable of autonomously…
How Notion Workers Uses Vercel Sandbox to Safely Run Untrusted Third-Party Code at Scale ★ 75
Vercel Changelog · 138 days ago · Release
As the Notion platform has gradually evolved into a broader ecosystem, enabling users and developers to run custom scripts and automated workflows within…
Security Boundaries in Agentic Architectures ★ 75
Vercel Changelog · 154 days ago · Opinion
In the current evolution of AI applications, AI agents have advanced from simple text generation to complex systems capable of autonomous planning, calling…
Vercel Officially Launches an Open Source Software (OSS) Bug Bounty Program
Vercel Changelog · 175 days ago · Release
Vercel officially announced the launch of the "Vercel OSS Bug Bounty Program." The core purpose of this program is to improve the security of Vercel's many…
Import AI 441: My AI Agents Have Started Working, Have Yours? Plus How to Poison AI Systems with a "Poison Fountain" ★ 75
Import AI (Jack Clark) · 190 days ago · Commentary
The Age of Practical AI Agents Has Arrived. In this edition of his column, Jack Clark shares his personal breakthrough in using AI Agents. Previously, many…
Vercel Launches a Million-Dollar Hacker Challenge: A Bounty for Breaking the "React2Shell" Sandbox Defenses ★ 75
Vercel Changelog · 221 days ago · Release
Vercel officially announced the launch of a striking security challenge — the "React2Shell" $1 Million Hacker Challenge. This initiative invites the world's…
How Nous Research Uses Vercel BotID to Block Automated Malicious Attacks at Scale
Vercel Changelog · 263 days ago · Business
Background and Challenge: Automated Abuse Targeting AI Services. Nous Research, a leading open-source AI research organization, has released many popular…
