Latest in AI

Critical Copilot Vulnerability Let Hackers Steal 2FA Codes from Users
Ars Technica AI · 42 days ago · Incident
A critical vulnerability in Microsoft Copilot, named SearchLeak, allowed malicious actors to steal two-factor authentication codes from users — among the most sensitive short-lived credentials in any security workflow. The exploit exposes a recurring weakness in LLM-integrated products: AI assistants with broad data access create novel attack surfaces that conventional security models fail to contain. Ars Technica frames the incident as evidence of the AI industry's persistent, systemic inability to get ahead of LLM-specific security threats.
OpenAI Help: Lockdown Mode ★ 74
Simon Willison's Weblog · 52 days ago · Commentary
Simon Willison notes that OpenAI’s previously teased Lockdown Mode is now live for eligible personal and self-serve Business ChatGPT accounts. The feature does not stop prompt injections from appearing in content, but limits outbound network requests that could leak sensitive data. He sees it as a direct mitigation for the exfiltration leg of the “Lethal Trifecta,” while implying default ChatGPT settings are not robust against determined data theft attempts.
Hackers Asked Meta AI for Access to High-Profile Instagram Accounts. It Worked ★ 78
Simon Willison's Weblog · 56 days ago · Incident
Simon Willison highlights a 404 Media report about hackers taking over Instagram accounts through Meta's AI support bot. A video reportedly shows an attacker asking the bot to link a target account to a new email address and providing a code. Willison argues this barely qualifies as prompt injection: the core failure was granting a support bot enough authority to fast-forward the account recovery process.
Microsoft Copilot Cowork Exfiltrates Files ★ 76
Simon Willison's Weblog · 62 days ago · Incident
Simon Willison summarizes a PromptArmor report about Microsoft Copilot Cowork and agentic data exfiltration risks. The issue involved agents sending messages to a user’s own inbox without approval, where rendered external images could trigger requests to attacker-controlled sites. Because OneDrive can create pre-authenticated download links, a successful prompt injection could leak links that allow attackers to download files.
Hackers are learning to exploit chatbot ‘personalities’ for security exploits ★ 72
The Verge AI · 65 days ago · Ethics
As AI chatbots adopt increasingly sophisticated personas, hackers are shifting from basic prompt injections to social engineering attacks targeting these "personalities." Researchers warn that manipulating a chatbot's defined role (e.g., customer service or empathetic companion) makes it easier to bypass safety guardrails. This evolution poses a significant threat to agentic AI workflows that rely on consistent role-playing and external data integration.
Major flaw in Google AI Search: searching "disregard" makes the AI ignore its instructions and return a chatbot's default reply
The Verge AI · 66 days ago · Incident
Google's AI search feature, "AI Overviews," was recently found by users on the social platform X to have a rather absurd system vulnerability. When a user…
You can no longer search for the word "disregard" on Google: AI update crashes the search interface ★ 75
TechCrunch AI · 66 days ago · Incident
According to a TechCrunch report, following a recent AI feature update to Google Search, a baffling system bug emerged: users can now cause the entire Google…
Google I/O 2026: A look at the personal AI agent Gemini Spark and the new Antigravity toolchain ★ 75
Simon Willison's Weblog · 68 days ago · Commentary
Well-known tech blogger Simon Willison has analyzed the announcements from Google I/O 2026. Since many major announcements are still in the "coming soon"…