Latest in AI

Astronauts told to return to ISS after sheltering over air leak repairs
Hacker News (AI keywords) · 52 days ago · Incident
Based only on the headline, astronauts sheltered while air leak repairs were taking place and were later told to return to the ISS. The available text does not specify the leak location, severity, agencies involved, repair status, or operational impact. This should be treated as a limited incident update rather than an AI-related development.
From Jailbreaking to Vibe Hacking: AI Security Shifts to "Psychocybersecurity"
INSIDE 硬塞 AI · 64 days ago · Ethics
AI security is shifting from technical jailbreaks to "Vibe Hacking," where attackers use social engineering and psychological tactics to manipulate an LLM's simulated persona. By exploiting the model's behavioral tendencies rather than code vulnerabilities, this trend establishes "psychocybersecurity" as a critical new frontier for AI alignment and safety.
Import AI 438: A Silent Alarm, Flashing for All of Us (Cybersecurity Capability Overhang and Conversation Privacy) ★ 75
Import AI (Jack Clark) · 218 days ago · Commentary
In this issue of Import AI 438, Jack Clark examines two key issues concerning AI security and privacy: **1. You Are Your LLM History** As large language models…
Google DeepMind Strengthens Its Frontier Safety Framework to Address Severe Risks from Advanced AI Models ★ 75
Google DeepMind Blog · 277 days ago · Release
Google DeepMind has recently announced the strengthening of its Frontier Safety Framework (FSF) — a systematic mechanism designed to proactively identify…
The AI Agent Era Has Arrived: How Should We Respond? (Hugging Face Ethics and Society Column) ★ 75
Hugging Face Blog · 561 days ago · Commentary
With the explosion of AI Agent technology, AI is no longer just a passive chatbot that answers questions — it has become an entity capable of autonomously…
Hugging Face Launches the AI Secure LLM Safety Leaderboard: An In-Depth Evaluation of LLM Trustworthiness Based on the DecodingTrust Framework ★ 75
Hugging Face Blog · 914 days ago · Release
### Introduction: Capability Is Not Safety — A New Benchmark for LLM Safety Evaluation As large language models (LLMs) are adopted more deeply across…
Hugging Face Publishes Ethical Guidelines for Developing the Diffusers Library
Hugging Face Blog · 1,244 days ago · Opinion
With the explosion of generative AI models like Stable Diffusion, Hugging Face's Diffusers library has become the go-to tool for developers deploying and…