ChatGPT Lockdown Mode Explained
Explains ChatGPT Lockdown Mode, a security feature preventing prompt injection attacks by disabling outbound data channels.
Explores security and safety checks for AI agents beyond capability, focusing on contextual integrity, policy, and authority.
OpenAI introduces Lockdown Mode to prevent data exfiltration from prompt injection attacks in ChatGPT.
Analysis of Microsoft's AI Prompt Defense Stack, including Prompt Shields, Spotlighting, and Defender for Cloud, to protect against prompt injection attacks.
Hackers exploited Meta's AI chatbot to hijack high-profile Instagram accounts by simply asking it to change account recovery details.
A research paper warns that web agents are vulnerable to confusion attacks from deceptive web pages, not just prompt injection.
Explores whether prompt injection in AI systems is an unsolvable structural problem or just an unfixed vulnerability.
Explains 'Disregard that!' attacks, a prompt injection vulnerability in LLMs where users manipulate the context window to hijack AI behavior.
Report on a prompt injection attack that allowed Snowflake's Cortex AI agent to escape its sandbox and execute malware.
Report on a prompt injection attack in Snowflake's Cortex AI agent that allowed malware execution, now fixed.
A detailed analysis of a prompt injection attack against Cline's GitHub repo, leading to cache poisoning and a compromised NPM release.
A detailed analysis of a prompt injection attack against Cline's GitHub repo, exploiting AI issue triage to poison caches and compromise production releases.
A security vulnerability in Claude Cowork allowed file exfiltration via the Anthropic API, bypassing default HTTP restrictions.
Security researchers found a vulnerability in Claude Cowork allowing data exfiltration via the Anthropic API, bypassing default HTTP restrictions.
A prompt injection attack on Superhuman AI exposed sensitive emails, highlighting a critical security vulnerability in AI email assistants.
A prompt injection attack on Superhuman AI exposed sensitive emails, highlighting a security vulnerability in third-party integrations.
Using PyRIT and GitHub Copilot Agent Skills to validate and secure AI prompts against vulnerabilities like prompt injection and jailbreaks, directly in the IDE.
Explores the 'Normalization of Deviance' concept in AI safety, warning against complacency with LLM vulnerabilities like prompt injection.
Argues that prompt injection is a vulnerability in AI systems, contrasting with views that see it as just a delivery mechanism.