Simon Willison 11/4/2025

MCP Colors: Systematically deal with prompt injection risk

Read Original

This article introduces the MCP Colors system, a framework for classifying AI tools by risk: red for tools exposing agents to untrusted/malicious input, and blue for tools performing critical actions. It explains how to label tools to prevent unsafe state combinations and discusses automating the classification process for scalability in managing prompt injection threats.

MCP Colors: Systematically deal with prompt injection risk

Comments

No comments yet

Be the first to share your thoughts!

Browser Extension

Get instant access to AllDevBlogs from your browser