Jonathan Kingston 6/8/2026

Appropriateness is what safety cannot mechanise

This article examines the gap between structural safety checks, such as tool-use restrictions and platform classifiers, and the contextual harm that can occur when an AI agent's action is technically valid but inappropriate for its recipient. Using examples such as sending an alcohol offer to a recovering alcoholic, it argues that current safety mechanisms, including per-call gates, platform classifiers, and per-tool evals, fail to catch harms that depend on recipient states the system cannot observe. The piece advocates deployment-specific, context-dependent protections that go beyond generic filters, and emphasizes that safety must address meaning and trajectory, not just structure.
