Jonathan Kingston • 6/8/2026

Appropriateness is what safety cannot mechanise

This article examines the gap between structural safety checks, such as tool-use restrictions and platform classifiers, and the contextual harm that occurs when an AI agent's action is technically valid but inappropriate for its recipient. Using examples such as an alcohol offer sent to a recovering alcoholic, it argues that current safety mechanisms (per-call gates, platform classifiers, and per-tool evals) fail to catch harms that depend on recipient states they cannot observe. The piece advocates deployment-specific, context-dependent protections that go beyond generic filters, and emphasizes that safety must address meaning and trajectory, not just structure.

#AI Safety #Agent Safety #Product Safety