Are agents restricted to what they can do, or only blocked from what they can't?
Organisations are built around maximising human autonomy. Micromanagement doesn’t scale, and decades of research show that autonomy improves performance. This works because humans have internal brakes — they care about reputation, consequences, doing good work. That self-regulation enables broad trust by default.
Agents have none of that. An agent sounds human because we want it to, but it’s a statistical machine. It fails unpredictably and doesn’t know when it’s wrong. You cannot fix this with internal reward systems, because when the model slips, its judgment slips too. There is no internal brake that stays intact when things go wrong.
This is the trust inversion: humans are blocked from a short list of what they cannot do; agents must be restricted to an explicit, per-task list of what they can do. A blocklist of what agents shouldn't do is always incomplete, because the space of harmful actions is unbounded. An allowlist of what they can do is always finite.
Start from zero authority. Grant explicit permissions per task. This is not a conservative choice — it’s the only model that scales.
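In code, this is a default-deny permission check. A minimal sketch in Python, assuming agent actions can be named as strings; the names here (`TaskGrant`, `permits`) are illustrative, not any particular framework's API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskGrant:
    """An explicit per-task allowlist. Zero authority by default:
    an action is permitted only if it was granted for this task."""
    task_id: str
    allowed_actions: frozenset = frozenset()

    def permits(self, action: str) -> bool:
        # Default deny: anything not explicitly listed is refused.
        # There is no "blocked actions" list to keep up to date.
        return action in self.allowed_actions


# Each task gets its own narrow grant.
grant = TaskGrant(
    task_id="refund-1042",
    allowed_actions=frozenset({"read_order", "issue_refund"}),
)

print(grant.permits("issue_refund"))   # explicitly granted
print(grant.permits("delete_order"))   # never listed, so denied
```

The point of the structure: extending the agent's authority means adding a line to the grant, while a blocklist would require anticipating every harmful action in advance.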
Go deeper: AI Agents Need the Inverse of Human Trust develops the full argument for why organisations designed for human trust need the opposite for agents.
See where your organisation stands on this question.
Assess with the Agent Profiler →