The first principles behind the framework
18 questions, unpacked. What to look for, what good looks like, and where to go deeper.
P Potential
Where agents add real value, and what separates lasting investments from dead ends.
- Are your agents actually making decisions, or just automating steps humans already defined?
The value of agency is reasoning, not replaying. If your agents only run predefined workflows, you're getting automation, not the upside of agency.
- What decisions are you not yet delegating to agents, and what's that costing you?
Every organisation has bottlenecks where humans slow things down. Some genuinely need a human. Others are just habit.
- Will better models make your current setup more valuable, or obsolete?
Some investments get more valuable with every model upgrade. Others become dead weight. The difference decides whether your agent strategy compounds or resets.
- Does the right context reach your agents at the right time?
Agent quality depends on the right information for the task at hand. If your agents are underperforming, the model might not be the bottleneck.
- Are you building on established and emerging standards, or on an island?
Protocols for tool integration and agent communication are maturing fast. Proprietary alternatives risk leaving you incompatible with the ecosystem.
- How much value are you leaving on the table by over-constraining?
Controls that enforce boundaries are essential. Scaffolding that second-guesses the model is dead weight. The hard part is knowing which is which.
A Accountability
Who's responsible when the agent gets it wrong, and how to prove it.
- Do you know every agent running in your organisation?
When employees build agents on low-code platforms, the company is still the deployer. Shadow agents are the new shadow IT.
- Can your infrastructure prevent an agent from running without being registered?
Knowing what's running today is one thing. Making it structurally impossible to deploy unregistered agents is another.
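One way to picture "structurally impossible" is a runtime whose only start path checks a registry first. The sketch below is a minimal illustration, not a reference design: `AgentRuntime`, `UnregisteredAgentError`, and the agent and owner names are all hypothetical.

```python
class UnregisteredAgentError(Exception):
    """Raised when something tries to start an agent that is not registered."""


class AgentRuntime:
    """Hypothetical runtime: the registry check sits on the only start path."""

    def __init__(self):
        self._registry = {}  # agent_id -> accountable owner

    def register(self, agent_id, owner):
        self._registry[agent_id] = owner

    def start(self, agent_id):
        # Refuse to run anything the organisation has not registered.
        if agent_id not in self._registry:
            raise UnregisteredAgentError(agent_id)
        return f"{agent_id} started (owner: {self._registry[agent_id]})"
```

The point of the design is that registration is a precondition of execution rather than a policy people are asked to follow.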
- When an agent makes a consequential decision, can you trace who authorised it and what happened?
Audit logs showing a username aren't enough when that user delegated to an agent three months ago.
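A trace that survives delegation needs to record more than a username. As a rough sketch, assuming a hypothetical `AuditRecord` shape (the field names are illustrative, not a standard), each entry could carry both the acting agent and the human delegation behind it:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry linking an action to the delegation behind it."""
    action: str
    agent_id: str
    delegated_by: str        # the human who granted the authority
    delegated_at: datetime   # when that delegation was made
    executed_at: datetime    # when the agent actually acted


def describe(record):
    # Render who acted, under whose authority, and when each step happened.
    return (
        f"{record.agent_id} performed '{record.action}' on "
        f"{record.executed_at:%Y-%m-%d} under authority delegated by "
        f"{record.delegated_by} on {record.delegated_at:%Y-%m-%d}"
    )


record = AuditRecord(
    action="approve_invoice",
    agent_id="invoice-agent",
    delegated_by="alice",
    delegated_at=datetime(2025, 1, 10, tzinfo=timezone.utc),
    executed_at=datetime(2025, 4, 10, tzinfo=timezone.utc),
)
```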
- If an agent causes harm, is the liability chain clear?
The human who delegated, the team that deployed, the vendor who built the model — all may share responsibility. If no one owns the answer, everyone points at each other.
- Could you explain to a regulator what your agent did and why?
Regulations such as the EU AI Act require traceability, risk management, and human oversight for high-risk use cases. Can you reconstruct the chain of decisions?
C Control
Architecture that enforces what policy can't.
- Are agents limited to an explicit list of what they may do, or only blocked from a list of what they may not?
You can't list everything an agent shouldn't do, because the list is infinite. Start from zero authority and grant explicit permissions per task.
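Starting from zero authority can be sketched as a default-deny check: anything not explicitly granted for the task is refused. The class and action names below (`TaskGrant`, `PermissionDenied`, `read_inbox`, and so on) are hypothetical and only illustrate the idea.

```python
class PermissionDenied(Exception):
    """Raised when an action was not explicitly granted for the task."""


class TaskGrant:
    """Hypothetical per-task grant: empty by default, so nothing is allowed."""

    def __init__(self, task, allowed_actions=()):
        self.task = task
        self.allowed_actions = frozenset(allowed_actions)

    def authorise(self, action):
        # Default deny: only actions on the explicit list pass.
        if action not in self.allowed_actions:
            raise PermissionDenied(
                f"'{action}' is not granted for task '{self.task}'"
            )
        return True
```

A grant created as `TaskGrant("triage_inbox", {"read_inbox", "draft_reply"})` would allow drafting a reply but refuse `send_payment`, without anyone having had to foresee and block that action.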
- When agents delegate to other agents, can authority only decrease?
If your procurement agent approves purchases up to $5,000, any sub-agent should inherit that ceiling or lower. Does the architecture enforce this?
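The "authority can only decrease" rule can be sketched as a delegation step that rejects any attempt to exceed the parent's ceiling. This is an illustrative model only: the `Authority` class and agent names are hypothetical, and a real system would also need to stop agents from minting authority outside `delegate`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authority:
    """Hypothetical spending authority held by one agent."""
    holder: str
    spend_limit: int  # whole dollars

    def delegate(self, holder, spend_limit):
        # A sub-agent may receive the same ceiling or lower, never higher.
        if spend_limit > self.spend_limit:
            raise ValueError(
                f"cannot delegate {spend_limit} from a ceiling of {self.spend_limit}"
            )
        return Authority(holder, spend_limit)


procurement = Authority("procurement-agent", 5000)
quotes = procurement.delegate("quote-agent", 1000)  # allowed: lower ceiling
```

Because each delegated authority is checked against its immediate parent, the ceiling can only stay level or shrink along the chain.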
- What happens when an agent wanders into a use case you didn't anticipate?
A general-purpose assistant told to handle your inbox might draft an email, then screen a job application. The second task carries far more risk than the first, so the risk tier depends on how open-ended the prompt is.
- Are your agents contained by architecture, or only by policy?
Policy depends on compliance. Architecture enforces the boundary whether or not anyone cooperates. When things go wrong, only one of them holds.
- What happens when human oversight breaks down in practice?
After the 20th approval prompt, people start clicking yes without reading. Decades of automation research confirm that humans can't reliably monitor an automated system and then rapidly take control when it fails.
- How do you balance agent quality with data privacy?
Agents get better with more context, but more context means more data exposure. Do your agents see only what they need, or everything because it's easier?
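"Only what they need" can be approximated by filtering a record down to the fields a given task requires before it reaches the agent. The sketch below assumes a hypothetical task-to-fields mapping; the task and field names are made up for illustration.

```python
# Hypothetical mapping from each task to the fields that task needs.
FIELDS_BY_TASK = {
    "schedule_meeting": {"name", "email"},
    "process_refund": {"name", "order_id", "refund_amount"},
}


def context_for(task, record):
    # Unknown tasks get an empty field set, so they see nothing.
    allowed = FIELDS_BY_TASK.get(task, set())
    return {key: value for key, value in record.items() if key in allowed}


customer = {
    "name": "Ada",
    "email": "ada@example.com",
    "order_id": "A-17",
    "refund_amount": 40,
    "date_of_birth": "1990-01-01",
}
```

Here `context_for("schedule_meeting", customer)` passes on only the name and email, and the date of birth never reaches any agent because no task lists it.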
- Does your agent setup work when agents need to cross trust boundaries?
Most approaches work within a single trust domain. When agents act across organisations, identity and authority need to travel with the request — verifiable at every step.
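Authority that travels with the request and can be verified at each step can be sketched as a signed set of claims. The example below is a deliberately simplified assumption: it uses a shared-secret HMAC from the Python standard library, whereas agents acting across organisations would more plausibly rely on asymmetric signatures. The claim names and key are hypothetical.

```python
import hashlib
import hmac
import json


def sign(claims, key):
    # Serialise deterministically so signer and verifier hash the same bytes.
    payload = json.dumps(claims, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "tag": tag}


def verify(token, key):
    payload = json.dumps(token["claims"], sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    return hmac.compare_digest(expected, token["tag"])


key = b"demo-key-for-illustration-only"
token = sign({"agent": "procurement-agent", "spend_limit": 5000}, key)
```

Any hop that holds the key can call `verify(token, key)` before acting, and a token whose claims were altered in transit fails the check.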
These questions become concrete when you map them to a real use case.
Open the Agent Profiler →