Turn-Scoped, Signed Authorization the Model Cannot Forge
Why tool authorization belongs to the mediator, not the model — and what it looks like to enforce it
The Conventional Framing
The standard pattern for tool calling: register a set of tools at session start, expose their schemas to the model, and let the model select which to invoke based on the conversation. Maybe wrap dangerous tools in a confirmation dialog. Maybe limit which tools are visible per role.
Authorization is implicit in tool registration. If a tool is on the list, the model can call it. The model is treated as a trusted decision-maker.
Why The Model Should Not Be Authorizing Itself
In a 2018-era event-driven system, consumer authorization was distinct from network reachability. A service being able to reach a topic didn't mean it was allowed to consume from it. Brokers checked ACLs. Subscribers presented credentials. Capabilities were bound to specific consumers and specific topics, scoped, and non-transferable.
In a typical agent stack today, authorization is whatever tools the agent inherited at process start. Every node in the graph runs with ambient credentials. The model decides which tool to call from a list it discovered, with no enforcement that the model is allowed to make that choice in this context.
Tool impersonation is a consumer-authorization problem.
A subscriber claiming access to a topic it shouldn't have. The EDA answer was signed subscriptions, ACLs at the broker, and capability tokens binding a specific consumer to a specific topic. The AI version barely exists.
Why "registered means authorized" fails:
- The model can be tricked into calling a tool it shouldn't this turn
- Retrieved content can carry instructions that pick a tool from the list
- Multi-agent setups inherit each other's tools transitively
- Confirmation dialogs habituate users to clicking approve
- The model is a decision-maker but not an accountable one
The capability the model needs to invoke a tool should not be a capability the model can produce on its own. It should be issued by something the model doesn't control, scoped to one turn, and verifiable by the runtime that executes the call.
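A minimal sketch of that split, assuming an HMAC signing scheme; the names (`issue_token`, `verify_token`, `SIGNING_KEY`) are illustrative, not from any particular library. The planner holds the key and mints the token; the runtime recomputes the signature before anything executes:

```python
import hashlib
import hmac
import json
import secrets
import time

SIGNING_KEY = secrets.token_bytes(32)  # lives only in the planner's process

def issue_token(tool: str, args_schema: dict, principal: str, ttl_s: int = 30) -> dict:
    """Planner side: mint a signed, single-turn capability."""
    body = {
        "tool": tool,
        "args_schema": args_schema,
        "principal": principal,
        "nonce": secrets.token_hex(16),  # one use per turn; defeats replay
        "exp": time.time() + ttl_s,      # short ttl bounds the exposure window
    }
    payload = json.dumps(body, sort_keys=True).encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify_token(token: dict) -> bool:
    """Runtime side: recompute the signature before any side effect runs."""
    payload = json.dumps(token["body"], sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False  # forged or tampered body
    return time.time() < token["body"]["exp"]
```

Because the worker never sees `SIGNING_KEY`, any token it fabricates or edits fails verification. In production the key would live in a separate process or KMS, not a module-level variable.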
Architecture
Components:
- Planner / Mediator — issues capability tokens, holds the signing key
- Worker model — consumes tokens, cannot forge them, cannot escalate scope
- Capability token — signed object: {tool, args schema, scope, principal, ttl}
- Tool runtime — verifies signature and scope before executing the call
- Signing key — held only by the planner; the worker has no access
- Audit trail — every issued token is logged with the reasoning that produced it
Trust Boundaries
- Mediator → Worker — signed tokens cross the boundary; the signing key never does
- Worker → Runtime — runtime is the enforcement point, not the worker
- Runtime → Tool — tool sees verified, scoped invocation only
- Worker → Worker (multi-agent) — tokens are non-transferable between subjects
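The boundaries above reduce to a single runtime check: every scoped dimension of the invocation must match the token. The field names here (`subject`, `principal`, `args_scope`, `recipient_domain`) are illustrative assumptions, not a standard schema:

```python
def authorize_call(token_body: dict, *, tool: str, args: dict,
                   subject: str, principal: str) -> bool:
    """Runtime enforcement: reject any invocation outside the signed envelope."""
    if token_body["tool"] != tool:
        return False  # tool impersonation
    if token_body["subject"] != subject:
        return False  # non-transferable: bound to the worker it was issued to
    if token_body["principal"] != principal:
        return False  # cross-principal action
    # argument values inside the authorized envelope, e.g. recipient domain
    scope = token_body["args_scope"]
    if not args.get("recipient", "").endswith(scope["recipient_domain"]):
        return False  # scope widening via argument values
    return True
```

Note that the check runs in the runtime, not the worker: a worker that skips it changes nothing, because the runtime is the component that actually executes the call.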
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Tool impersonation | Model selects a tool not authorized for this turn | Mitigated — runtime rejects unsigned or out-of-scope invocations |
| Prompt-injected tool selection | Retrieved content instructs model to call a privileged tool | Mitigated — model can attempt the call, but the planner never issued a token for it |
| Scope widening | Model attempts to invoke with arguments outside the authorized envelope | Mitigated — args schema is part of the signed object |
| Token replay | Old token reused on a new turn | Mitigated — nonce + short ttl |
| Stale-state action | Token issued at reasoning time, executed against later world state | Mitigated — staleness_budget bounds the gap between issuance and use |
| Cross-principal action | Multi-agent setup attempts to act on behalf of a different user | Mitigated — principal is bound in the signed token |
| Compromised mediator | If the planner is subverted, tokens flow that shouldn't | Not mitigated — the mediator is a control plane and must be hardened accordingly |
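The replay and stale-state rows can be enforced with a nonce cache plus two time checks. A hedged sketch; `staleness_budget_s` and the body field names are assumptions about the token shape, not a fixed format:

```python
import time
from typing import Optional

seen_nonces: set = set()  # per-runtime replay cache

def check_freshness(body: dict, now: Optional[float] = None,
                    staleness_budget_s: float = 10.0) -> bool:
    """Reject replayed, expired, or stale tokens before execution."""
    now = time.time() if now is None else now
    if body["nonce"] in seen_nonces:
        return False  # token replay: each nonce is good for one use
    if now > body["exp"]:
        return False  # ttl expired
    if now - body["issued_at"] > staleness_budget_s:
        return False  # world state may have moved since reasoning time
    seen_nonces.add(body["nonce"])
    return True
```

The staleness budget is distinct from the ttl: the ttl bounds how long a token is valid at all, while the staleness budget bounds the gap between the reasoning that justified the action and the action itself.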
The ZIVIS Position
- Authorization is the mediator's job, not the model's. The component that decides what the model can do this turn must not be the model. The planner issues capabilities; the worker consumes them; the runtime enforces them. Three different components, three different trust levels.
- Sign with keys the worker doesn't possess. If the worker can mint its own tokens, the tokens are theater. The signing key lives in the planner's process and is never reachable from the worker's prompt context, tool surface, or memory.
- Scope every dimension that matters. Tool name. Argument shape. Argument values where they matter (recipient domain, file path prefix, query allowlist). Principal. TTL. Staleness budget. Each unscoped dimension is an attack surface.
- Tokens are non-transferable. A token issued for worker A cannot be used by worker B. In multi-agent setups this is the only thing standing between an authorized capability and an unauthorized one.
- Runtime verification is non-negotiable. If the tool runtime accepts unsigned invocations "as a fallback," you have no enforcement. The runtime checks the signature before any side-effecting code path runs.
- Issuance is auditable. Every token issued by the planner is logged with the reasoning that produced it — the prompt, retrieved context, decision trace. After-the-fact analysis is how you find the chain that should not have been authorized.
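The last point, auditable issuance, can be as simple as an append-only JSON-lines log written by the planner at mint time. The names here (`log_issuance`, `decision_trace`) are illustrative:

```python
import json
import time

def log_issuance(audit_path: str, token: dict, decision_trace: str) -> None:
    """Planner side: one JSON line per issued token — the what and the why."""
    entry = {
        "ts": time.time(),
        "token_body": token["body"],  # what was authorized
        "sig": token["sig"],
        "trace": decision_trace,      # prompt / retrieved context / decision
    }
    with open(audit_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

An append-only line per token means the audit record is exactly the set of authorizations that existed, independent of what the worker later did with them.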
What We Tell Clients
If the model in your system can decide which tool to call, with what arguments, against what principal, with no signed authorization the runtime checks before execution — you have a confused deputy with write access. The model is a reasoner, not an authorizer. Treating it as an authorizer is what makes prompt injection a security incident instead of an annoyance.
The pattern is a port from event-driven architecture, where consumer authorization was solved decades ago. The work is making it native to agent runtimes that currently ship without it.
The model should be unable to act on behalf of the user without a token the model could not have produced.
Related Patterns
- Privilege Separation — the broader pattern; capability tokens are the enforcement mechanism
- Sandboxing — constrains what tools can do; capability tokens constrain when they're called
- MCP (Model Context Protocol) — current MCP implementations lack capability-token semantics
- Dual-LLM (Privileged/Sandboxed) — the planner/worker split that capability tokens make enforceable
- Plan-and-Execute — natural home for token issuance: the planner step
- Event-Driven Architecture — the discipline this pattern is ported from
- Audit Logging — issued tokens are the audit record of what was authorized