Plans Are Just Deferred Injections
Why separating planning from execution doesn't solve the trust problem
The Conventional Framing
Plan-and-Execute separates high-level planning from step-by-step execution. First, the model creates a complete plan. Then, a separate execution phase carries out each step. This provides structure and allows for plan review before execution begins.
The pattern is seen as more controllable than ReAct—you can inspect and approve the plan before anything happens.
Why This Is Insufficient
The plan is generated from untrusted input. If the user query contains an injection, the plan will encode that injection. You're not reviewing a clean plan—you're reviewing a potentially poisoned one.
Worse, plan approval creates false confidence. Humans approve plans without fully understanding their implications. A plan that looks benign, such as "1. Search for files, 2. Read contents, 3. Summarize findings", can still execute maliciously depending on which files are read and what "summarize" means in context.
The execution phase trusts the plan blindly:
- Plan steps become instructions. Each step in the plan is executed as if it were a trusted command. The execution agent doesn't re-verify intent.
- Plan modification attacks. If the plan is stored or passed between components, it can be modified. Unless the plan is authenticated in some way, the execution phase cannot verify its integrity.
- Semantic ambiguity. Natural language plans have ambiguous steps. "Process the user data" could mean many things—the execution agent fills in gaps based on context that may be compromised.
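One way to stop the executor from treating a plan as a trusted command is to carry a trust label on the plan itself, derived from the inputs that produced it. The sketch below is illustrative only: the `Trust`, `Plan`, `make_plan`, and `needs_step_checks` names are hypothetical and not part of any particular framework.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    # Hypothetical two-level scheme; real systems may need more levels.
    UNTRUSTED = 0  # e.g. user query, web content, tool output
    TRUSTED = 1    # e.g. operator-authored configuration


@dataclass(frozen=True)
class Plan:
    steps: tuple[str, ...]
    trust: Trust


def make_plan(steps: list[str], input_trust: list[Trust]) -> Plan:
    # The plan is only as trustworthy as its least trusted input.
    # With no declared inputs, assume the worst.
    level = min(input_trust, default=Trust.UNTRUSTED)
    return Plan(steps=tuple(steps), trust=level)


def needs_step_checks(plan: Plan) -> bool:
    # An untrusted plan must have every step re-verified at execution time.
    return plan.trust is Trust.UNTRUSTED
```

Note that nothing in this sketch raises the label after a human review. That is deliberate: approval is a chance to catch obvious attacks, not a way to turn untrusted input into trusted instructions.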
Architecture
Components:
- Planning agent: generates high-level plan from query
- Plan representation: structured or natural language steps
- Review gate: optional human or automated approval
- Execution agent: carries out plan steps sequentially
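As an illustration of the structured option for the plan representation, the sketch below parses planner output into typed steps and rejects anything outside a fixed schema. The action names and the `parse_step` helper are assumptions made for this example, not a prescribed vocabulary.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Hypothetical allow-list of actions the executor understands.
    SEARCH_FILES = "search_files"
    READ_FILE = "read_file"
    SUMMARIZE = "summarize"


@dataclass(frozen=True)
class Step:
    action: Action
    target: str  # explicit path or pattern, never free-form intent


def parse_step(raw: dict) -> Step:
    """Turn one planner-emitted dict into a typed step, or raise ValueError."""
    if set(raw) != {"action", "target"}:
        raise ValueError(f"unexpected fields: {sorted(raw)}")
    try:
        action = Action(raw["action"])
    except ValueError:
        raise ValueError(f"unknown action: {raw['action']!r}") from None
    target = raw["target"]
    if not isinstance(target, str) or not target.strip():
        raise ValueError("target must be a non-empty string")
    return Step(action, target)
```

A schema like this narrows what a step can mean, but it does not make the plan trustworthy: a well-formed `read_file` step can still point at a file the user should never see.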
Trust Boundaries
- Query → Planning agent — injection enters the planning process
- Plan → Execution agent — plan is trusted without verification
- Step → Tool call — ambiguous steps interpreted by executor
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Plan poisoning | Injection in query encodes malicious steps in plan | Approved plan contains unauthorized operations |
| Plan modification | Plan altered between generation and execution | Executed plan differs from approved plan |
| Semantic ambiguity exploitation | Vague plan steps interpreted maliciously | Execution differs from reviewer's understanding |
| Review fatigue | Long or complex plans exceed reviewer attention | Malicious steps hidden in legitimate-looking plans |
| Step injection | Tool results modify subsequent step interpretation | Dynamic plan corruption during execution |
The ZIVIS Position
- Plans inherit input trust level. A plan generated from untrusted input is itself untrusted. Review doesn't cleanse a poisoned plan; it just gives you a chance to catch obvious attacks.
- Cryptographic plan integrity. If plans pass between components, sign them. The execution agent should verify the plan it receives matches what was approved.
- Concrete over abstract. Plans with ambiguous steps ('process the data') are more exploitable than concrete ones ('read file X, extract field Y'). Require specificity.
- Per-step authorization. Don't authorize the entire plan upfront. Each step execution should verify it's still within authorized scope. Context changes during execution.
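The plan integrity point can be sketched with Python's standard library. This example uses an HMAC, which is a shared-key message authentication code rather than a public-key signature; it assumes the review gate and the execution agent share a secret key that the planner and any intermediate storage cannot read. The function names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(plan: list[dict]) -> bytes:
    # Deterministic serialization so the same plan always yields the same bytes.
    return json.dumps(plan, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_plan(plan: list[dict], key: bytes) -> str:
    # Called by the review gate once the plan has been approved.
    return hmac.new(key, canonical_bytes(plan), hashlib.sha256).hexdigest()


def verify_plan(plan: list[dict], tag: str, key: bytes) -> bool:
    # Called by the execution agent before running anything.
    expected = sign_plan(plan, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

This only shows that the executed plan is the approved plan. It says nothing about whether the approved plan was safe in the first place, which is why it sits alongside the other measures rather than replacing them.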
What We Tell Clients
Plan-and-Execute gives you a false sense of control. You're reviewing a plan generated from untrusted input, and the execution phase trusts that plan completely.
If you use this pattern, treat plan review as a defense layer, not a security guarantee. Implement per-step authorization and don't let the execution agent have more authority than the individual steps require.
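A minimal sketch of per-step authorization follows, assuming steps are `(action, target)` pairs and that scope is expressed as an allow-list of actions plus a single permitted directory root. The `Scope`, `authorize`, and `run_plan` names are hypothetical, and the path check is lexical only: it does not resolve symlinks, so a real deployment would need a stronger check against the actual filesystem.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Scope:
    allowed_actions: frozenset[str]
    allowed_root: PurePosixPath


def authorize(action: str, target: str, scope: Scope) -> bool:
    """Check one step against the scope at the moment it is about to run."""
    if action not in scope.allowed_actions:
        return False
    path = PurePosixPath(target)
    if ".." in path.parts:  # reject traversal out of the allowed root
        return False
    return path.is_relative_to(scope.allowed_root)  # Python 3.9+


def run_plan(steps, scope, tools):
    results = []
    for action, target in steps:
        # Authorization happens per step, not once for the whole plan.
        if not authorize(action, target, scope):
            raise PermissionError(f"step out of scope: {action} {target}")
        # Invoke only the one tool this step needs.
        results.append(tools[action](target))
    return results
```

Because the check runs immediately before each step, a plan that was approved as a whole still stops at the first step that falls outside the scope, and earlier results are not enough to widen that scope.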
Related Patterns
- ReAct: interleaved alternative without explicit planning phase
- Human-in-the-Loop: checkpoints that suffer from the same approval problems
- Hierarchical Task Networks: goal decomposition with similar trust issues