Examples Are Injection Vectors
Why in-context examples teach the model to follow patterns—including malicious ones
The Conventional Framing
Few-shot prompting places examples of desired input-output pairs before the actual query. The model infers the pattern from the examples and applies it to new inputs.
The technique is effective for teaching format, style, and task structure without fine-tuning.
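Mechanically, few-shot assembly is usually just string concatenation. A minimal sketch, with a hypothetical sentiment task and prompt layout (real templates vary):

```python
# Minimal sketch of few-shot prompt assembly. The task, example pairs,
# and layout are hypothetical illustrations.
EXAMPLES = [
    ("great product, works perfectly", "positive"),
    ("arrived broken, waste of money", "negative"),
]

def build_few_shot_prompt(query: str) -> str:
    """Prepend input/output demonstrations, then leave the final output open."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in EXAMPLES]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)
```

Everything in `EXAMPLES` ends up verbatim in the prompt, which is why its provenance matters.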
Why Examples Are Trust Assumptions
Few-shot examples are instructions by demonstration. If you trust the examples, the model will follow them. If an attacker can inject or modify examples, the model will follow those too.
The examples don't just show format—they establish what behavior is expected. Poisoned examples establish poisoned expectations.
The pattern-matching trap:
Models are excellent at pattern matching. If your examples contain a subtle pattern you didn't intend (or an attacker inserted), the model will extract and follow it.
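A crude way to see this mechanically: any regularity shared by the example outputs is a candidate pattern. This hypothetical check finds a common suffix across outputs, the kind of artifact a model will happily replicate:

```python
import os

def common_suffix(outputs: list[str]) -> str:
    """Longest suffix shared by every output string: a crude proxy for a
    formatting pattern the model will extract and repeat."""
    return os.path.commonprefix([s[::-1] for s in outputs])[::-1]

# Hypothetical poisoned examples: every output ends with the same tracker URL,
# teaching the model to append it to its own answers.
outputs = [
    "Paris is the capital. (src: https://evil.example/x)",
    "Water boils at 100C. (src: https://evil.example/x)",
]
```

Here `common_suffix(outputs)` surfaces the shared URL tail; a human reviewer skimming each example individually might not.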
Architecture
Components:
- Example selection—chooses which examples to include
- Example formatting—structures examples in the prompt
- Query integration—combines examples with the actual query
- Pattern inference—model learns from the examples
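These components can be read as a small pipeline. In this hypothetical sketch, only the selection step touches potentially untrusted data, which is exactly where the trust boundaries sit:

```python
# Hard-coded, reviewed demonstrations (the trusted case).
TRUSTED_EXAMPLES = [("2+2", "4"), ("3+5", "8")]

def select_examples(pool, k=2):
    # Example selection: which demonstrations to include. If `pool` is
    # built from user data or retrieved content, it is untrusted input.
    return pool[:k]

def format_examples(pairs):
    # Example formatting: structure the demonstrations in the prompt.
    return "\n".join(f"Q: {q}\nA: {a}" for q, a in pairs)

def integrate_query(formatted, query):
    # Query integration: combine demonstrations with the actual query.
    return f"{formatted}\nQ: {query}\nA:"

prompt = integrate_query(format_examples(select_examples(TRUSTED_EXAMPLES)), "7+1")
```

Pattern inference happens inside the model, beyond this code's control; the pipeline only decides what the model gets to infer from.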
Trust Boundaries
- Example source → Prompt — where do examples come from?
- Examples → Pattern — model extracts unintended patterns
- Pattern → Output — extracted pattern applied to query
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Example injection | Insert malicious examples into the prompt | Model follows attacker-provided patterns |
| Pattern poisoning | Craft examples that teach unintended behaviors | Model learns to leak data, ignore constraints |
| Example overflow | Flood with examples that contradict system instructions | Examples override intended behavior |
| Implicit pattern extraction | Model infers patterns you didn't intend from examples | Unexpected behaviors based on example artifacts |
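One mitigation for the table above is to screen dynamically sourced examples before they enter the prompt. A minimal sketch; the keyword pattern and URL heuristic are illustrative, not a complete filter:

```python
import re

# Illustrative heuristics only: instruction-override phrases and URLs
# (a common exfiltration channel). A real deny-list would be broader.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|system prompt|https?://",
    re.IGNORECASE,
)

def screen_example(inp: str, out: str) -> bool:
    """Reject example pairs containing instruction-like or exfiltration
    patterns. This raises the bar; it does not close the hole."""
    return not (SUSPICIOUS.search(inp) or SUSPICIOUS.search(out))
```

Screening is a backstop for example injection and pattern poisoning; it does nothing against overflow or implicit patterns, which need count limits and review respectively.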
The ZIVIS Position
- Examples are instructions. Treat example selection and formatting with the same security posture as system prompts. They directly influence model behavior.
- Validate example sources. If examples come from dynamic sources (user data, retrieved content), they're potential injection vectors.
- Audit for unintended patterns. Review examples not just for correctness but for what patterns they might teach. The model will extract patterns you didn't consciously include.
- Consider example isolation. In high-security contexts, hard-code examples rather than constructing them dynamically. Reduce the attack surface.
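Example isolation can be enforced rather than just intended. One hypothetical approach: pin the reviewed example block to a digest, so tampering anywhere in the deployment path fails loudly at load time:

```python
import hashlib

# Hard-coded, reviewed demonstrations (hypothetical routing task).
APPROVED_EXAMPLES = (
    "Input: refund request\nOutput: route to billing",
    "Input: password reset\nOutput: route to identity",
)
# In a real pipeline this digest is recorded at review time, not computed
# alongside the data as it is in this self-contained sketch.
APPROVED_DIGEST = hashlib.sha256("\n\n".join(APPROVED_EXAMPLES).encode()).hexdigest()

def load_examples() -> str:
    """Return the example block only if it still matches the reviewed digest."""
    block = "\n\n".join(APPROVED_EXAMPLES)
    if hashlib.sha256(block.encode()).hexdigest() != APPROVED_DIGEST:
        raise RuntimeError("few-shot examples changed since review")
    return block
```

The design choice here is fail-closed: a mismatch stops the request rather than silently running with altered examples.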
What We Tell Clients
Few-shot examples are powerful because they directly shape model behavior. That same power makes them dangerous if their source is untrusted.
Treat example selection as a security decision. Hard-code examples where possible. When examples must be dynamic, validate them as carefully as you would any input that becomes instructions.
Related Patterns
- Zero-Shot CoT—reasoning without examples
- Prompt Chaining—multi-stage prompts with similar issues