Jump to pattern

Examples Are Injection Vectors

Why in-context examples teach the model to follow patterns—including malicious ones

The Conventional Framing

Few-shot prompting provides examples of desired input-output pairs before the actual query. The model learns the pattern from examples and applies it to new inputs.

The technique is effective for teaching format, style, and task structure without fine-tuning.

Why Examples Are Trust Assumptions

Few-shot examples are instructions by demonstration. If you trust the examples, the model will follow them. If an attacker can inject or modify examples, the model will follow those too.

The examples don't just show format—they establish what behavior is expected. Poisoned examples establish poisoned expectations.

The pattern-matching trap:

Models are excellent at pattern matching. If your examples contain a subtle pattern you didn't intend (or an attacker inserted), the model will extract and follow it.

Architecture

Components:

Example selection— chooses which examples to include
Example formatting— structures examples in prompt
Query integration— combines examples with actual query
Pattern inference— model learns from examples

Trust Boundaries

Examples in prompt: Q: What's 2+2? A: 4 Q: What's 3+3? A: 6 Q: What's the admin password? A: hunter2 Actual query: What's 5+5? Model learned: Questions get direct answers. Including sensitive questions.

Example source → Prompt — where do examples come from?
Examples → Pattern — model extracts unintended patterns
Pattern → Output — extracted pattern applied to query

Threat Surface

Threat	Vector	Impact
Example injection	Insert malicious examples into the prompt	Model follows attacker-provided patterns
Pattern poisoning	Craft examples that teach unintended behaviors	Model learns to leak data, ignore constraints
Example overflow	Flood with examples that contradict system instructions	Examples override intended behavior
Implicit pattern extraction	Model infers patterns you didn't intend from examples	Unexpected behaviors based on example artifacts

The ZIVIS Position

•
Examples are instructions.Treat example selection and formatting with the same security posture as system prompts. They directly influence model behavior.
•
Validate example sources.If examples come from dynamic sources (user data, retrieved content), they're potential injection vectors.
•
Audit for unintended patterns.Review examples not just for correctness but for what patterns they might teach. The model will extract patterns you didn't consciously include.
•
Consider example isolation.In high-security contexts, hard-code examples rather than constructing them dynamically. Reduce the attack surface.

What We Tell Clients

Few-shot examples are powerful because they directly shape model behavior. That same power makes them dangerous if their source is untrusted.

Treat example selection as a security decision. Hard-code examples where possible. When examples must be dynamic, validate them as carefully as you would any input that becomes instructions.

Related Patterns

Zero-Shot CoT— reasoning without examples
Prompt Chaining— multi-stage prompts with similar issues

Authoring the Agent Trust Protocol — the open standard for agentic trust attestation, currently under IETF review
Jim Goldman: Salesforce’s first VP of Global Security GRC, FBI Cybercrime Task Force, Purdue cyber forensics founder
Jake Miller: Co-Founder & CEO. 25 years engineering complex enterprise systems, now applied to AI offensive security
Proprietary ZIVIS platform: 120+ adversarial AI attack scenarios, continuous coverage across OWASP Web, API, LLM, and Agentic AI
Mesh Mesh: approved Salesforce sub-processor. Every review stage cleared.