Reference Chains Are Trust Chains
Why multi-hop retrieval that follows references creates manipulable paths
The Conventional Framing
Recursive retrieval follows references from retrieved documents to retrieve additional documents. If Document A references Document B, both are retrieved. This enables multi-hop reasoning across connected documents.
The pattern is useful for document networks where relationships matter.
Why This Is Risky
Each hop in the retrieval chain is a trust decision. References can be manipulated to point to malicious content. Following references is following instructions embedded in documents.
An attacker who can insert a document into your corpus can create references that lead retrieval to their payload.
Architecture
Components:
- Initial retrieval: the first documents matching the query
- Reference extractor: finds references in retrieved documents
- Recursive retriever: follows those references to more documents
- Hop limit: stops traversal after N hops
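The components above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the in-memory `CORPUS`, the `[[id]]` reference syntax, and the function names are all assumptions made for the example, and initial retrieval is stubbed out as a list of seed IDs. Note that this baseline follows every reference it finds, which is exactly the behavior the rest of this page treats as risky.

```python
import re
from collections import deque

# Hypothetical in-memory corpus: document ID -> text.
CORPUS = {
    "a": "Overview. See [[b]] and [[c]].",
    "b": "Details. See [[d]].",
    "c": "Appendix.",
    "d": "Deep content. See [[e]].",
    "e": "Deeper content.",
}

# Assumed reference syntax for this sketch: [[document-id]].
REF_PATTERN = re.compile(r"\[\[([^\]]+)\]\]")


def extract_references(text):
    """Reference extractor: return the IDs referenced in a document."""
    return REF_PATTERN.findall(text)


def recursive_retrieve(seed_ids, corpus, max_hops):
    """Breadth-first retrieval that follows references up to max_hops.

    seed_ids stands in for the initial retrieval step.
    Returns a dict of document ID -> hop at which it was first reached.
    """
    reached = {}
    queue = deque((doc_id, 0) for doc_id in seed_ids)
    while queue:
        doc_id, hop = queue.popleft()
        if doc_id in reached or doc_id not in corpus:
            continue
        reached[doc_id] = hop
        # Hop limit: only expand documents that are below the limit.
        if hop < max_hops:
            for ref in extract_references(corpus[doc_id]):
                queue.append((ref, hop + 1))
    return reached


result = recursive_retrieve(["a"], CORPUS, max_hops=2)
```

With `max_hops=2`, document `e` is never reached, but everything up to hop 2 is pulled in purely because another document said so.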
Trust Boundaries
- Query → First hop: initial retrieval, driven by the user's query
- Document → References: references extracted from a document are untrusted
- Reference → Next hop: following a reference means trusting it
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Reference manipulation | Plant documents with references to malicious content | Legitimate queries lead to attack payloads |
| Hop depth exploitation | Deep chains are harder to audit | Injections hidden in later hops |
| Path pollution | Many reference paths lead to the same malicious document | High probability of hitting attacker content |
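One way to look for path pollution is to count how many distinct documents point at each target. The sketch below assumes the reference graph is already available as `(source_id, target_id)` pairs; the edge list, the function names, and the threshold are hypothetical. A high inbound count is not proof of an attack, since popular documents are also heavily referenced, so treat the output as a review queue rather than a verdict.

```python
from collections import defaultdict


def inbound_reference_sources(edges):
    """Map each target ID to the set of distinct documents referencing it."""
    sources = defaultdict(set)
    for source_id, target_id in edges:
        sources[target_id].add(source_id)
    return sources


def flag_heavily_referenced(edges, threshold):
    """Return target IDs referenced by at least `threshold` distinct documents."""
    sources = inbound_reference_sources(edges)
    return sorted(
        target for target, srcs in sources.items() if len(srcs) >= threshold
    )


# Hypothetical reference graph as (source, target) pairs.
EDGES = [
    ("a", "payload"),
    ("b", "payload"),
    ("c", "payload"),
    ("a", "b"),
    ("c", "d"),
]

flagged = flag_heavily_referenced(EDGES, threshold=3)
```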
The ZIVIS Position
- References are untrusted instructions. Following a reference is doing what the document says. Documents are untrusted input, and so are their references.
- Limit hop depth strictly. Every hop increases attack surface. Minimize hops and audit what comes from deeper hops more carefully.
- Validate reference targets. Before following a reference, verify the target is appropriate. Don't follow references to arbitrary content.
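A rough sketch of how these three points might combine in a retriever follows. Everything here is an assumption made for illustration: the `Document` shape, the `source` provenance label, the `TRUSTED_SOURCES` allowlist, and the choice of `MAX_HOPS = 1`. What counts as an "appropriate" target depends on the deployment; a source allowlist is only one possible check. Seed documents are assumed to come from the initial retrieval step and are not validated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str        # hypothetical provenance label
    references: tuple  # IDs this document points to


# Assumed policy values for this sketch.
TRUSTED_SOURCES = {"internal-wiki", "policy-repo"}
MAX_HOPS = 1


def is_allowed_target(doc):
    """Validate a reference target before it is followed."""
    return doc.source in TRUSTED_SOURCES


def validated_retrieve(seed_ids, corpus):
    """Follow references only to validated targets, within MAX_HOPS.

    Returns (reached, rejected): reached maps doc ID -> hop,
    rejected lists targets that failed validation.
    """
    reached, rejected = {}, []
    frontier = [(doc_id, 0) for doc_id in seed_ids]
    while frontier:
        doc_id, hop = frontier.pop(0)
        doc = corpus.get(doc_id)
        if doc is None or doc_id in reached or doc_id in rejected:
            continue
        # Every document reached via a reference must pass validation.
        if hop > 0 and not is_allowed_target(doc):
            rejected.append(doc_id)
            continue
        reached[doc_id] = hop
        if hop < MAX_HOPS:
            frontier.extend((ref, hop + 1) for ref in doc.references)
    return reached, rejected


CORPUS = {
    "policy": Document("policy", "policy-repo", ("handbook", "upload-7")),
    "handbook": Document("handbook", "internal-wiki", ("archive",)),
    "upload-7": Document("upload-7", "user-upload", ("payload",)),
    "archive": Document("archive", "internal-wiki", ()),
    "payload": Document("payload", "user-upload", ()),
}

reached, rejected = validated_retrieve(["policy"], CORPUS)
```

In this toy corpus the user-uploaded document is rejected at hop 1, so its reference to `payload` is never extracted, and the strict hop limit keeps `archive` out even though its source is trusted. Logging `rejected` gives auditors a record of which references were refused.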
What We Tell Clients
Recursive retrieval follows instructions embedded in documents. Each reference is a potential redirect to attacker-controlled content.
Limit hop depth, validate reference targets, and be especially suspicious of content retrieved in later hops. The further from the original query, the more opportunities for manipulation.
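Suspicion of later hops can be made concrete by discounting a document's score by its distance from the query. The following is a minimal sketch under stated assumptions: the geometric decay, the default of `0.5`, the function names, and the relevance scores are illustrative choices, not recommended values.

```python
def hop_weight(hop, decay=0.5):
    """Trust multiplier that shrinks geometrically with hop distance."""
    if hop < 0:
        raise ValueError("hop must be non-negative")
    return decay ** hop


def rank_with_hop_discount(candidates, decay=0.5):
    """Rank (doc_id, relevance, hop) tuples by hop-discounted relevance."""
    scored = [
        (relevance * hop_weight(hop, decay), doc_id)
        for doc_id, relevance, hop in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Hypothetical candidates: later hops look more relevant on raw score.
candidates = [
    ("seed", 0.70, 0),
    ("hop1", 0.90, 1),
    ("hop2", 0.95, 2),
]

ranking = rank_with_hop_discount(candidates)
```

Here the directly retrieved document outranks the hop-1 and hop-2 documents despite their higher raw relevance, so content reached through longer chains has to earn its place.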
Related Patterns
- Graph RAG: structured traversal with similar issues
- Parent-Child Chunking: expansion without explicit references