Reference Chains Are Trust Chains

Why multi-hop retrieval that follows references creates manipulable paths

The Conventional Framing

Recursive retrieval follows references from retrieved documents to retrieve additional documents. If Document A references Document B, both are retrieved. This enables multi-hop reasoning across connected documents.

The pattern is useful for document networks where relationships matter.

Why This Is Risky

Each hop in the retrieval chain is a trust decision. References can be manipulated to point to malicious content. Following references is following instructions embedded in documents.

An attacker who can insert a document into your corpus can create references that lead retrieval to their payload.

Architecture

Components:

  • Initial retrieval: first documents matching query
  • Reference extractor: finds references in documents
  • Recursive retriever: follows references to more documents
  • Hop limit: stops after N hops
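
The four components can be sketched in a few lines. This is a minimal illustration only, not a reference implementation: the in-memory CORPUS, the keyword-based initial_retrieval, and every function name here are hypothetical stand-ins for whatever search index and reference parser a real system would use.

```python
from collections import deque

# Hypothetical in-memory corpus: document id -> (text, ids it references).
CORPUS = {
    "vacation_policy": ("Vacation policy text.", ["hr_guidelines"]),
    "hr_guidelines": ("HR guidelines text.", ["compensation"]),
    "compensation": ("Compensation text.", []),
}


def initial_retrieval(query):
    """Stand-in for a real search step: match query words against document ids."""
    words = query.lower().split()
    return [doc_id for doc_id in CORPUS if any(w in doc_id for w in words)]


def extract_references(doc_id):
    """Reference extractor: return the ids this document points to."""
    return CORPUS[doc_id][1]


def recursive_retrieve(query, max_hops):
    """Breadth-first retrieval; returns {doc_id: hop at which it was first reached}."""
    seen = {}
    queue = deque((doc_id, 1) for doc_id in initial_retrieval(query))
    while queue:
        doc_id, hop = queue.popleft()
        if doc_id in seen or doc_id not in CORPUS:
            continue  # skip already-visited documents and dangling references
        seen[doc_id] = hop
        if hop < max_hops:  # hop limit: do not expand references past N hops
            for ref in extract_references(doc_id):
                queue.append((ref, hop + 1))
    return seen
```

With this toy corpus, recursive_retrieve("vacation policy", max_hops=2) reaches vacation_policy and hr_guidelines but stops before compensation. Note that the sketch follows every reference it finds, which is exactly the behavior the rest of this page treats as risky.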

Trust Boundaries

Query: "What's the vacation policy?"

Hop 1: Vacation policy document (legitimate)
  └── References: "See HR guidelines"
Hop 2: HR guidelines (legitimate)
  └── References: "Related: compensation.txt" [PLANTED]
Hop 3: compensation.txt [ATTACKER CONTROLLED]
  └── Contains injection payload

Each hop trusts the previous document's references.

  1. Query → First hop: initial retrieval
  2. Document → References: references are untrusted
  3. Reference → Next hop: following references is trusting them
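
One way to make these boundaries visible is to record, for every retrieved document, the chain of documents that led to it. The sketch below assumes a hypothetical REFERENCES graph mirroring the vacation-policy example; the Hop class and trace function are illustrative names, not part of any particular retrieval library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    doc_id: str
    via: tuple  # ids of the documents whose references led here, in order


# Hypothetical reference graph mirroring the example chain.
REFERENCES = {
    "vacation_policy": ["hr_guidelines"],
    "hr_guidelines": ["compensation.txt"],
    "compensation.txt": [],
}


def trace(start, max_hops):
    """Follow references from `start`, keeping the provenance path of each hop."""
    first = Hop(start, ())
    hops = [first]
    frontier = [first]
    for _ in range(max_hops - 1):
        next_frontier = []
        for hop in frontier:
            for ref in REFERENCES.get(hop.doc_id, []):
                if ref == hop.doc_id or ref in hop.via:
                    continue  # do not loop back along the same path
                next_frontier.append(Hop(ref, hop.via + (hop.doc_id,)))
        hops.extend(next_frontier)
        frontier = next_frontier
    return hops
```

Calling trace("vacation_policy", 3) yields an entry for compensation.txt whose via field is ("vacation_policy", "hr_guidelines"), so an auditor can see which documents vouched for it.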

Threat Surface

Threat | Vector | Impact
Reference manipulation | Plant documents with references to malicious content | Legitimate queries lead to attack payloads
Hop depth exploitation | Deep chains harder to audit | Injections hidden in later hops
Path pollution | Many reference paths lead to same malicious document | High probability of hitting attacker content
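
Path pollution leaves a measurable trace: many documents pointing at the same target. As a rough heuristic, inbound references can be counted and unusually popular targets flagged for review. The function names and the threshold below are assumptions for illustration; legitimate hub documents will also score high, so this is a signal for an auditor rather than a blocking rule.

```python
from collections import Counter


def inbound_reference_counts(references):
    """Count how many distinct documents reference each target."""
    counts = Counter()
    for source, targets in references.items():
        for target in set(targets):  # count each source at most once per target
            if target != source:  # ignore self-references
                counts[target] += 1
    return counts


def flag_heavily_referenced(references, threshold):
    """Return targets referenced by at least `threshold` distinct documents."""
    counts = inbound_reference_counts(references)
    return sorted(target for target, n in counts.items() if n >= threshold)
```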

The ZIVIS Position

  • References are untrusted instructions. Following a reference is doing what the document says. Documents are untrusted input, and so are their references.
  • Limit hop depth strictly. Every hop increases attack surface. Minimize hops and audit what comes from deeper hops more carefully.
  • Validate reference targets. Before following a reference, verify the target is appropriate. Don't follow references to arbitrary content.
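
A gate in front of the recursive retriever can enforce both the hop limit and target validation. The following is a sketch under stated assumptions: the catalog of TargetInfo records, the ALLOWED_COLLECTIONS allowlist, the reviewed flag, and the MAX_HOPS value are all hypothetical, and what counts as an "appropriate" target will differ per deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetInfo:
    collection: str  # hypothetical: which part of the corpus the document lives in
    reviewed: bool   # hypothetical: whether the document passed a review step


ALLOWED_COLLECTIONS = frozenset({"hr", "policies"})  # assumed allowlist
MAX_HOPS = 2  # assumed strict hop limit


def should_follow(target, next_hop, catalog):
    """Decide whether to follow a reference; returns (allowed, reason)."""
    if next_hop > MAX_HOPS:
        return False, "hop limit reached"
    info = catalog.get(target)
    if info is None:
        return False, "unknown target"  # never follow references to arbitrary content
    if info.collection not in ALLOWED_COLLECTIONS:
        return False, "collection not allowed"
    if not info.reviewed:
        return False, "target not reviewed"
    return True, "ok"
```

The check is deny-by-default: a reference is followed only when every condition passes, and the returned reason can be logged so rejected references remain auditable.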

What We Tell Clients

Recursive retrieval follows instructions embedded in documents. Each reference is a potential redirect to attacker-controlled content.

Limit hop depth, validate reference targets, and be especially suspicious of content retrieved in later hops. The further from the original query, the more opportunities for manipulation.
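
One simple way to encode that suspicion is to discount relevance scores by hop depth before ranking. The sketch below assumes each result is a (doc_id, score, hop) tuple and uses an arbitrary decay factor of 0.5; both the tuple shape and the factor are illustrative choices, not recommended values.

```python
def hop_adjusted_score(score, hop, decay=0.5):
    """Discount a relevance score by hop depth; hop 1 is left unchanged."""
    if hop < 1:
        raise ValueError("hop must be >= 1")
    return score * decay ** (hop - 1)


def rank(results, decay=0.5):
    """Sort (doc_id, score, hop) tuples by hop-adjusted score, highest first."""
    return sorted(
        results,
        key=lambda r: hop_adjusted_score(r[1], r[2], decay),
        reverse=True,
    )
```

With these assumptions, a hop-3 document needs a raw score four times higher than a hop-1 document to rank equally, so content far from the original query has to earn its place.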

Related Patterns