Different Methods, Different Injection Profiles

Why combining dense and sparse retrieval creates a union of vulnerabilities

The Conventional Framing

Fusion retrieval combines dense (vector) and sparse (BM25/keyword) retrieval, merging results via reciprocal rank fusion or other algorithms. Different methods find different relevant documents.

The pattern improves recall by leveraging complementary retrieval approaches.
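As a rough illustration of the merge step, here is a minimal sketch of reciprocal rank fusion. The function name, the document IDs, and the constant `k=60` (a commonly used default) are assumptions for the example, not details from any particular system.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs (best first) into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical outputs of the two retrievers for the same query.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
fused_ids = [doc_id for doc_id, _ in fused]
```

Note that `doc_c` and `doc_d` each appear in only one list yet still make it into the fused ranking. That is the recall benefit, and it is also why a document needs to convince only one retriever to be included.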

Why This Unions Attack Vectors

Dense and sparse retrieval have different vulnerabilities. Fusion doesn't pick the more secure one; it combines results, so an attack against either method can reach the fused output.

You don't get the intersection of safety properties. You get the union of attack vectors.
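The intersection-versus-union point can be made concrete with plain set operations. This is a toy sketch: the document IDs are invented, and `doc_poisoned` stands in for a hypothetical document that fools only the sparse retriever.

```python
# Hypothetical hits from each retriever for the same query.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_poisoned", "doc_a"]  # keyword-stuffed doc fools sparse only

# If fusion required agreement, only documents in BOTH lists would survive.
intersection = set(dense_hits) & set(sparse_hits)

# Fusion instead draws candidates from EITHER list.
union = set(dense_hits) | set(sparse_hits)

poisoned_in_intersection = "doc_poisoned" in intersection  # False
poisoned_in_union = "doc_poisoned" in union                # True
```

The poisoned document never had to defeat the dense retriever. Appearing in one list was enough to enter the candidate pool.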

Architecture

Components:

  • Dense retrieval: embedding-based semantic search
  • Sparse retrieval: keyword/term-based search
  • Fusion algorithm: combines and ranks results

Trust Boundaries

Attack vectors by method:

  • Dense retrieval: adversarial embeddings, semantic similarity gaming
  • Sparse retrieval: keyword stuffing, term injection

Fusion results include attacks from both methods. Security = union of attack surfaces.

  1. Query → Dense: dense-specific attacks
  2. Query → Sparse: sparse-specific attacks
  3. Fusion → Results: attacks from either method succeed

Threat Surface

Threat | Vector | Impact
Method-specific attacks | Exploit weaknesses unique to each retrieval type | Attack succeeds against whichever method is weaker
Fusion manipulation | Score high in both methods to dominate merged results | Malicious content at top of fused rankings
Attack surface union | Fusion combines vulnerabilities of all methods | More attack vectors, not fewer
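The fusion manipulation threat follows from how rank-based fusion adds up contributions. The sketch below assumes reciprocal rank fusion with `k=60`; the helper name and document IDs are hypothetical. It shows a document ranked second by both methods outscoring documents ranked first by only one.

```python
def rrf_scores(rankings, k=60):
    """Return {doc_id: fused score} using reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores

# doc_x is crafted to rank reasonably well in BOTH retrievers.
dense = ["doc_a", "doc_x", "doc_b"]
sparse = ["doc_c", "doc_x", "doc_d"]

scores = rrf_scores([dense, sparse])
top_doc = max(scores, key=scores.get)

# doc_x: 1/62 + 1/62 ≈ 0.0323, versus 1/61 ≈ 0.0164 for doc_a or doc_c.
```

Under these assumptions, an attacker does not need the top slot in either list. Moderate placement in both is enough to lead the merged ranking.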

The ZIVIS Position

  • Fusion is additive for attack surface. You're not getting the most secure method. You're exposing vulnerabilities of all methods you fuse.
  • Security requires method-specific defenses. Each retrieval method needs its own security measures. Fusion doesn't provide unified protection.
  • Consider which method to trust. When dense and sparse disagree on a result, that disagreement might be a security signal, not just relevance noise.
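To give a sense of what a method-specific defense might look like, here is one illustrative heuristic for the sparse side only: flagging text in which a single term makes up an unusually large share of the tokens. The function names, the whitespace tokenizer, and the `0.3` threshold are all assumptions made for this sketch. A real deployment would need tuning, and a separate check for the dense side.

```python
from collections import Counter

def max_term_share(text):
    """Share of tokens taken up by the single most frequent term."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    if not tokens:
        return 0.0
    _, count = Counter(tokens).most_common(1)[0]
    return count / len(tokens)

def looks_keyword_stuffed(text, threshold=0.3):
    """Hypothetical sparse-side check; the threshold is illustrative only."""
    return max_term_share(text) > threshold

stuffed = "refund refund refund refund policy refund"
normal = "our refund policy allows returns within thirty days"

stuffed_flag = looks_keyword_stuffed(stuffed)
normal_flag = looks_keyword_stuffed(normal)
```

A check like this does nothing against adversarial embeddings, which is the point of the second bullet: each method needs its own measures.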

What We Tell Clients

Fusion retrieval combines attack surfaces, not security properties. An attack that works against either method can surface in the fused results.

Implement security measures for each retrieval method independently. Consider whether disagreement between methods (one retrieves, one doesn't) should be a flag for additional scrutiny.
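One way to act on that disagreement is to record which method retrieved each document and flag the ones that only a single method returned. The sketch below is a minimal version of that idea, with a hypothetical function name and document IDs.

```python
def flag_disagreements(dense_hits, sparse_hits):
    """Return {doc_id: "dense_only" | "sparse_only"} for single-method hits."""
    dense_set, sparse_set = set(dense_hits), set(sparse_hits)
    flagged = {}
    # Symmetric difference: documents found by exactly one method.
    for doc_id in dense_set ^ sparse_set:
        flagged[doc_id] = "dense_only" if doc_id in dense_set else "sparse_only"
    return flagged

# Hypothetical retrieval results for one query.
flags = flag_disagreements(
    dense_hits=["doc_a", "doc_b", "doc_c"],
    sparse_hits=["doc_b", "doc_d"],
)
```

Disagreement is also what gives fusion its recall benefit, so a flag here is best treated as a prompt for extra scrutiny, not as evidence of an attack.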

Related Patterns