Jump to pattern

Corrective Retrieval Uses Compromised Context

Why evaluating retrieval quality and retrying doesn't help when evaluation is vulnerable

The Conventional Framing

CRAG (Corrective RAG) evaluates retrieval quality and takes corrective action. If retrieved content is low quality, it retries with different queries or sources, or falls back to generation without retrieval.

The pattern improves reliability by catching and correcting poor retrieval.

Why Correction Uses the Same Compromised Context

The model evaluating retrieval quality is in the same context as the compromised retrieval. Poisoned content can influence the evaluation, causing good content to be rejected or bad content to be accepted.

Correction also creates more opportunities for manipulation—each retry is another chance for poison to be retrieved.

Architecture

Components:

Initial retrieval— first attempt at getting content
Quality evaluator— LLM judges retrieval quality
Corrective action— retry, reformulate, or fallback
Retry loop— continues until quality acceptable

Trust Boundaries

Initial retrieval: [Poisoned content] Quality evaluation (in poisoned context): "Is this content relevant and high quality?" → Poison may influence evaluation Corrective actions: ├── Retry → More chances to hit poison ├── Reformulate → New query may be manipulated └── Fallback → Skip retrieval (maybe good?) The "correction" doesn't know what's wrong.

Retrieval → Evaluation — evaluating in poisoned context
Evaluation → Correction — correction based on compromised judgment
Retry → Retrieval — more attempts, more attack opportunities

Threat Surface

Threat	Vector	Impact
Evaluation manipulation	Poison influences quality judgment	Bad content accepted, good content rejected
Retry exploitation	Each retry is another poison opportunity	Eventually hit malicious content
Correction steering	Manipulate what corrective action is taken	Correction leads to attacker-desired behavior

The ZIVIS Position

•
Quality evaluation is not security evaluation.CRAG checks if content is good enough to use. It doesn't check if content is safe to use. Different objectives.
•
Retry limits are security relevant.More retries mean more attack opportunities. Cap retries and consider whether persistent poor retrieval is itself suspicious.
•
Independent evaluation is hard.For the evaluator to catch injections, it needs context independent of the retrieval. That's architecturally difficult.

What We Tell Clients

CRAG improves retrieval reliability, not security. The quality evaluator operates in the same context as potentially poisoned retrieval—it can be manipulated.

Limit retries to bound attack opportunities. Don't rely on quality evaluation to catch injections. If you need security checks, implement them separately with different context.

Related Patterns

Self-RAG— similar self-evaluation issues
Reflection— same pattern of self-critique failing for security

Authoring the Agent Trust Protocol — the open standard for agentic trust attestation, currently under IETF review
Jim Goldman: Salesforce’s first VP of Global Security GRC, FBI Cybercrime Task Force, Purdue cyber forensics founder
Jake Miller: Co-Founder & CEO. 25 years engineering complex enterprise systems, now applied to AI offensive security
Proprietary ZIVIS platform: 120+ adversarial AI attack scenarios, continuous coverage across OWASP Web, API, LLM, and Agentic AI
Mesh Mesh: approved Salesforce sub-processor. Every review stage cleared.