Corrective Retrieval Uses Compromised Context
Why evaluating retrieval quality and retrying doesn't help when evaluation is vulnerable
The Conventional Framing
CRAG (Corrective RAG) evaluates retrieval quality and takes corrective action. If retrieved content is low quality, it retries with different queries or sources, or falls back to generation without retrieval.
The pattern improves reliability by catching and correcting poor retrieval.
Why Correction Uses the Same Compromised Context
The model evaluating retrieval quality is in the same context as the compromised retrieval. Poisoned content can influence the evaluation, causing good content to be rejected or bad content to be accepted.
Correction also creates more opportunities for manipulation—each retry is another chance for poison to be retrieved.
Architecture
Components:
- Initial retrieval— first attempt at getting content
- Quality evaluator— LLM judges retrieval quality
- Corrective action— retry, reformulate, or fallback
- Retry loop— continues until quality acceptable
Trust Boundaries
- Retrieval → Evaluation — evaluating in poisoned context
- Evaluation → Correction — correction based on compromised judgment
- Retry → Retrieval — more attempts, more attack opportunities
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Evaluation manipulation | Poison influences quality judgment | Bad content accepted, good content rejected |
| Retry exploitation | Each retry is another poison opportunity | Eventually hit malicious content |
| Correction steering | Manipulate what corrective action is taken | Correction leads to attacker-desired behavior |
The ZIVIS Position
- •Quality evaluation is not security evaluation.CRAG checks if content is good enough to use. It doesn't check if content is safe to use. Different objectives.
- •Retry limits are security relevant.More retries mean more attack opportunities. Cap retries and consider whether persistent poor retrieval is itself suspicious.
- •Independent evaluation is hard.For the evaluator to catch injections, it needs context independent of the retrieval. That's architecturally difficult.
What We Tell Clients
CRAG improves retrieval reliability, not security. The quality evaluator operates in the same context as potentially poisoned retrieval—it can be manipulated.
Limit retries to bound attack opportunities. Don't rely on quality evaluation to catch injections. If you need security checks, implement them separately with different context.
Related Patterns
- Self-RAG— similar self-evaluation issues
- Reflection— same pattern of self-critique failing for security