Summaries Can Preserve and Concentrate Poison
Why summarizing conversation history can distill injections into persistent influence
The Conventional Framing
Conversation summary condenses conversation history into summaries instead of storing raw messages. This enables long-term context within token limits by storing the essence of past interactions.
The pattern enables conversations that span many turns without context window overflow.
Why Summarization Can Amplify Injection
The model doing the summarization operates in a context that may contain injection. The injection can influence what gets summarized and how— potentially embedding itself into the summary that persists.
A well-crafted injection can make itself seem important to summarize, while making defensive instructions seem unimportant to drop.
The distillation problem:
Summarization concentrates information. If injection is treated as important, it gets concentrated into every future context. Less text, same injection, processed on every turn.
Architecture
Components:
- Summarization model— generates summaries from history
- Summary storage— persists summarized context
- Update triggers— when to re-summarize
- Summary integration— how summary enters new calls
Trust Boundaries
- History → Summarizer — summarizer sees poisoned history
- Summarizer → Summary — injection influences summary
- Summary → Future context — poisoned summary persists
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Summary injection | Craft input that gets preserved in summary | Injection persists in concentrated form |
| Importance manipulation | Make injection seem important to summarize | Summary prioritizes malicious content |
| Constraint removal | Make safety instructions seem unimportant | Defensive context dropped from summary |
| Summary replacement | Inject instructions about what summary should contain | Attacker controls persistent context |
The ZIVIS Position
- •Summarization is model-mediated.The summary is generated by a model in potentially compromised context. The summarizer is an attack target.
- •Consider separate summarization context.Summarize in a context that doesn't include the content being summarized—if architecturally possible.
- •Validate summaries.Check generated summaries for injection markers before persisting. Unusual content in summaries is suspicious.
- •Treat summaries as untrusted.A summary is model output generated from potentially poisoned input. Don't trust it more than raw conversation.
What We Tell Clients
Conversation summaries solve context limits but introduce new risks. The summarization model can be influenced to embed injection into persistent summaries.
Validate summary outputs, consider separate summarization contexts, and treat summaries as potentially contaminated model output.
Related Patterns
- Conversation Buffer— raw storage alternative
- Entity Memory— structured extraction from history