Voting Among Compromised Reasoning Paths
Why sampling multiple reasoning chains doesn't help when all chains see the same poison
The Conventional Framing
Self-consistency samples multiple reasoning paths and takes a majority vote on the final answer. Different reasoning chains may make different errors, but the correct answer should be most common.
The pattern improves reliability by reducing variance in model outputs.
Why Voting Doesn't Help Against Shared Context
All reasoning paths share the same context. If that context contains an injection, all paths reason about the same injection. You're not getting independent votes—you're getting multiple attempts to process the same adversarial input.
Self-consistency addresses random variance. Injections aren't random. They're systematic manipulations that affect all paths similarly.
The correlation problem:
Effective injections work by being compelling to the model. A compelling injection influences most or all reasoning paths the same way. The vote is correlated, not independent.
Architecture
Components:
- Shared context— same input for all paths
- Multiple sampling— generate several reasoning chains
- Answer extraction— get conclusion from each chain
- Voting mechanism— majority or weighted aggregation
Trust Boundaries
- Context → All paths — injection reaches every path
- Paths → Votes — correlated errors produce correlated votes
- Vote → Output — majority wrong answer selected
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Correlated failure | Injection affects all reasoning paths similarly | Majority vote doesn't filter out systematic errors |
| Increased attack surface | Multiple paths = multiple processing of injection | Higher chance at least one path fully executes attack |
| False confidence | Agreement among paths suggests correctness | High consistency in wrong answer looks like validation |
The ZIVIS Position
- •Independence requires independent context.For voting to work against attacks, each path would need different context. Same context = correlated vulnerability.
- •Self-consistency is for noise, not adversaries.The pattern handles random model variance. Attacks are not random variance. Different threat model.
- •High agreement can indicate compromise.If an injection is effective, all paths agree on the wrong answer. Suspiciously high consistency might be a red flag.
What We Tell Clients
Self-consistency improves reliability against random errors but doesn't help against adversarial inputs. All reasoning paths share the same context and thus the same vulnerabilities.
If you need defense against injection, you need paths with different context—not just different random samples. Consider isolated processing or external validation rather than internal voting.
Related Patterns
- Tree of Thoughts— branching exploration with same issues
- Reflection— self-critique has similar limitations