Security Is the Weakest Link
Why multiple LLMs collaborating on responses inherit the vulnerabilities of all of them
The Conventional Framing
Mixture of Agents uses multiple LLMs to collaborate on generating a single response. Different models contribute perspectives, outputs are aggregated or synthesized, and the final response benefits from diverse model capabilities.
The pattern is positioned as getting the best of multiple models—combining strengths while compensating for individual weaknesses.
Why This Compounds Vulnerabilities
When multiple models contribute to a response, the security posture is determined by the weakest model. If any model in the mixture is vulnerable to a particular injection, that injection can affect the final output.
You're not getting the most secure model's protection—you're getting the least secure model's vulnerabilities.
Why mixtures multiply risk:
- Weakest link security. An attacker only needs to compromise one model to influence the output.
- Aggregation amplifies. If a poisoned response gets aggregated with clean responses, the poison may still propagate.
- Diverse vulnerabilities. Different models have different injection techniques that work. More models means more attack vectors.
Architecture
Components:
- Input distributor— sends query to multiple models
- Participating models— different LLMs contributing
- Aggregator— combines model outputs into final response
- Weighting logic— how much each model influences result
Trust Boundaries
- Query → Each model — same injection reaches all models
- Model outputs → Aggregator — poisoned output enters aggregation
- Aggregator → Final output — aggregation may preserve poison
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Weakest link exploitation | Target injection at most vulnerable model | Compromise propagates through aggregation |
| Aggregation manipulation | Craft output that dominates aggregation | One model's output overweights others |
| Model-specific attacks | Different injection for each model type | Attack multiple models simultaneously |
| Consistency attacks | Make all models agree on malicious output | Bypass voting or consensus defenses |
The ZIVIS Position
- •Security is minimum, not maximum.The mixture is only as secure as its least secure component. Adding models adds attack surface, not defense.
- •Aggregation doesn't sanitize.Combining outputs doesn't remove injections. Poisoned content from one model can propagate through aggregation.
- •Prefer homogeneous security.If you must mix models, ensure all have similar security properties. A weak model negates the security of strong ones.
- •Validate aggregated output.The final output should go through security validation regardless of how many models contributed to it.
What We Tell Clients
Mixture of Agents optimizes for capability diversity, not security. Every model you add is another potential entry point for attacks.
If security matters, use the most secure single model you have. If you must mix, ensure all participating models have equivalent security properties, and validate the aggregated output as rigorously as any single-model output.
Related Patterns
- Multi-Agent Orchestration— agents with different roles vs. models contributing to same output
- Self-Consistency— multiple paths with same model, different security profile
- Output Validation— should validate aggregated output