Slowing Attacks, Not Stopping Them
Why rate limits make attacks more expensive but don't stop determined adversaries
The Conventional Framing
Rate limiting restricts how many requests a user can make in a given time period. This prevents abuse, manages costs, and limits the speed at which attacks can be executed.
The pattern is standard for API security and resource management.
Why Slow Attacks Still Succeed
Rate limiting makes brute force attacks take longer. But prompt injection often doesn't need many attempts—a single well-crafted input can compromise the system.
For attacks that do need iteration (finding bypasses, extracting data), rate limits slow the attacker but don't stop them. Patient attackers work within limits.
The single-shot problem:
Many prompt injections work on the first try. Rate limiting a user to 10 requests per minute doesn't help if the first request succeeds.
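The difference between iteration-heavy and single-shot attacks can be put in rough numbers. The sketch below is illustrative only: the function name `minutes_to_finish` is hypothetical, and it assumes a simple per-minute limit that resets every minute and an attacker who sends as fast as the limit allows.

```python
import math


def minutes_to_finish(attempts_needed: int, limit_per_minute: int) -> int:
    """Minutes of forced waiting before the last attempt can be sent.

    Assumption: the limit resets every minute and the attacker uses
    the full allowance in each window.
    """
    if attempts_needed <= 0:
        return 0
    # The first `limit_per_minute` attempts fit in the first window with
    # no waiting; every additional window costs one more minute.
    return math.ceil(attempts_needed / limit_per_minute) - 1


# A single well-crafted injection is not delayed at all.
print(minutes_to_finish(1, 10))       # 0

# An attack needing 10,000 attempts is slowed to 999 minutes
# (under 17 hours), which is a cost, not a barrier.
print(minutes_to_finish(10_000, 10))  # 999
```

Under these assumptions the limit adds hours to a 10,000-attempt attack and nothing to a one-attempt attack.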
Architecture
Components:
- Request counter: tracks requests per user/IP
- Time window: period over which limits apply
- Limit thresholds: max requests per window
- Enforcement: reject or queue excess requests
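As a rough illustration of how the four components fit together, here is a minimal fixed-window limiter. It is a sketch, not a production design: the class name `FixedWindowLimiter`, the `allow` method, and the choice of a fixed window are all assumptions made for the example, and the current time is passed in explicitly so the behavior is easy to follow (a real deployment would more likely read a clock such as `time.monotonic()`).

```python
class FixedWindowLimiter:
    """Hypothetical fixed-window rate limiter keyed by user or IP."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit                    # limit threshold
        self.window_seconds = window_seconds  # time window
        self._counts = {}                     # request counter: key -> (window index, count)

    def allow(self, key: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        prev_window, count = self._counts.get(key, (window, 0))
        if prev_window != window:
            count = 0  # a new window has started, so the count resets
        if count >= self.limit:
            self._counts[key] = (window, count)
            return False  # enforcement: reject the excess request
        self._counts[key] = (window, count + 1)
        return True


limiter = FixedWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow("user-1", now=t) for t in (0, 1, 2, 3)]
print(results)                             # [True, True, True, False]
print(limiter.allow("user-1", now=60))     # True (the window has reset)
```

Note that the first three requests are always allowed. Nothing in this design inspects what a request contains.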
Trust Boundaries
- User → Rate limiter — requests counted and limited
- Rate limiter → Model — allowed requests proceed
- Time → Reset — limits reset periodically
Threat Surface
| Threat | Vector | Impact |
|---|---|---|
| Single-shot attacks | Craft one effective injection | Rate limit doesn't prevent first request |
| Distributed attacks | Spread requests across many accounts/IPs | Per-user limits bypassed with multiple identities |
| Slow and steady | Stay within limits while attacking | Patient attacker succeeds over time |
| Burst before limit | Front-load requests before limit kicks in | Initial burst may be sufficient for attack |
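Two of the rows above can be simulated in a few lines. The sketch below assumes a per-identity fixed-window limit of 10 requests per 60 seconds; the helper `count_allowed` and the identity names are hypothetical. The burst scenario shows one way front-loading can play out with fixed windows, where requests straddle a window boundary; other limiter designs behave differently.

```python
def count_allowed(events, limit, window_seconds):
    """Count requests admitted by a per-identity fixed-window limit.

    `events` is an iterable of (identity, timestamp_in_seconds) pairs.
    """
    counts = {}
    allowed = 0
    for identity, ts in events:
        bucket = (identity, int(ts // window_seconds))
        if counts.get(bucket, 0) < limit:
            counts[bucket] = counts.get(bucket, 0) + 1
            allowed += 1
    return allowed


LIMIT, WINDOW = 10, 60

# Baseline: one identity sends 20 requests in the first 20 seconds.
single = [("a", float(i)) for i in range(20)]
print(count_allowed(single, LIMIT, WINDOW))       # 10

# Burst across a window boundary: 10 requests just before t=60 and
# 10 just after, so 20 requests pass in about two seconds.
burst = [("a", 59 + i / 10) for i in range(10)] + \
        [("a", 60 + i / 10) for i in range(10)]
print(count_allowed(burst, LIMIT, WINDOW))        # 20

# Distributed: 5 identities each use their own full allowance at once.
distributed = [(f"id-{n}", 0.0) for n in range(5) for _ in range(10)]
print(count_allowed(distributed, LIMIT, WINDOW))  # 50
```

In each case the limiter behaves as configured. The attacker's effective throughput grows with the number of identities and with how the requests are timed.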
The ZIVIS Position
- Rate limiting is cost imposition. It makes attacks more expensive in time and resources. It doesn't make attacks impossible.
- Useful for iteration-heavy attacks. Brute force, data extraction over many requests, and bypass discovery are all slowed by rate limits.
- Not useful for single-shot attacks. Well-crafted prompt injections often work immediately. Rate limiting the 2nd through Nth requests doesn't help.
- Combine with other controls. Rate limiting is one layer. It doesn't replace input validation, privilege separation, or monitoring.
What We Tell Clients
Rate limiting is good hygiene and increases attacker cost. But don't rely on it as a security control against prompt injection—many attacks succeed on the first request.
Use rate limiting for resource management and to slow iteration-based attacks. Combine with controls that address single-request attacks.
Related Patterns
- Input Filtering: blocking individual malicious requests
- Audit Logging: detecting patterns across requests