Slowing Attacks, Not Stopping Them

Why rate limits make attacks more expensive but don't prevent determined adversaries

The Conventional Framing

Rate limiting restricts how many requests a user can make in a given time period. This prevents abuse, manages costs, and limits the speed at which attacks can be executed.

The pattern is standard for API security and resource management.

Why Slow Attacks Still Succeed

Rate limiting makes brute force attacks take longer. But prompt injection often doesn't need many attempts—a single well-crafted input can compromise the system.

For attacks that do need iteration (finding bypasses, extracting data), rate limits slow the attacker but don't stop them. Patient attackers work within limits.

The single-shot problem:

Many prompt injections work on the first try. Rate limiting a user to 10 requests per minute doesn't help if the first request succeeds.

Architecture

Components:

  • Request counter: tracks requests per user/IP
  • Time window: period over which limits apply
  • Limit thresholds: max requests per window
  • Enforcement: reject or queue excess requests
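
To make the four components concrete, here is a minimal fixed-window sketch. The class name, the `clock` parameter, and the in-memory dictionary are illustrative assumptions, not a description of any particular product; a real deployment would typically keep counters in shared storage and might queue rather than reject.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Hypothetical fixed-window limiter: one counter per client per window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit                    # limit threshold
        self.window_seconds = window_seconds  # time window
        self.clock = clock                    # injectable so the sketch is testable
        # Request counter: client key -> (window index, count in that window)
        self._counts: Dict[str, Tuple[int, int]] = {}

    def allow(self, client_key: str) -> bool:
        """Enforcement: True lets the request through, False rejects it."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counts.get(client_key, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            self._counts[client_key] = (window, count)
            return False
        self._counts[client_key] = (window, count + 1)
        return True
```

Note that `allow` always returns True for a client's first request in a window. Nothing in this design inspects what the request contains, which is why a single effective payload passes straight through.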

Trust Boundaries

Rate limit: 60 requests/minute

Brute force attack (affected):
  - Attacker wants to try 10,000 payloads
  - At 60/min, takes ~2.8 hours
  - Rate limit helps

Prompt injection (not affected):
  - Attacker crafts one good payload
  - Succeeds on first request
  - Rate limit irrelevant

A single effective attack defeats rate limiting.

  1. User → Rate limiter: requests counted and limited
  2. Rate limiter → Model: allowed requests proceed
  3. Time → Reset: limits reset periodically
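
The contrast between the two cases can be checked with a few lines of arithmetic. This sketch assumes a fixed one-minute window that opens with the attacker's first request and an attacker who sends as fast as the limit allows; the function name is hypothetical.

```python
def minutes_before_request(n: int, limit_per_minute: int) -> int:
    """Full windows that must elapse before the n-th request is admitted.

    Assumes a fixed one-minute window and an attacker sending at the limit.
    """
    if n < 1 or limit_per_minute < 1:
        raise ValueError("n and limit_per_minute must be positive")
    return (n - 1) // limit_per_minute


# Brute force: the 10,000th payload waits 166 minutes, a bit under 2.8 hours.
brute_force_wait = minutes_before_request(10_000, 60)

# Single-shot injection: the first request waits 0 minutes.
single_shot_wait = minutes_before_request(1, 60)
```

Under these assumptions the limit imposes a delay that grows with the number of attempts, and that delay is zero when only one attempt is needed.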

Threat Surface

Threat              | Vector                                    | Impact
Single-shot attacks | Craft one effective injection             | Rate limit doesn't prevent first request
Distributed attacks | Spread requests across many accounts/IPs  | Per-user limits bypassed with multiple identities
Slow and steady     | Stay within limits while attacking        | Patient attacker succeeds over time
Burst before limit  | Front-load requests before limit kicks in | Initial burst may be sufficient for attack
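
The distributed row can be illustrated with the same kind of estimate. This sketch assumes the limiter keys only on identity (account or IP) with no global cap, so throughput scales linearly with the number of identities; the function name and the figure of 100 identities are illustrative, not measurements.

```python
def minutes_to_send(payloads: int, limit_per_minute: int, identities: int = 1) -> float:
    """Approximate minutes to send `payloads` requests under per-identity limits.

    Assumes each identity gets its own independent limit and no global cap exists.
    """
    if payloads < 0 or limit_per_minute < 1 or identities < 1:
        raise ValueError("invalid arguments")
    return payloads / (limit_per_minute * identities)


one_identity = minutes_to_send(10_000, 60)          # about 167 minutes
hundred_identities = minutes_to_send(10_000, 60, 100)  # under 2 minutes
```

With a per-identity limit alone, the attacker's cost shifts from time to acquiring identities, which may or may not be expensive depending on how accounts and IPs are issued.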

The ZIVIS Position

  • Rate limiting is cost imposition. It makes attacks more expensive in time and resources. It doesn't make attacks impossible.
  • Useful for iteration-heavy attacks. Brute force, data extraction over many requests, and bypass discovery are all slowed by rate limits.
  • Not useful for single-shot attacks. Well-crafted prompt injections often work immediately. Rate limiting the 2nd through Nth requests doesn't help.
  • Combine with other controls. Rate limiting is one layer. It doesn't replace input validation, privilege separation, or monitoring.

What We Tell Clients

Rate limiting is good hygiene and increases attacker cost. But don't rely on it as a security control against prompt injection—many attacks succeed on the first request.

Use rate limiting for resource management and to slow iteration-based attacks. Combine with controls that address single-request attacks.

Related Patterns