Testing What You Think Of, Missing What You Don't

Why adversarial testing finds known attack classes but misses novel techniques

The Conventional Framing

Red teaming involves adversarial testing of LLM systems—attempting to find vulnerabilities before attackers do. Teams try to jailbreak, inject, and bypass controls using known techniques and creative exploration.

The pattern is essential for validating security before deployment.

Why Red Teaming Has Coverage Gaps

Red teams test what they know to test. They explore known attack categories, documented bypasses, and variations on established techniques. Novel attack classes that nobody has discovered yet don't get tested.

The attack space is effectively infinite. A red team can't achieve complete coverage—they can only demonstrate that specific attacks they tried didn't work (or did).
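A rough back-of-the-envelope calculation shows why exhaustive coverage is out of reach even for a small, fixed technique list. The sketch below counts ordered multi-turn attack sequences; the technique count, turn limit, and daily throughput are illustrative assumptions, not measurements.

```python
def sequence_count(techniques: int, max_turns: int) -> int:
    """Count ordered attack sequences of 1..max_turns steps.

    Assumes each turn picks one technique from a fixed list, with repeats allowed.
    """
    return sum(techniques ** k for k in range(1, max_turns + 1))


# Hypothetical numbers: 20 known techniques, conversations of up to 5 turns.
total = sequence_count(techniques=20, max_turns=5)
days = total / 100  # assumed throughput: 100 sequences tested per day
print(f"{total:,} sequences, about {days / 365:.0f} years at 100 per day")
```

With these made-up inputs the count is 3,368,420 sequences, and that is before considering phrasing variations or techniques that are not on the list at all.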

The unknown unknowns:

Red teaming proves the presence of vulnerabilities, not their absence. Passing red team testing means "our team didn't find a way in," not "there is no way in."

Architecture

Components:

  • Attack catalog: known techniques to test
  • Creative exploration: finding new variations
  • Coverage tracking: what's been tested
  • Vulnerability assessment: evaluating findings
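The catalog and coverage-tracking components can be sketched as two small data structures. This is a minimal illustration, not a reference design: the class names `AttackCatalog` and `CoverageTracker`, and the example categories and techniques, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AttackCatalog:
    """Hypothetical catalog mapping attack categories to known techniques."""
    techniques: dict[str, set[str]] = field(default_factory=dict)

    def add(self, category: str, technique: str) -> None:
        self.techniques.setdefault(category, set()).add(technique)


@dataclass
class CoverageTracker:
    """Tracks which catalogued techniques have been exercised."""
    catalog: AttackCatalog
    tested: set[tuple[str, str]] = field(default_factory=set)

    def record(self, category: str, technique: str) -> None:
        # Only techniques already in the catalog can be recorded as tested.
        if technique not in self.catalog.techniques.get(category, set()):
            raise KeyError(f"{category}/{technique} is not in the catalog")
        self.tested.add((category, technique))

    def coverage(self) -> float:
        """Fraction of catalogued techniques tested.

        Says nothing about techniques missing from the catalog.
        """
        total = sum(len(ts) for ts in self.catalog.techniques.values())
        return len(self.tested) / total if total else 0.0

    def untested(self) -> list[tuple[str, str]]:
        return sorted(
            (c, t)
            for c, ts in self.catalog.techniques.items()
            for t in ts
            if (c, t) not in self.tested
        )


catalog = AttackCatalog()
catalog.add("injection", "direct instruction injection")
catalog.add("jailbreak", "role-play")
catalog.add("encoding", "base64 bypass")
catalog.add("multi-turn", "gradual escalation")

tracker = CoverageTracker(catalog)
tracker.record("injection", "direct instruction injection")
tracker.record("jailbreak", "role-play")
tracker.record("encoding", "base64 bypass")

print(f"catalog coverage: {tracker.coverage():.0%}")
print("untested:", tracker.untested())
```

Note that the coverage figure is measured against the catalog, so 100% here would still only mean every known technique was tried.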

Trust Boundaries

Red team tested:
  ✓ Direct instruction injection
  ✓ Role-play jailbreaks
  ✓ Encoding bypasses
  ✓ Multi-turn attacks
  ✓ Known prompt leak techniques

Not tested (unknown at time):
  ? Novel encoding scheme
  ? New multi-model attack
  ? Capability combination nobody tried
  ? Social engineering + technical hybrid

"Passed red team" ≠ "Secure"

  1. Knowledge → Tests: tests limited by team knowledge
  2. Time → Coverage: limited time means limited coverage
  3. Results → Confidence: passing tests isn't proof of security
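One way to keep the Results → Confidence boundary honest is to word reports as statements about the tests that were run, never about the system as a whole. The sketch below assumes a hypothetical `AttackResult` record and `summarize` helper; neither comes from an existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackResult:
    technique: str
    bypassed: bool  # True if the attack got past the controls


def summarize(results: list[AttackResult]) -> str:
    """Describe the outcome in terms of what was tested, not overall security."""
    if not results:
        return "No tests run: nothing can be concluded."
    found = [r.technique for r in results if r.bypassed]
    if found:
        return (
            f"Vulnerable: {len(found)} of {len(results)} tested techniques "
            f"succeeded ({', '.join(found)})."
        )
    return (
        f"No bypass found in {len(results)} tested techniques. "
        "Untested and unknown techniques are not covered."
    )


print(summarize([
    AttackResult("direct instruction injection", False),
    AttackResult("role-play jailbreak", False),
]))
```

The clean-pass message deliberately avoids the word "secure", mirroring the distinction between "our team didn't find a way in" and "there is no way in".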

Threat Surface

| Threat | Vector | Impact |
|---|---|---|
| Coverage gaps | Attacks not in the test catalog | Novel attacks succeed despite red teaming |
| Time pressure | Red team has limited time, attacker has unlimited | Attacker finds what team didn't have time for |
| Test environment differences | Production differs from test environment | Attacks work in production that didn't work in test |
| Evolving attacks | New techniques developed after testing | System vulnerable to attacks that didn't exist during testing |
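The test environment threat can be partly surfaced by diffing the configuration that was red-teamed against the one that ships. The following is a minimal sketch; the `config_drift` helper and every configuration key shown are hypothetical placeholders for whatever settings a given deployment actually has.

```python
_MISSING = "<missing>"


def config_drift(test_cfg: dict, prod_cfg: dict) -> dict:
    """Return {key: (test_value, prod_value)} for every setting that differs."""
    drift = {}
    for key in sorted(test_cfg.keys() | prod_cfg.keys()):
        test_value = test_cfg.get(key, _MISSING)
        prod_value = prod_cfg.get(key, _MISSING)
        if test_value != prod_value:
            drift[key] = (test_value, prod_value)
    return drift


# Illustrative settings only; real deployments will have different keys.
test_env = {"model": "model-v1", "tools_enabled": False, "system_prompt_version": 7}
prod_env = {
    "model": "model-v2",
    "tools_enabled": True,
    "system_prompt_version": 7,
    "retrieval": True,
}

for key, (test_value, prod_value) in config_drift(test_env, prod_env).items():
    print(f"{key}: tested with {test_value!r}, production uses {prod_value!r}")
```

Any key that shows up in the drift is a place where red team results from the test environment may not carry over to production.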

The ZIVIS Position

  • Red teaming is necessary but not sufficient. Do red teaming. But understand that it proves the presence of the vulnerabilities it finds, not the absence of all vulnerabilities.
  • Continuous, not one-time. The attack landscape evolves. Red teaming should be ongoing, incorporating new techniques as they emerge.
  • Combine with other defenses. Red teaming finds holes in your defenses. But you need the defenses too. Testing doesn't replace controls.
  • Value the misses. When the red team finds something, that's valuable: fix it. But also recognize that they may have missed other things.
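The "continuous, not one-time" point can be made operational by regularly asking which catalog entries have never been tested or were last tested too long ago. This is a sketch under stated assumptions: the `due_for_testing` helper, the technique names, the dates, and the 90-day retest window are all illustrative choices.

```python
from datetime import date


def due_for_testing(
    catalog: set[str],
    last_tested: dict[str, date],
    today: date,
    max_age_days: int = 90,  # assumed retest window, tune to your own cadence
) -> list[str]:
    """Return techniques never tested, or last tested more than max_age_days ago."""
    due = []
    for technique in catalog:
        tested_on = last_tested.get(technique)
        if tested_on is None or (today - tested_on).days > max_age_days:
            due.append(technique)
    return sorted(due)


catalog = {"role-play jailbreak", "encoding bypass", "new multi-model attack"}
last_tested = {
    "role-play jailbreak": date(2024, 1, 10),
    "encoding bypass": date(2024, 5, 1),
}

print(due_for_testing(catalog, last_tested, today=date(2024, 6, 1)))
```

In this example the newly catalogued technique and the stale role-play test are both flagged, while the recently tested encoding bypass is not.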

What We Tell Clients

Red teaming is essential for finding vulnerabilities before attackers do. But passing red team testing doesn't mean your system is secure—it means your team didn't find a way in.

Conduct red teaming regularly, update attack catalogs continuously, and combine testing with defense in depth. The goal is raising the bar, not achieving perfect security.

Related Patterns