Testing What You Think Of, Missing What You Don't

Why adversarial testing finds known attack classes but misses novel techniques

The Conventional Framing

Red teaming is adversarial testing of LLM systems: attempting to find vulnerabilities before attackers do. Teams try to jailbreak the model, inject malicious instructions, and bypass controls using known techniques and creative exploration.

The pattern is essential for validating security before deployment.

Why Red Teaming Has Coverage Gaps

Red teams test what they know to test. They explore known attack categories, documented bypasses, and variations on established techniques. Novel attack classes that nobody has discovered yet don't get tested.

The attack space is effectively infinite. A red team can't achieve complete coverage—they can only demonstrate that specific attacks they tried didn't work (or did).

The unknown unknowns:

Red teaming proves the presence of vulnerabilities, not their absence. Passing red team testing means "our team didn't find a way in," not "there is no way in."

Architecture

Components:

Attack catalog: known techniques to test
Creative exploration: finding new variations
Coverage tracking: what's been tested
Vulnerability assessment: evaluating findings
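
One way to picture how the attack catalog and coverage tracking fit together is the minimal Python sketch below. Every name in it (`Technique`, `Outcome`, `CoverageTracker`, and the technique labels) is hypothetical and chosen for illustration, not taken from any particular tool. The point it tries to make is that coverage can only be measured against the catalog, so attacks outside the catalog never appear in the number.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    BLOCKED = "blocked"      # the attack was tried and did not succeed
    SUCCEEDED = "succeeded"  # the attack was tried and got through


@dataclass(frozen=True)
class Technique:
    name: str      # hypothetical label, e.g. "role-play-jailbreak"
    category: str  # hypothetical grouping, e.g. "jailbreak"


@dataclass
class CoverageTracker:
    catalog: list[Technique]
    results: dict[str, Outcome] = field(default_factory=dict)

    def record(self, technique_name: str, outcome: Outcome) -> None:
        """Store the outcome of one catalog technique that was actually run."""
        if technique_name not in {t.name for t in self.catalog}:
            raise KeyError(f"{technique_name!r} is not in the catalog")
        self.results[technique_name] = outcome

    def untested(self) -> list[str]:
        """Catalog techniques with no recorded outcome yet."""
        return [t.name for t in self.catalog if t.name not in self.results]

    def findings(self) -> list[str]:
        """Techniques that succeeded, i.e. demonstrated vulnerabilities."""
        return [n for n, o in self.results.items() if o is Outcome.SUCCEEDED]

    def coverage(self) -> float:
        """Fraction of the catalog exercised. Attacks that are not in the
        catalog are invisible to this number."""
        if not self.catalog:
            return 0.0
        return len(self.results) / len(self.catalog)


catalog = [
    Technique("direct-injection", "injection"),
    Technique("role-play-jailbreak", "jailbreak"),
    Technique("base64-bypass", "encoding"),
    Technique("multi-turn-escalation", "multi-turn"),
]
tracker = CoverageTracker(catalog)
tracker.record("direct-injection", Outcome.BLOCKED)
tracker.record("role-play-jailbreak", Outcome.SUCCEEDED)
print(tracker.coverage(), tracker.untested(), tracker.findings())
```

Even at full catalog coverage, the tracker says nothing about techniques nobody thought to add, which is the gap this pattern describes.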

Trust Boundaries

Red team tested:
✓ Direct instruction injection
✓ Role-play jailbreaks
✓ Encoding bypasses
✓ Multi-turn attacks
✓ Known prompt leak techniques

Not tested (unknown at time):
? Novel encoding scheme
? New multi-model attack
? Capability combination nobody tried
? Social engineering + technical hybrid

"Passed red team" ≠ "Secure"

Knowledge → Tests — tests limited by team knowledge
Time → Coverage — limited time means limited coverage
Results → Confidence — passing tests isn't proof of security
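
The "Results → Confidence" boundary can be made concrete in how results get reported. The sketch below is illustrative only: `summarize` and its wording are assumptions, not an established reporting format. It shows a summary that states what was tested and never emits a verdict of "secure".

```python
def summarize(tested: int, succeeded: int) -> str:
    """Describe red team results without overstating what they prove."""
    if tested < 0 or succeeded < 0 or succeeded > tested:
        raise ValueError("counts must satisfy 0 <= succeeded <= tested")
    if tested == 0:
        return "No attacks were tested; nothing can be concluded."
    if succeeded > 0:
        return (f"{succeeded} of {tested} tested attacks succeeded; "
                "these are confirmed vulnerabilities.")
    # A clean run is phrased as a statement about the attempts made,
    # not as a statement about the system as a whole.
    return (f"0 of {tested} tested attacks succeeded; "
            "untested and unknown attacks are not covered by this result.")


print(summarize(tested=25, succeeded=0))
```

Phrasing a clean run this way keeps the limits of the test visible to whoever reads the report.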

Threat Surface

Threat	Vector	Impact
Coverage gaps	Attacks not in the test catalog	Novel attacks succeed despite red teaming
Time pressure	Red team has limited time, attacker has unlimited	Attacker finds what team didn't have time for
Test environment differences	Production differs from test environment	Attacks work in production that didn't work in test
Evolving attacks	New techniques developed after testing	System vulnerable to attacks that didn't exist during testing

The ZIVIS Position

• Red teaming is necessary but not sufficient. Do red teaming. But understand that it can prove the presence of the vulnerabilities it tested for, not the absence of all vulnerabilities.
• Continuous, not one-time. The attack landscape evolves. Red teaming should be ongoing, incorporating new techniques as they emerge.
• Combine with other defenses. Red teaming finds holes in your defenses. But you need the defenses too. Testing doesn't replace controls.
• Value the misses. When the red team finds something, that's valuable: fix it. But also recognize that they may have missed other things.
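
To illustrate the "continuous, not one-time" point, the sketch below flags catalog entries that were added after the most recent red team run and so could not have been covered by it. `CatalogEntry`, `needs_retest`, the entry names, and the dates are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    name: str       # hypothetical technique label
    added_on: date  # when the technique entered the catalog


def needs_retest(catalog: list[CatalogEntry],
                 last_run: Optional[date]) -> list[str]:
    """Return entries the last red team run could not have exercised."""
    if last_run is None:
        # No run has happened yet, so the whole catalog is pending.
        return [entry.name for entry in catalog]
    return [entry.name for entry in catalog if entry.added_on > last_run]


catalog = [
    CatalogEntry("direct-injection", date(2024, 1, 10)),
    CatalogEntry("role-play-jailbreak", date(2024, 2, 3)),
    CatalogEntry("new-encoding-scheme", date(2024, 6, 20)),
]
print(needs_retest(catalog, last_run=date(2024, 3, 1)))
```

A check like this only covers techniques someone has already added to the catalog; it is a prompt to rerun testing, not evidence that the catalog is complete.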

What We Tell Clients

Red teaming is essential for finding vulnerabilities before attackers do. But passing red team testing doesn't mean your system is secure—it means your team didn't find a way in.

Conduct red teaming regularly, update attack catalogs continuously, and combine testing with defense in depth. The goal is raising the bar, not achieving perfect security.

Related Patterns

Audit Logging: detecting attacks the red team didn't test
Guardrails: defenses that the red team tests