AI Penetration Testing

Conventional pen tests check syntax.
AI pen tests check semantics.

Most pen-test firms still test AI applications as if they were web applications. That misses the entire attack surface modern AI introduces — the layer where meaning is interpreted and where autonomous agents take action on the world.

Book 30 minutes with Jim and Jake

The shift is linguistic.

Software has rules. AI has intent. The two demand different threat models.

CONVENTIONAL PEN TEST

Syntax.

Does the code parse? Are inputs sanitized? Is auth enforced at every route? Are dependencies up to date? Does the API return what the docs say it does?

A rule-checker. A grammar-checker. The right questions for the last twenty years of web and API.

AI PEN TEST

Semantics.

Does the AI understand the user’s intent — or can an adversary make it understand something else? Can meaning be hijacked through context, tone, instruction order, or retrieved content the AI didn’t know it shouldn’t trust?

A meaning-checker. The questions modern AI applications actually require.

The Agent Trust Protocol Stack (ATPS)

Ten layers. Ten attack surfaces.

ZIVIS’s mental model for thinking about modern AI applications — the same model behind our open protocol work currently under IETF review. Each layer has a distinct threat model. A pen test that only covers the bottom two is missing most of the surface.

LAYER L9

Observability

Can you reconstruct what happened across systems? Logs, traces, audit trails, and the evidence you'll hand a reviewer six months from now.

REPRESENTATIVE THREATS

Audit log evasionRepudiation & untraceabilityTelemetry tamperingTrace poisoningCross-system correlation gaps

LAYER L8

Governance / Human Control

Policy, approvals, escalation, human review. The mechanisms that decide which actions need a human in the loop — and what happens when they're skipped.

REPRESENTATIVE THREATS

Approval bypassHITL fatigue / overloadEscalation evasionPolicy misalignmentAudit blindspots

LAYER L7

Authority

Who is the AI allowed to act as? Identity, OAuth scopes, capability tokens, and the trust boundary between the AI and the systems it touches.

REPRESENTATIVE THREATS

Privilege escalationIdentity spoofing & impersonationOAuth / scope confusionCapability transfer abuse

LAYER L6

Agency

What can the AI actually do in the world? Tools, APIs, side effects, multi-step plans, multi-agent collaboration.

REPRESENTATIVE THREATS

Excessive agencyTool misuseMulti-step / chained exploitationRogue agents in multi-agent systems

LAYER L5

Runtime / Execution

Agent loops, orchestration, retries, routing, tool calls. The plumbing that turns a model output into an action — and the place where one bad signal can cascade.

REPRESENTATIVE THREATS

Loop exhaustion / retry stormsRouting manipulationTool-call injectionOrchestration hijackingStep skipping

LAYER L4

Memory

What does the AI carry across turns and sessions? Conversation state, vector stores, learned preferences, persistent context.

REPRESENTATIVE THREATS

Memory poisoningCross-session leakageVector and embedding weaknessesPersistent state corruption

LAYER L3

Meaning

Interpretation, intent, semantic manipulation. Where attackers tunnel through meaning to make the system understand something it shouldn't.

REPRESENTATIVE THREATS

Direct prompt injectionIndirect / cross-tenant prompt injectionInstruction overrideIntent breaking & goal manipulation

LAYER L2

Reasoning

Model inference, decision boundaries, planning. The logic the model applies to context — and where it can be steered into the wrong conclusion.

REPRESENTATIVE THREATS

Adversarial inputsChain-of-thought hijackingCascading hallucination attacksModel extraction

LAYER L1

Context

Prompts, RAG, tool descriptions, input data. Everything the model treats as ground truth before it reasons.

REPRESENTATIVE THREATS

RAG / retrieval injectionSystem prompt leakageUntrusted document ingestionTool-description poisoning

LAYER L0

Supply Chain / Provenance

Models, datasets, embeddings, evals, dependencies. The foundation everything else inherits — if it's compromised, every layer above is, too.

REPRESENTATIVE THREATS

Model supply-chain compromiseTraining data tamperingBackdoored model weightsEvaluator poisoningDependency injection

Read ATPS bottom-up. Each layer assumes the integrity of the ones beneath it. Compromise the supply chain — and the model lies before you ask. Compromise meaning — and the system acts on the wrong intent. Compromise governance — and observability becomes theater. An attacker only needs to compromise one layer.

Coverage your reviewer will actually recognize.

We test every category in OWASP’s LLM Top 10. We go deeper into the Agentic AI Top 10 — because that’s where the modern attack surface is actually growing.

OWASP LLM TOP 10

LLM01

Prompt Injection

LLM02

Sensitive Information Disclosure

LLM03

Supply Chain

LLM04

Data and Model Poisoning

LLM05

Improper Output Handling

LLM06

Excessive Agency

LLM07

System Prompt Leakage

LLM08

Vector and Embedding Weaknesses

LLM09

Misinformation

LLM10

Unbounded Consumption

OWASP AGENTIC AI THREATS

WHERE WE GO DEEPER

Memory Poisoning

Tool Misuse

Privilege Compromise

Resource Overload

Cascading Hallucination Attacks

Intent Breaking & Goal Manipulation

Misaligned & Deceptive Behaviors

Repudiation & Untraceability

Identity Spoofing & Impersonation

T10

Overwhelming Human-in-the-Loop

T11

Unexpected RCE & Code Attacks

T12

Agent Communication Poisoning

T13

Rogue Agents in Multi-Agent Systems

T14

Human Attacks on Multi-Agent Systems

T15

Human Manipulation

Engineers who build the exploit. Not consultants who write the report.

Most pen-test firms ship a PDF that lists vulnerabilities. We ship working exploit code that proves what an attacker could actually do.

Jake Miller — Co-Founder & CEO — personally leads engagements. 25 years engineering complex enterprise systems (first engineer on Salesforce Journey Builder) gives him an unusual angle on offensive AI security: he attacks the way someone who has actually built the system would attack it.

Backed by ZIVIS’s proprietary platform: 120+ adversarial AI attack scenarios, retest evidence captured for every finding, continuous coverage as your AI program evolves.

A pen test alone won’t clear an enterprise security review.

It tells you what’s wrong. It doesn’t answer the questions a Salesforce-style reviewer is going to ask — about your threat model, your remediation cycle, your evidence trail. That’s why our pen testing slots into a four-phase engagement: Diagnosis, Treatment Plan, Remediation & Retest, In the Room. See the full process.

Book 30 minutes with Jim and Jake

One CISO with 30+ years across enterprise security. One offensive engineer with 25 years finding what scanners miss. One conversation about the deal at risk.

Your message goes directly to

Jim Goldman

Co-Founder & CISO

30+ yrs cybersecurity. Ex-Salesforce VP Enterprise Security. FBI Cyber Crime TFO.

Jake Miller

Co-Founder & CEO

25+ yrs building secure enterprise systems. First engineer on Salesforce Journey Builder.

Conventional pen tests check syntax.AI pen tests check semantics.

The shift is linguistic.

Ten layers. Ten attack surfaces.

Coverage your reviewer will actually recognize.

Engineers who build the exploit. Not consultants who write the report.

Book 30 minutes with Jim and Jake

Conventional pen tests check syntax.
AI pen tests check semantics.