Single-component lethal-trifecta walkthrough that drives scanners/toxic-flow-analyzer.mjs against a deliberately misconfigured fixture plugin. The fixture agent declares tools: [Bash, Read, WebFetch], which alone covers all three trifecta legs (input surface + data access + exfil sink). No hooks/hooks.json is shipped, so TFA's mitigation logic finds no active guards and emits a CRITICAL "Lethal trifecta:" finding without downgrade.
Plugin marker is plugin.fixture.json (recognised by isPlugin()) rather than .claude-plugin/plugin.json. The latter is blocked by the plugin's own pre-write-pathguard hook, and plugin.fixture.json exists in isPlugin() specifically so example fixtures can self-mark without touching guarded paths.
Three independent assertions (3/3 must pass): direct trifecta present and CRITICAL; finding mentions the exfil-helper component; description confirms "no hook guards detected" (proves the mitigation path stayed inactive). expected-findings.md documents the contract.
OWASP / framework mapping: ASI01, ASI02, ASI05, LLM01, LLM02, LLM06.
Docs updated: plugin README "Other runnable examples", plugin CLAUDE.md "Examples" table, CHANGELOG [Unreleased] Added. [skip-docs] is appropriate because examples don't change what the plugin appears to cover from the outside; the marketplace root README is unaffected.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Toxic-Flow Walkthrough — Single-Component Lethal Trifecta
WARNING: This is a demonstration fixture, NOT a real attack. The fixture agent is deliberately misconfigured. It is never loaded by Claude Code — the run script only feeds the directory to the deterministic scanner.
What this demonstrates
scanners/toxic-flow-analyzer.mjs (TFA scanner) detects lethal
trifecta patterns at the plugin component level. Where every
other scanner in this plugin looks at file content, TFA looks at
capability combinations: which agents/commands/skills hold which
tools, and which keywords or prior-scanner findings light up which
of the three trifecta legs.
The lethal trifecta (Willison / Invariant Labs):
- Untrusted input surface: the component is exposed to data an attacker can control (Bash stdin, MCP output, $ARGUMENTS, remote URLs, …).
- Sensitive data access: the component can read project secrets (Read, Glob, Grep, Bash-via-cat, …).
- Exfiltration sink: the component can move data out of the process boundary (WebFetch, Bash-via-curl, sub-agent delegation, …).
When all three meet in a single component and no hook guards
are active, TFA emits a CRITICAL Lethal trifecta: finding. With
guards present, severity downgrades to HIGH or MEDIUM.
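The leg classification described above can be sketched as a lookup from declared tools to trifecta legs. This is a minimal illustration only: the `LEG_TOOLS` table, `classifyLegs` and `isLethalTrifecta` are hypothetical names, and the tool sets are assumptions drawn from the examples in the list above, not TFA's actual tables (which also consider keywords and prior-scanner findings).

```javascript
// Illustrative tool-to-leg mapping; an assumption, not TFA's real tables.
const LEG_TOOLS = {
  input: new Set(["Bash", "WebFetch"]),             // attacker-controllable data can enter
  data: new Set(["Read", "Glob", "Grep", "Bash"]),  // project secrets can be read
  exfil: new Set(["WebFetch", "Bash"]),             // data can leave the process boundary
};

// Return, per leg, which of the component's declared tools cover it.
function classifyLegs(tools) {
  const legs = {};
  for (const [leg, legTools] of Object.entries(LEG_TOOLS)) {
    legs[leg] = tools.filter((tool) => legTools.has(tool));
  }
  return legs;
}

// A component completes the trifecta when every leg has at least one tool.
function isLethalTrifecta(tools) {
  return Object.values(classifyLegs(tools)).every((hits) => hits.length > 0);
}

console.log(isLethalTrifecta(["Bash", "Read", "WebFetch"])); // true
console.log(isLethalTrifecta(["Read", "Grep"]));             // false
```

Under this mapping the fixture's tool list covers all three legs on its own, while a read-only component covers only the data leg.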
Fixture layout
examples/toxic-agent-demo/
  fixture/
    plugin.fixture.json        # plugin marker (recognised by
                               # toxic-flow-analyzer.isPlugin())
    agents/
      exfil-helper.fixture.md  # tools: [Bash, Read, WebFetch]
                               # - description names "untrusted user input" + "remote URL"
                               # - body lists .env / ~/.aws / keychain / secret
                               # - body references webhook / upload / curl --data
  README.md                    # this file
  run-toxic-flow.mjs           # walkthrough runner
  expected-findings.md         # testable contract
The plugin marker is plugin.fixture.json (not .claude-plugin/plugin.json)
because the plugin's own pre-write-pathguard.mjs hook blocks all
writes inside .claude-plugin/ — plugin.fixture.json is a
sentinel file toxic-flow-analyzer.isPlugin() recognises
specifically so example fixtures can mark themselves as plugins
without touching guarded paths.
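The marker check can be pictured roughly as follows. This is a sketch under stated assumptions: `findPluginMarker` is a hypothetical helper, and only the two marker file names come from this README; how `isPlugin()` actually inspects the directory is not shown here.

```javascript
// Marker paths relative to the candidate plugin root. Both names appear in
// this README; the lookup logic itself is an assumption.
const PLUGIN_MARKERS = [".claude-plugin/plugin.json", "plugin.fixture.json"];

// Hypothetical helper: given the relative file paths found under a directory,
// return the first plugin marker present, or null for a non-plugin target.
function findPluginMarker(relativePaths) {
  // Normalise Windows separators so both path styles compare equal.
  const present = new Set(relativePaths.map((p) => p.replace(/\\/g, "/")));
  return PLUGIN_MARKERS.find((marker) => present.has(marker)) ?? null;
}

console.log(findPluginMarker(["plugin.fixture.json", "agents/exfil-helper.fixture.md"]));
console.log(findPluginMarker(["src/index.js"]));
```

Working on a list of relative paths keeps the sketch free of filesystem access; a real check would more likely test for the files directly.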
The fixture deliberately has no hooks/hooks.json, so TFA's
mitigation logic finds neither an exfil guard
(pre-bash-destructive / post-mcp-verify /
pre-install-supply-chain) nor an input guard
(pre-prompt-inject-scan) and keeps the finding at CRITICAL.
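A rough sketch of that mitigation logic is shown below. The guard names come from the paragraph above; everything else is assumed for illustration, including the helper names, the substring match against the raw hooks/hooks.json text, and the exact downgrade rule (one guard class active gives HIGH, both give MEDIUM).

```javascript
// Guard hook names as listed in this README.
const EXFIL_GUARDS = ["pre-bash-destructive", "post-mcp-verify", "pre-install-supply-chain"];
const INPUT_GUARDS = ["pre-prompt-inject-scan"];

// Hypothetical helper: hooksJsonText is the raw contents of hooks/hooks.json,
// or null when the file is absent (as in this fixture).
// Substring matching is a simplification, not the scanner's parsing logic.
function detectGuards(hooksJsonText) {
  const text = hooksJsonText ?? "";
  return {
    exfil: EXFIL_GUARDS.some((name) => text.includes(name)),
    input: INPUT_GUARDS.some((name) => text.includes(name)),
  };
}

// Assumed downgrade rule: no guards -> CRITICAL, one class -> HIGH, both -> MEDIUM.
function trifectaSeverity(guards) {
  const active = Number(guards.exfil) + Number(guards.input);
  return ["CRITICAL", "HIGH", "MEDIUM"][active];
}

console.log(trifectaSeverity(detectGuards(null))); // CRITICAL
```

With no hooks file the sketch returns CRITICAL, matching the behaviour this fixture is built to show.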
How to run
cd plugins/llm-security
node examples/toxic-agent-demo/run-toxic-flow.mjs
# Verbose — full per-finding listing with evidence string
node examples/toxic-agent-demo/run-toxic-flow.mjs --verbose
Expected: 3 pass, 0 fail with 1 CRITICAL Lethal trifecta: exfil-helper (agent) finding.
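The three checks behind "3 pass" could look something like the sketch below. The finding shape (`severity`, `title`, `description`) and the `checkContract` helper are assumptions made for illustration, and the sample finding is hand-written rather than real scanner output; expected-findings.md remains the authoritative contract.

```javascript
// Hypothetical finding shape: { severity, title, description }.
function checkContract(findings) {
  const trifecta = findings.filter((f) => f.title.startsWith("Lethal trifecta:"));
  const text = (f) => `${f.title} ${f.description}`;
  return [
    { name: "direct trifecta present and CRITICAL",
      pass: trifecta.some((f) => f.severity === "CRITICAL") },
    { name: "finding mentions exfil-helper",
      pass: trifecta.some((f) => text(f).includes("exfil-helper")) },
    { name: "description confirms no hook guards detected",
      pass: trifecta.some((f) =>
        f.description.toLowerCase().includes("no hook guards detected")) },
  ];
}

// Hand-written sample finding, not actual scanner output.
const sample = [{
  severity: "CRITICAL",
  title: "Lethal trifecta: exfil-helper (agent)",
  description: "All three legs present; no hook guards detected.",
}];

const results = checkContract(sample);
const passed = results.filter((r) => r.pass).length;
console.log(`${passed} pass, ${results.length - passed} fail`); // 3 pass, 0 fail
```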
Scanner involved
- scanners/toxic-flow-analyzer.mjs: invoked directly via import { scan }. Takes (targetPath, discovery, priorResults). In this walkthrough priorResults is {} (no upstream scanners), so the trifecta is detected from frontmatter + keywords alone. In the orchestrated form (scan-orchestrator.mjs), TFA runs LAST and consumes findings from all 9 prior scanners (UNI, ENT, PRM, DEP, TNT, GIT, NET, MEM, SCR), which can promote classifications via the enrichment pass in enrichFromPriorResults().
Why TFA is special
Other scanners detect dangerous content. TFA detects dangerous
architecture — combinations that no individual file would trip,
but that together complete an exfiltration chain. A plugin can be
clean by every per-file check and still ship a single agent that
holds Bash + Read + WebFetch, in which case one prompt-injection
chain on that agent reads .env and uploads it.
This is a defense-in-depth complement to:
| Layer | What it covers |
|---|---|
| permission-mapper | Excessive-tool advisories per component |
| taint-tracer | LLM01/LLM02 in code paths |
| pre-prompt-inject-scan | Runtime injection in user prompts |
| post-session-guard | Runtime trifecta across tool calls (Rule of Two) |
| toxic-flow-analyzer | Capability combinations across plugin surface |
post-session-guard is the runtime sibling of TFA — see
examples/lethal-trifecta-walkthrough/ for the runtime view of
the same trifecta concept.
OWASP / framework mapping
| Code | Framework | Why |
|---|---|---|
| ASI01 | OWASP Agentic Top 10 | Memory / tool poisoning leading to action |
| ASI02 | OWASP Agentic Top 10 | Tool misuse via excess capability |
| ASI05 | OWASP Agentic Top 10 | Cascading hallucination / chained capability |
| LLM01 | OWASP LLM Top 10 (2025) | Prompt injection feeds the input leg |
| LLM02 | OWASP LLM Top 10 (2025) | Sensitive information disclosure on data-leg activation |
| LLM06 | OWASP LLM Top 10 (2025) | Excessive Agency — too many tools on one component |
| MCP1 | OWASP MCP Top 10 | MCP-borne untrusted input strengthens leg 1 (not exercised in this fixture) |
| MCP3 | OWASP MCP Top 10 | MCP-borne data-access likewise (not exercised) |
Limitations
- The fixture exercises TFA in isolation (priorResults = {}). The orchestrated scan-orchestrator.mjs flow runs TFA after 9 other scanners and may classify additional legs via the enrichment pass, leading to more findings or higher severity on real plugins than this minimal example shows.
- TFA's keyword + tool sets are fixed. A novel exfil verb that doesn't match the keyword list would not light up the leg-3 flag without a confirming prior-scanner finding.
- TFA only runs on plugin-shaped targets (per isPlugin()). Standalone scripts and non-plugin repos are skipped; TFA is meant to audit the plugin attack surface, not arbitrary code.
See also
- scanners/toxic-flow-analyzer.mjs: scanner source
- tests/lib/toxic-flow-analyzer.test.mjs: unit-test contract
- examples/lethal-trifecta-walkthrough/: runtime trifecta (post-session-guard, Rule of Two, sliding window)
- knowledge/owasp-agentic-top10.md: ASI01 / ASI02 / ASI05 background
- expected-findings.md (in this folder): the testable contract