feat(llm-security): add toxic-agent-demo example for TFA scanner [skip-docs]

Single-component lethal-trifecta walkthrough that drives scanners/toxic-flow-analyzer.mjs against a deliberately misconfigured fixture plugin. The fixture agent declares tools: [Bash, Read, WebFetch], which alone covers all three trifecta legs (input surface + data access + exfil sink). No hooks/hooks.json is shipped, so TFA's mitigation logic finds no active guards and emits a CRITICAL "Lethal trifecta:" finding without downgrade.

Plugin marker is plugin.fixture.json (recognised by isPlugin()) rather than .claude-plugin/plugin.json. The latter is blocked by the plugin's own pre-write-pathguard hook, and plugin.fixture.json exists in isPlugin() specifically so example fixtures can self-mark without touching guarded paths.

Three independent assertions (3/3 must pass):
- direct trifecta present and CRITICAL
- finding mentions the exfil-helper component
- description confirms "no hook guards detected" (proves the mitigation path stayed inactive)

expected-findings.md documents the contract.

OWASP / framework mapping: ASI01, ASI02, ASI05, LLM01, LLM02, LLM06.

Docs updated: plugin README "Other runnable examples", plugin CLAUDE.md "Examples" table, CHANGELOG [Unreleased] Added. [skip-docs] is appropriate because examples don't change what the plugin appears to cover externally; the marketplace root README is unaffected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
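
The three assertions described in the message above can be sketched roughly as follows. This is an illustration only, not the example's actual run script: the finding shape (`severity`, `title`, `description`) and the helper names `checkTrifectaFindings` and `allPassed` are assumptions introduced here, and the real analyzer output may use different fields.

```javascript
// Hypothetical finding shape: { severity, title, description }.
// The real toxic-flow-analyzer output may be structured differently.
function checkTrifectaFindings(findings) {
  // Only findings whose title starts with the "Lethal trifecta:" prefix count.
  const trifecta = findings.filter((f) =>
    String(f.title ?? "").startsWith("Lethal trifecta:")
  );
  const text = (f) => `${f.title ?? ""} ${f.description ?? ""}`;

  return {
    // 1. A direct trifecta finding exists and was not downgraded.
    criticalTrifecta: trifecta.some((f) => f.severity === "CRITICAL"),
    // 2. The finding names the offending fixture component.
    mentionsComponent: trifecta.some((f) => text(f).includes("exfil-helper")),
    // 3. The description shows the mitigation path stayed inactive.
    noHookGuards: trifecta.some((f) =>
      String(f.description ?? "").toLowerCase().includes("no hook guards detected")
    ),
  };
}

// The walkthrough requires 3/3: every assertion must hold.
function allPassed(results) {
  return Object.values(results).every(Boolean);
}
```

Keeping the three checks as separate booleans, rather than one combined condition, lets a failing run report which part of the contract broke.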
2026-05-05 15:15:04 +02:00
commit 92fb0087fa
parent 15607b182e
8 changed files with 422 additions and 0 deletions
--- a/plugins/llm-security/CHANGELOG.md
+++ b/plugins/llm-security/CHANGELOG.md
@@ -77,6 +77,24 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
  `tests/e2e/attack-chain.test.mjs` is reused so the run-script
  source contains no literal destructive command. Maps to LLM06 /
  ASI01 / LLM01.
+- `examples/toxic-agent-demo/` — runnable demonstration of the
+  `toxic-flow-analyzer` (TFA) emitting a CRITICAL single-component
+  lethal-trifecta finding on a fixture plugin. The agent at
+  `fixture/agents/exfil-helper.fixture.md` declares
+  `tools: [Bash, Read, WebFetch]`, which alone covers all three
+  trifecta legs (input surface + data access + exfil sink), and the
+  fixture omits `hooks/hooks.json` so TFA's mitigation logic finds
+  no active guards and keeps severity at CRITICAL. The plugin marker
+  is `plugin.fixture.json` (recognised by `isPlugin()`) rather than
+  `.claude-plugin/plugin.json`, because the latter is blocked by the
+  plugin's own `pre-write-pathguard` hook — `plugin.fixture.json`
+  exists in `isPlugin()` specifically so example fixtures can
+  self-mark without touching guarded paths. The walkthrough invokes
+  `scan(targetPath, discovery, {})` with no `priorResults`, so the
+  classification comes from frontmatter + tool/keyword sets only;
+  the orchestrated `scan-orchestrator.mjs` flow exercises the
+  `enrichFromPriorResults()` pass that this example deliberately
+  skips. Maps to ASI01 / ASI02 / ASI05 / LLM01 / LLM02 / LLM06.

 ## [7.3.1] - 2026-05-01