test(llm-security): add e2e suite proving framework works as coordinated system

Three new files in tests/e2e/ (45 tests, 1777 -> 1822):

- attack-chain.test.mjs (17): full hook stack against attack payloads
  in sequence -- prompt injection at the gate; T1/T5/T8 bash evasions;
  pathguard on .env / .ssh; secrets hook on AWS-shaped keys and PEM
  headers; markdown link-title and HTML-comment poisoning in tool
  output; trifecta accumulation over a single session with dedup on
  the next benign call.
- multi-session.test.mjs (9): state persistence across simulated
  session boundaries. Uses the fact that a hook child's process.ppid
  equals the test runner's process.pid, so writing the session state
  file directly simulates "previous session" history. Covers slow-burn
  trifecta (legs spread >50 calls), MCP cumulative description drift
  via LLM_SECURITY_MCP_CACHE_FILE override, and pre-compact transcript
  poisoning in warn / block / clean / missing-file modes.
- scan-pipeline.test.mjs (19): scan-orchestrator + all 10 scanners +
  toxic-flow correlator against poisoned-project (BLOCK / 95 / Extreme)
  and grade-a-project (WARNING / 48 / High). Asserts envelope shape,
  verdict, risk_score, severity counts, OWASP coverage, scanner
  enumeration, and a narrative-coherence cross-check that the BLOCK
  scan strictly outranks the WARNING scan along every axis.

Test files build credential-shaped payloads at runtime via
concatenation so they contain no literal matches for the
pre-edit-secrets regexes (memory rule
feedback_secrets_hook_test_fixtures.md).

Doc updates in same commit per marketplace policy:

- CLAUDE.md header: 1777+ -> 1822+ tests, mentions tests/e2e/
- README.md badge tests-1777 -> tests-1822, body text updated
- CHANGELOG.md: new [Unreleased] Added section describing scope

No version bump. No behavior changes outside tests/.
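
For reviewers unfamiliar with the runtime-concatenation convention mentioned above, here is a minimal sketch of the idea. It is not the code from tests/e2e/: the two regexes are assumed stand-ins for whatever pre-edit-secrets actually matches, and the helper names are hypothetical.

```javascript
// ASSUMPTION: illustrative stand-ins for the pre-edit-secrets patterns.
// The hook's real regexes are not shown in this commit.
const AWS_KEY_SHAPE = /AKIA[0-9A-Z]{16}/;
const PEM_HEADER_SHAPE = /-----BEGIN (?:RSA )?PRIVATE KEY-----/;

// Hypothetical helper: assemble an AWS-shaped access key ID from fragments,
// so the source file never contains a contiguous match for the pattern.
function buildAwsShapedKey() {
  const prefix = ['AK', 'IA'].join('');
  const body = 'EXAMPLE'.padEnd(16, '0'); // 16 chars drawn from [0-9A-Z]
  return prefix + body;
}

// Hypothetical helper: same approach for a PEM private-key header line.
function buildPemHeader() {
  const dashes = '-'.repeat(5);
  return dashes + ['BEGIN', 'RSA', 'PRIVATE', 'KEY'].join(' ') + dashes;
}

// The assembled payloads match the patterns only at runtime.
console.log(AWS_KEY_SHAPE.test(buildAwsShapedKey()));
console.log(PEM_HEADER_SHAPE.test(buildPemHeader()));
```

The fragments are split inside the fixed prefix and between the header words, so a scanner reading the file as text sees no complete credential shape, while the hook under test still receives a full payload.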
2026-05-05 12:06:57 +02:00
commit f835777c1e
parent a7a334c8d1
6 changed files with 974 additions and 3 deletions
--- a/plugins/llm-security/CHANGELOG.md
+++ b/plugins/llm-security/CHANGELOG.md
@@ -4,6 +4,32 @@ All notable changes to the LLM Security Plugin are documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

+## [Unreleased]
+
+### Added
+
+- `tests/e2e/` — three dedicated end-to-end suites that prove the framework
+  works as a coordinated system, not just as isolated units:
+  - `attack-chain.test.mjs` (17 tests) — full hook stack against attack
+    payloads in sequence: prompt injection at the gate; T1/T5/T8 bash
+    evasion; pathguard on `.env`/`.ssh`; secrets hook on AWS-shaped keys
+    and PEM headers; markdown link-title and HTML-comment poisoning in
+    tool output; trifecta accumulation over a single session.
+  - `multi-session.test.mjs` (9 tests) — state persistence across
+    simulated session boundaries: slow-burn trifecta with legs spread
+    over 50+ calls; MCP cumulative description drift across small
+    per-update changes that each fall under the 10% threshold but
+    cumulatively cross 25% from baseline; pre-compact-scan blocking
+    poisoned transcripts in block mode.
+  - `scan-pipeline.test.mjs` (19 tests) — orchestrator + all 10 scanners
+    + toxic-flow correlator against the `poisoned-project` and
+    `grade-a-project` fixtures: verdict, risk_score, risk_band, severity
+    counts, OWASP coverage, scanner enumeration, and a narrative-coherence
+    cross-check that BLOCK is genuinely worse than WARNING along every axis.
+- Test count: 1777 → 1822 (+45). All payloads matching credential regexes
+  are assembled at runtime via concatenation, so test files contain no
+  literal credential-shaped strings (compatible with `pre-edit-secrets`).
+
 ## [7.3.1] - 2026-05-01

 Stabilization patch. No behavior changes. Sets the public stance, tightens