docs(llm-security): v7.1.1 — narrative coherence patch

Documents the v7.1.1 narrative-coherence patch in CLAUDE.md (mini-block
appended after the v7.0.0 paragraph) and CHANGELOG.md (new [7.1.1] section
per Keep a Changelog convention, placed above [7.1.0]).

Plan: .claude/plans/ultraplan-2026-04-29-report-coherence.md
Brief: .claude/ultraplan-spec-2026-04-29-report-coherence.md

Verification gates passed:
- npm test: 1522/1522 (was 1511; +11 from new narrative test)
- node --test tests/lib/severity.test.mjs: 86/86 (co-monotonicity sweep at
  lines 252-303 unchanged and green)
- node --test tests/scanners/skill-scanner-narrative.test.mjs: 11/11
- Orchestrator against fixture: WARNING / 48 / 1 HIGH (HITL trap caught
  correctly, no whiplash)
- SARIF inline check via toSARIF import: sarif-version 2.1.0, runs: 1
- Zero remaining v1 cutoffs in agent + template

Out of scope but flagged for Batch B (deferred to v7.2.0):
- commands/scan.md:113-114 retains v1 risk formula

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 12:57:54 +02:00
commit b18cb329ef
parent 5cfbc70472
2 changed files with 80 additions and 0 deletions
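
The verification gates above report the fixture verdict as `WARNING / 48 / 1 HIGH`, and the changelog entry in this commit cites the v2 cutoffs as BLOCK ≥65 and WARNING ≥15. A minimal sketch of how a score could map to a verdict under those two cutoffs follows. Only the two thresholds come from the commit; the function name `verdictFor`, the `CUTOFFS` constant, and the `ALLOW` label for scores below the WARNING cutoff are hypothetical and do not describe the actual `severity.mjs` API.

```javascript
// Hypothetical sketch: only the two cutoffs (BLOCK >= 65, WARNING >= 15)
// are taken from the changelog. All names here are illustrative assumptions.
const CUTOFFS = Object.freeze({ BLOCK: 65, WARNING: 15 });

function verdictFor(riskScore) {
  if (!Number.isFinite(riskScore) || riskScore < 0) {
    throw new RangeError(`invalid risk score: ${riskScore}`);
  }
  // Check the stricter cutoff first so a score of 65+ never falls through.
  if (riskScore >= CUTOFFS.BLOCK) return "BLOCK";
  if (riskScore >= CUTOFFS.WARNING) return "WARNING";
  return "ALLOW"; // assumed label for scores below the WARNING cutoff
}

console.log(verdictFor(48)); // fixture score from the verification gates
console.log(verdictFor(20)); // score cited in the changelog "Why" section
```

Under these assumptions both the fixture score (48) and the re-test score (20) land in the WARNING band, which matches the verdicts the commit reports.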
--- a/plugins/llm-security/CHANGELOG.md
+++ b/plugins/llm-security/CHANGELOG.md
@@ -4,6 +4,71 @@ All notable changes to the LLM Security Plugin are documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

+## [7.1.1] - 2026-04-29
+
+Patch release. Closes the narrative-coherence gap that survived v7.0.0:
+the severity-dominated risk score corrected the numbers, but the agent
+prompt continued to emit raw signals and walk them back as
+"false positive" in prose, producing whiplash in the rendered report.
+v7.1.1 makes severity assignment context-first at the prompt level and
+adds a structural counter for suppressed signals.
+
+### Fixed
+
+- **Agent prompt context-first severity** (`agents/skill-scanner-agent.md`).
+  New Step 2.5 mandates that every signal has exactly one disposition —
+  suppressed (counted only) or reported (full finding) — with the split
+  happening before severity is assigned. The phrases "false positive",
+  "legitimate framework", and "no action required" are forbidden in
+  finding-body text and reserved for the new `## Suppressed Signals`
+  section. The Verdict Logic section was also updated to reference the v2
+  tiers and cutoffs from `severity.mjs` (BLOCK ≥65, WARNING ≥15), replacing
+  the stale v1 sum-and-cap formula that had been left in place after
+  the v7.0.0 numeric overhaul.
+- **Template v1 → v2 risk constants** (`templates/unified-report.md`).
+  HTML-comment header at lines 55-66 now describes the v2 tiers and
+  cutoffs the engine has been using since v7.0.0. Adds an
+  `### Narrative Audit` block inside Executive Summary surfacing
+  `summary.narrative_audit.suppressed_findings.{count, by_category}` for
+  reviewer transparency. The block does NOT affect verdict computation.
+
+### Added
+
+- **`tests/scanners/skill-scanner-narrative.test.mjs`** — 11 assertions
+  against `tests/fixtures/skill-scan/hyperframes-like/`. Covers
+  deterministic content-extractor (exactly 1 HIGH HITL trap, ≥ 2
+  framework env-var refs, has_injection true on any signal,
+  has_critical_injection false), entropy scanner (calibration block
+  present, ≤ 1 finding after suppression), inline co-monotonicity
+  guard (`{ high: 1 }` → WARNING / High), and prompt-contract static
+  assertions on `agents/skill-scanner-agent.md` and
+  `templates/unified-report.md`.
+- **`tests/fixtures/skill-scan/hyperframes-like/`** — synthetic skill
+  with HTML5 canvas / CSS keyframes / inline SVG data URI noise plus
+  exactly one genuine HITL trap signal. Committed (not gitignored).
+  `.llm-security-ignore` uses the canonical `SCANNER:glob` format
+  (`ENT:**/*.md`).
+
+### Tests
+
+- 1511 → 1522 tests (adds 11 new). Co-monotonicity sweep at
+  `tests/lib/severity.test.mjs:252-303` unchanged and green.
+
+### Why
+
+Hyperframes.com re-test on 2026-04-19 produced `risk_score 20 / WARNING /
+1 HIGH` numerically (correct after v7.0.0) but the agent listed 8
+findings in prose and walked 6 back as "false positive". v7.1.1 closes
+the structural gap that allowed this: severity is assigned ONCE,
+context-first, and suppressed signals are categorical telemetry rather
+than free-text walk-backs.
+
+### Out of scope (flagged for Batch B)
+
+- `commands/scan.md:113-114` retains the v1 risk formula and acts as a
+  third source of truth alongside agent prompt and severity.mjs. Will
+  be unified in v7.2.0.
+
 ## [7.1.0] - 2026-04-29

 Patch release closing the highest-impact items from the v7.0.0 adversarial review
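
The [7.1.1] entry above describes Step 2.5 as giving every signal exactly one disposition, suppressed (counted only) or reported (full finding), with the split happening before severity is assigned. A minimal sketch of that rule follows. In the plugin this behavior is specified in the agent prompt, not in code, so the signal shape, the category names, the `triageSignals` function, and the `isBenignContext` predicate are all hypothetical. Only the output path `summary.narrative_audit.suppressed_findings.{count, by_category}` is taken from the changelog.

```javascript
// Hypothetical sketch of the "exactly one disposition" rule.
// Signal fields, category names, and the predicate are assumptions.
function triageSignals(signals, isBenignContext) {
  const reported = [];
  const byCategory = {};
  let count = 0;

  for (const signal of signals) {
    if (isBenignContext(signal)) {
      // Suppressed: counted only, never rendered as a finding.
      byCategory[signal.category] = (byCategory[signal.category] ?? 0) + 1;
      count += 1;
    } else {
      // Reported: becomes a full finding; severity is assigned after this split.
      reported.push(signal);
    }
  }

  return {
    reported,
    summary: {
      narrative_audit: {
        suppressed_findings: { count, by_category: byCategory },
      },
    },
  };
}

// Usage with made-up signals: two framework-noise signals and one real trap.
const result = triageSignals(
  [
    { id: "s1", category: "framework_env_var" },
    { id: "s2", category: "framework_env_var" },
    { id: "s3", category: "hitl_trap" },
  ],
  (signal) => signal.category === "framework_env_var",
);
console.log(JSON.stringify(result.summary));
```

Because each signal takes exactly one branch, the reported list and the suppressed count always add up to the number of input signals, which leaves no room for a finding that is both listed and walked back.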
--- a/plugins/llm-security/CLAUDE.md
+++ b/plugins/llm-security/CLAUDE.md
@@ -10,6 +10,21 @@ Security scanning, auditing, and threat modeling for Claude Code projects. 5 fra

 See `docs/security-hardening-guide.md` §6 for the calibration story.

+**v7.1.1 — Scan-report narrative coherence (patch).** Three coordinated
+edits address the whiplash symptom that survived v7.0.0 (numbers fixed,
+narrative still walked findings back as "false positive" in prose):
+(a) `agents/skill-scanner-agent.md` Step 2.5 mandates context-first
+severity assignment — every signal has exactly one disposition (suppressed
+OR reported), no per-finding walk-back; (b) `templates/unified-report.md`
+gains a `### Narrative Audit` block in Executive Summary surfacing
+`summary.narrative_audit.suppressed_findings.{count, by_category}` from
+the agent's trailing JSON; (c) both files updated from stale v1
+risk-formula constants to the v2 model that has been authoritative in
+`severity.mjs` since v7.0.0. Counter is distinct from the existing
+top-level `output.suppressed` (`.llm-security-ignore` rule integer).
+Out-of-scope but flagged: `commands/scan.md:113-114` retains the v1
+formula; resolution deferred to Batch B.
+
 ## Commands

 | Command | Description |