docs(llm-security): v7.1.1 — narrative coherence patch

Documents the v7.1.1 narrative-coherence patch in CLAUDE.md (mini-block
appended after the v7.0.0 paragraph) and CHANGELOG.md (new [7.1.1]
section per Keep a Changelog convention, placed above [7.1.0]).

Plan: .claude/plans/ultraplan-2026-04-29-report-coherence.md
Brief: .claude/ultraplan-spec-2026-04-29-report-coherence.md

Verification gates passed:
- npm test: 1522/1522 (was 1511; +11 from new narrative test)
- node --test tests/lib/severity.test.mjs: 86/86 (co-monotonicity sweep
  at lines 252-303 unchanged and green)
- node --test tests/scanners/skill-scanner-narrative.test.mjs: 11/11
- Orchestrator against fixture: WARNING / 48 / 1 HIGH (HITL trap caught
  correctly, no whiplash)
- SARIF inline check via toSARIF import: sarif-version 2.1.0, runs: 1
- Zero remaining v1 cutoffs in agent + template

Out of scope but flagged for Batch B (deferred to v7.2.0):
- commands/scan.md:113-114 retains v1 risk formula

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Kjell Tore Guttormsen 2026-04-29 12:57:54 +02:00
commit b18cb329ef
2 changed files with 80 additions and 0 deletions


@@ -4,6 +4,71 @@ All notable changes to the LLM Security Plugin are documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [7.1.1] - 2026-04-29
Patch release. Closes the narrative-coherence gap that survived v7.0.0:
the severity-dominated risk score corrected the numbers, but the agent
prompt continued to emit raw signals and walk them back as
"false positive" in prose, producing whiplash in the rendered report.
v7.1.1 makes severity assignment context-first at the prompt level and
adds a structural counter for suppressed signals.
### Fixed
- **Agent prompt context-first severity** (`agents/skill-scanner-agent.md`).
New Step 2.5 mandates that every signal has exactly one disposition —
suppressed (counted only) or reported (full finding) — with the split
happening before severity is assigned. The phrases "false positive",
"legitimate framework", and "no action required" are forbidden in
finding-body text and reserved for the new `## Suppressed Signals`
section. The Verdict Logic section now references the v2 tiers
and cutoffs from `severity.mjs` (BLOCK ≥65, WARNING ≥15), replacing
the stale v1 sum-and-cap formula that had been left in place after
the v7.0.0 numeric overhaul.
- **Template v1 → v2 risk constants** (`templates/unified-report.md`).
HTML-comment header at lines 55-66 now describes the v2 tiers and
cutoffs the engine has been using since v7.0.0. Also adds a
`### Narrative Audit` block inside Executive Summary surfacing
`summary.narrative_audit.suppressed_findings.{count, by_category}` for
reviewer transparency. The block does NOT affect verdict computation.
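
As a rough illustration of the v2 cutoffs quoted above, a minimal sketch might look like the following. Only the two thresholds (BLOCK ≥65, WARNING ≥15) come from this entry; `verdictFor` is a hypothetical helper rather than the actual `severity.mjs` export, and `ALLOW` is an assumed name for the tier below WARNING.

```javascript
// Cutoffs as quoted in this changelog entry.
const BLOCK_CUTOFF = 65;
const WARNING_CUTOFF = 15;

// Hypothetical helper: maps a numeric risk score to a verdict tier.
// The authoritative mapping lives in severity.mjs and may differ in detail.
function verdictFor(riskScore) {
  if (riskScore >= BLOCK_CUTOFF) return 'BLOCK';
  if (riskScore >= WARNING_CUTOFF) return 'WARNING';
  return 'ALLOW'; // assumed label for scores below the WARNING cutoff
}
```

Under these assumptions, the fixture score of 48 reported in the commit message falls in the WARNING band, which matches the verdict recorded there.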
### Added
- **`tests/scanners/skill-scanner-narrative.test.mjs`** — 11 assertions
against `tests/fixtures/skill-scan/hyperframes-like/`. Covers the
deterministic content extractor (exactly 1 HIGH HITL trap, ≥ 2
framework env-var refs, `has_injection` true on any signal,
`has_critical_injection` false), the entropy scanner (calibration block
present, ≤ 1 finding after suppression), an inline co-monotonicity
guard (`{ high: 1 }` → WARNING / High), and prompt-contract static
assertions on `agents/skill-scanner-agent.md` and
`templates/unified-report.md`.
- **`tests/fixtures/skill-scan/hyperframes-like/`** — synthetic skill
with HTML5 canvas / CSS keyframes / inline SVG data URI noise plus
exactly one genuine HITL trap signal. Committed (not gitignored).
`.llm-security-ignore` uses the canonical `SCANNER:glob` format
(`ENT:**/*.md`).
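
For readers unfamiliar with the `SCANNER:glob` rule format, one way to picture it is a split on the first colon. The sketch below is an assumption about how such a line could be parsed; `parseIgnoreRule` is a hypothetical name, not the plugin's actual loader, and it makes no claim about comment or blank-line handling in the real file format.

```javascript
// Hypothetical parser for a single `.llm-security-ignore` line.
// Splits on the FIRST colon so that globs containing ':' stay intact.
function parseIgnoreRule(line) {
  const trimmed = line.trim();
  const sep = trimmed.indexOf(':');
  // Reject lines with no colon, an empty scanner prefix, or an empty glob.
  if (sep <= 0 || sep === trimmed.length - 1) return null;
  return {
    scanner: trimmed.slice(0, sep),
    glob: trimmed.slice(sep + 1),
  };
}

const rule = parseIgnoreRule('ENT:**/*.md');
```

With the fixture's rule as input, `rule.scanner` is `'ENT'` and `rule.glob` is `'**/*.md'`.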
### Tests
- 1511 → 1522 tests (adds 11 new). Co-monotonicity sweep at
`tests/lib/severity.test.mjs:252-303` unchanged and green.
### Why
The Hyperframes.com re-test on 2026-04-19 produced `risk_score 20 / WARNING /
1 HIGH` numerically (correct after v7.0.0), but the agent listed 8
findings in prose and walked 6 back as "false positive". v7.1.1 closes
the structural gap that allowed this: severity is assigned ONCE,
context-first, and suppressed signals are categorical telemetry rather
than free-text walk-backs.
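
The "one disposition per signal" rule can be sketched in code, even though in practice it is enforced by the agent prompt rather than by a function. Everything below is illustrative: `splitByDisposition`, the `isBenignInContext` predicate, and the category names are hypothetical; only the `suppressed_findings.{count, by_category}` shape is taken from this entry.

```javascript
// Hypothetical sketch: each signal gets exactly one disposition.
// Suppressed signals are only counted per category; reported signals
// go on to severity assignment as full findings.
function splitByDisposition(signals, isBenignInContext) {
  const reported = [];
  const byCategory = {};
  for (const signal of signals) {
    if (isBenignInContext(signal)) {
      byCategory[signal.category] = (byCategory[signal.category] ?? 0) + 1;
    } else {
      reported.push(signal);
    }
  }
  const count = Object.values(byCategory).reduce((sum, n) => sum + n, 0);
  return { reported, suppressed_findings: { count, by_category: byCategory } };
}

// Made-up signals for illustration only.
const signals = [
  { id: 'sig-1', category: 'framework_env_var' },
  { id: 'sig-2', category: 'framework_env_var' },
  { id: 'sig-3', category: 'hitl_trap' },
];
const result = splitByDisposition(
  signals,
  (s) => s.category === 'framework_env_var',
);
```

Because the split happens before any severity is assigned, a suppressed signal never appears as a finding that later has to be walked back in prose; here `result.reported` holds only `sig-3`, and the two framework references surface as a count of 2.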
### Out of scope (flagged for Batch B)
- `commands/scan.md:113-114` retains the v1 risk formula and acts as a
third source of truth alongside agent prompt and severity.mjs. Will
be unified in v7.2.0.
## [7.1.0] - 2026-04-29
Patch release closing the highest-impact items from the v7.0.0 adversarial review


@@ -10,6 +10,21 @@ Security scanning, auditing, and threat modeling for Claude Code projects. 5 fra
See `docs/security-hardening-guide.md` §6 for the calibration story.
**v7.1.1 — Scan-report narrative coherence (patch).** Three coordinated
edits address the whiplash symptom that survived v7.0.0 (numbers fixed,
narrative still walked findings back as "false positive" in prose):
(a) `agents/skill-scanner-agent.md` Step 2.5 mandates context-first
severity assignment — every signal has exactly one disposition (suppressed
OR reported), no per-finding walk-back; (b) `templates/unified-report.md`
gains a `### Narrative Audit` block in Executive Summary surfacing
`summary.narrative_audit.suppressed_findings.{count, by_category}` from
the agent's trailing JSON; (c) both files updated from stale v1
risk-formula constants to the v2 model that has been authoritative in
`severity.mjs` since v7.0.0. The new `suppressed_findings` counter is
distinct from the existing top-level `output.suppressed` integer, which
tracks `.llm-security-ignore` rules.
Out-of-scope but flagged: `commands/scan.md:113-114` retains the v1
formula; resolution deferred to Batch B.
## Commands
| Command | Description |