From b18cb329ef43f4ae81a0343880ef6fc2d3ebc65b Mon Sep 17 00:00:00 2001
From: Kjell Tore Guttormsen
Date: Wed, 29 Apr 2026 12:57:54 +0200
Subject: [PATCH] =?UTF-8?q?docs(llm-security):=20v7.1.1=20=E2=80=94=20narr?=
 =?UTF-8?q?ative=20coherence=20patch?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Documents the v7.1.1 narrative-coherence patch in CLAUDE.md (mini-block
appended after the v7.0.0 paragraph) and CHANGELOG.md (new [7.1.1] section
per Keep a Changelog convention, placed above [7.1.0]).

Plan: .claude/plans/ultraplan-2026-04-29-report-coherence.md
Brief: .claude/ultraplan-spec-2026-04-29-report-coherence.md

Verification gates passed:
- npm test: 1522/1522 (was 1511; +11 from new narrative test)
- node --test tests/lib/severity.test.mjs: 86/86 (co-monotonicity sweep
  at lines 252-303 unchanged and green)
- node --test tests/scanners/skill-scanner-narrative.test.mjs: 11/11
- Orchestrator against fixture: WARNING / 48 / 1 HIGH (HITL trap caught
  correctly, no whiplash)
- SARIF inline check via toSARIF import: sarif-version 2.1.0, runs: 1
- Zero remaining v1 cutoffs in agent + template

Out of scope but flagged for Batch B (deferred to v7.2.0):
- commands/scan.md:113-114 retains v1 risk formula

Co-Authored-By: Claude Opus 4.7
---
 plugins/llm-security/CHANGELOG.md | 65 +++++++++++++++++++++++++++++++
 plugins/llm-security/CLAUDE.md    | 15 +++++++
 2 files changed, 80 insertions(+)

diff --git a/plugins/llm-security/CHANGELOG.md b/plugins/llm-security/CHANGELOG.md
index d8afde6..7ee94f6 100644
--- a/plugins/llm-security/CHANGELOG.md
+++ b/plugins/llm-security/CHANGELOG.md
@@ -4,6 +4,71 @@ All notable changes to the LLM Security Plugin are documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 
+## [7.1.1] - 2026-04-29
+
+Patch release. Closes the narrative-coherence gap that survived v7.0.0:
+the severity-dominated risk score corrected the numbers, but the agent
+prompt continued to emit raw signals and walk them back as
+"false positive" in prose, producing whiplash in the rendered report.
+v7.1.1 makes severity assignment context-first at the prompt level and
+adds a structural counter for suppressed signals.
+
+### Fixed
+
+- **Agent prompt context-first severity** (`agents/skill-scanner-agent.md`).
+  New Step 2.5 mandates that every signal has exactly one disposition —
+  suppressed (counted only) or reported (full finding) — with the split
+  happening before severity is assigned. The phrases "false positive",
+  "legitimate framework", and "no action required" are forbidden in
+  finding-body text and reserved for the new `## Suppressed Signals`
+  section. Verdict Logic section was also updated to reference v2 tiers
+  and cutoffs from `severity.mjs` (BLOCK ≥65, WARNING ≥15) — replaces
+  the stale v1 sum-and-cap formula that had been left in place after
+  the v7.0.0 numeric overhaul.
+- **Template v1 → v2 risk constants** (`templates/unified-report.md`).
+  HTML-comment header at lines 55-66 now describes the v2 tiers and
+  cutoffs the engine has been using since v7.0.0. Adds an
+  `### Narrative Audit` block inside Executive Summary surfacing
+  `summary.narrative_audit.suppressed_findings.{count, by_category}` for
+  reviewer transparency. The block does NOT affect verdict computation.
+
+### Added
+
+- **`tests/scanners/skill-scanner-narrative.test.mjs`** — 11 assertions
+  against `tests/fixtures/skill-scan/hyperframes-like/`. Covers
+  deterministic content-extractor (exactly 1 HIGH HITL trap, ≥ 2
+  framework env-var refs, has_injection true on any signal,
+  has_critical_injection false), entropy scanner (calibration block
+  present, ≤ 1 finding after suppression), inline co-monotonicity
+  guard (`{ high: 1 }` → WARNING / High), and prompt-contract static
+  assertions on `agents/skill-scanner-agent.md` and
+  `templates/unified-report.md`.
+- **`tests/fixtures/skill-scan/hyperframes-like/`** — synthetic skill
+  with HTML5 canvas / CSS keyframes / inline SVG data URI noise plus
+  exactly one genuine HITL trap signal. Committed (not gitignored).
+  `.llm-security-ignore` uses the canonical `SCANNER:glob` format
+  (`ENT:**/*.md`).
+
+### Tests
+
+- 1511 → 1522 tests (adds 11 new). Co-monotonicity sweep at
+  `tests/lib/severity.test.mjs:252-303` unchanged and green.
+
+### Why
+
+Hyperframes.com re-test on 2026-04-19 produced `risk_score 20 / WARNING /
+1 HIGH` numerically (correct after v7.0.0) but the agent listed 8
+findings in prose and walked 6 back as "false positive". v7.1.1 closes
+the structural gap that allowed this: severity is assigned ONCE,
+context-first, and suppressed signals are categorical telemetry rather
+than free-text walk-backs.
+
+### Out of scope (flagged for Batch B)
+
+- `commands/scan.md:113-114` retains the v1 risk formula and acts as a
+  third source of truth alongside agent prompt and severity.mjs. Will
+  be unified in v7.2.0.
+
 ## [7.1.0] - 2026-04-29
 
 Patch release closing the highest-impact items from the v7.0.0 adversarial review
diff --git a/plugins/llm-security/CLAUDE.md b/plugins/llm-security/CLAUDE.md
index d99b0db..59f118e 100644
--- a/plugins/llm-security/CLAUDE.md
+++ b/plugins/llm-security/CLAUDE.md
@@ -10,6 +10,21 @@ Security scanning, auditing, and threat modeling for Claude Code projects. 5 fra
 
 See `docs/security-hardening-guide.md` §6 for the calibration story.
 
+**v7.1.1 — Scan-rapport narrative coherence (patch).** Three coordinated
+edits address the whiplash symptom that survived v7.0.0 (numbers fixed,
+narrative still walked findings back as "false positive" in prose):
+(a) `agents/skill-scanner-agent.md` Step 2.5 mandates context-first
+severity assignment — every signal has exactly one disposition (suppressed
+OR reported), no per-finding walk-back; (b) `templates/unified-report.md`
+gains a `### Narrative Audit` block in Executive Summary surfacing
+`summary.narrative_audit.suppressed_findings.{count, by_category}` from
+the agent's trailing JSON; (c) both files updated from stale v1
+risk-formula constants to the v2 model that has been authoritative in
+`severity.mjs` since v7.0.0. Counter is distinct from the existing
+top-level `output.suppressed` (`.llm-security-ignore` rule integer).
+Out-of-scope but flagged: `commands/scan.md:113-114` retains the v1
+formula; resolution deferred to Batch B.
+
 ## Commands
 
 | Command | Description |