From 915aca69e47e9a28e130dae143e9147ca7a6600d Mon Sep 17 00:00:00 2001
From: Kjell Tore Guttormsen <ktg@humanize.no>
Date: Sun, 19 Apr 2026 22:04:29 +0200
Subject: [PATCH] =?UTF-8?q?feat(llm-security):=20v7.0.0=20commit=205=20?=
 =?UTF-8?q?=E2=80=94=20synthesizer=20scan=20calibration=20section?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Makes suppression stats visible in the deep-scan report so users can
audit why the scanner produced the counts it did. Before: the
synthesizer would acknowledge "true risk is High, not Extreme" in prose
while the verdict stayed BLOCK/Extreme, so the report contradicted
itself. After Commit 1 the orchestrator verdict is coherent on its own,
so the synthesizer's job shrinks to transparency.

- Adds 'Scan Calibration' section instruction consuming
  scanner.calibration.* fields (entropy files_skipped_by_extension,
  policy_source, thresholds).
- Heuristic: omit the section if < 5% of files skipped (no signal).
  Flag the section if > 80% skipped (policy may be too aggressive).
- Explicit 'Don't override verdict' directive in DON'T DO list.
  Discrepancy goes in calibration, not in a rewritten dashboard.
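
The omit/flag heuristic above can be sketched as follows. This is an
illustrative sketch only, not code from the plugin: the function name,
the constant names, and the "omit"/"include"/"flag" labels are
hypothetical, and it assumes the 5% rule uses the same skipped/total
ratio as the 80% rule.

```python
from typing import Literal

# Thresholds taken from the heuristic in this commit message.
OMIT_BELOW = 0.05   # under 5% skipped: the section adds no signal
FLAG_ABOVE = 0.80   # over 80% skipped: policy may be too aggressive


def calibration_action(
    files_skipped_by_extension: int,
    files_skipped_by_path: int,
    total_files: int,
) -> Literal["omit", "include", "flag"]:
    """Decide how the synthesizer treats the Scan Calibration section.

    Hypothetical helper: the return labels are illustrative and are
    not part of the scanner's result envelope.
    """
    if total_files <= 0:
        # Nothing was scanned, so there is no suppression rate to report.
        return "omit"

    skipped = files_skipped_by_extension + files_skipped_by_path
    rate = skipped / total_files

    if rate < OMIT_BELOW:
        return "omit"
    if rate > FLAG_ABOVE:
        return "flag"
    return "include"


if __name__ == "__main__":
    # 450 of 500 files skipped gives a 90% rate, which is above 80%.
    print(calibration_action(450, 0, 500))
```

In either non-omit case the verdict itself is left untouched; the
result only controls whether the calibration section appears and
whether it carries a warning.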

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .../agents/deep-scan-synthesizer-agent.md        | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/plugins/llm-security/agents/deep-scan-synthesizer-agent.md b/plugins/llm-security/agents/deep-scan-synthesizer-agent.md
index 4767ca1..be4894f 100644
--- a/plugins/llm-security/agents/deep-scan-synthesizer-agent.md
+++ b/plugins/llm-security/agents/deep-scan-synthesizer-agent.md
@@ -44,6 +44,7 @@ Transform raw scanner JSON into a professional security assessment report. You a
 - Don't invent findings that aren't in the JSON
 - Don't downplay CRITICAL/HIGH findings
 - Don't add verbose disclaimers — state facts
+- **Don't override the orchestrator verdict.** As of v7.0.0 the scoring model (severity-dominated, log-scaled) produces coherent bands without synthesizer correction. If the verdict feels wrong, surface the discrepancy in the Scan Calibration section rather than rewriting the verdict. A single critical finding maps to score 80 / Critical / BLOCK; that's the model, not a bug.
 
 ## Report Structure
 
@@ -80,6 +81,21 @@ For entropy findings on knowledge base files (paths containing `knowledge/`), no
 
 For network findings with INFO severity (unknown but non-suspicious domains), group them as "Domain Inventory" rather than individual findings.
 
+## Scan Calibration (v7.0.0+)
+
+Some scanners emit a `calibration` object on their result envelope with suppression stats and policy provenance. Include a short calibration section after Per-Scanner Details:
+
+```markdown
+## Scan Calibration
+
+- **Entropy:** {{entropy.calibration.files_skipped_by_extension}} files skipped by extension policy (shaders/stylesheets/SVG/minified). Thresholds from {{entropy.calibration.policy_source}} — critical ≥ {{H}}/len {{L}}, high ≥ ..., medium ≥ ....
+- **Suppression rate:** If (files_skipped_by_extension + files_skipped_by_path) / total_files > 80% on any scanner, flag it: either policy is too aggressive (masking real findings) OR the codebase legitimately contains that much boilerplate/vendored content and a custom policy.json is appropriate.
+```
+
+Purpose: make it auditable why scanners produced the counts they did. A user who sees "2 critical, verdict=BLOCK" but 450 files skipped by extension policy should be able to confirm the policy was reasonable for their codebase. If fewer than 5% of files were skipped, omit the section — it adds no signal.
+
+Do NOT use this section to adjust the verdict. The orchestrator's verdict is authoritative; calibration is transparency only.
+
 ## Context Files
 
 When you need OWASP context for recommendations, read: