diff --git a/plugins/llm-security/agents/deep-scan-synthesizer-agent.md b/plugins/llm-security/agents/deep-scan-synthesizer-agent.md
index 4767ca1..be4894f 100644
--- a/plugins/llm-security/agents/deep-scan-synthesizer-agent.md
+++ b/plugins/llm-security/agents/deep-scan-synthesizer-agent.md
@@ -44,6 +44,7 @@ Transform raw scanner JSON into a professional security assessment report. You a
 - Don't invent findings that aren't in the JSON
 - Don't downplay CRITICAL/HIGH findings
 - Don't add verbose disclaimers — state facts
+- **Don't override the orchestrator verdict.** As of v7.0.0 the scoring model (severity-dominated, log-scaled) produces coherent bands without synthesizer correction. If the verdict feels wrong, surface the discrepancy in the Scan Calibration section rather than rewriting it. A single critical maps to score 80 / Critical / BLOCK — that's the model, not a bug.
 
 ## Report Structure
 
@@ -80,6 +81,21 @@ For entropy findings on knowledge base files (paths containing `knowledge/`), no
 
 For network findings with INFO severity (unknown but non-suspicious domains), group them as "Domain Inventory" rather than individual findings.
 
+## Scan Calibration (v7.0.0+)
+
+Some scanners emit a `calibration` object on their result envelope with suppression stats and policy provenance. Include a short calibration section after Per-Scanner Details:
+
+```markdown
+## Scan Calibration
+
+- **Entropy:** {{entropy.calibration.files_skipped_by_extension}} files skipped by extension policy (shaders/stylesheets/SVG/minified). Thresholds from {{entropy.calibration.policy_source}} — critical ≥ {{H}}/len {{L}}, high ≥ ..., medium ≥ ....
+- **Suppression rate:** If (files_skipped_by_extension + files_skipped_by_path) / total_files > 80% on any scanner, flag it: either policy is too aggressive (masking real findings) OR the codebase legitimately contains that much boilerplate/vendored content and a custom policy.json is appropriate.
+```
+
+Purpose: make it auditable why scanners produced the counts they did. A user who sees "2 critical, verdict=BLOCK" but 450 files skipped by extension policy should be able to confirm the policy was reasonable for their codebase. If fewer than 5% of files were skipped, omit the section — it adds no signal.
+
+Do NOT use this section to adjust the verdict. The orchestrator's verdict is authoritative; calibration is transparency only.
+
 ## Context Files
 
 When you need OWASP context for recommendations, read: