fix(llm-security): A2 batch — JSDoc arithmetic + co-monotonicity test + CaMeL nedton

Closes A2 of v7.1.0 critical-review patch (docs/critical-review-2026-04-20.md): - B4 (severity JSDoc): 4 critical = 93, not 90. Fixed in scanners/lib/severity.mjs:23 and CHANGELOG.md v7.0.0 tier description. The actual computation has always been 93 (70 + log2(5)*10 = 93.22 → round); only the docs were wrong. - §5.4 co-monotonicity: new sweep test in tests/lib/severity.test.mjs over 15 representative count vectors. Asserts that (verdict, riskBand) agree under the v7.0.0 contract for every case — catches future drift between riskScore tiers, verdict cutoffs, and riskBand cutoffs. Includes a B4 anchor test (riskScore {critical: 4} === 93) so doc/code drift fails loudly. - B8 (CaMeL claims toned down): post-session-guard.mjs:646 comment block and CLAUDE.md:184 Defense Philosophy bullet now describe the implementation honestly — opportunistic byte-matching of truncated output fingerprints (first 200 bytes, SHA-256/16-hex), not semantic data-flow tracking. Trivially bypassed by mutation, summarisation, or re-encoding. Inspired by CaMeL (DeepMind 2025), but not a CaMeL capability-tracking implementation. Tests: 1495 → 1511 (+16: 15 sweep cases + 1 B4 anchor). All green.
2026-04-29 11:49:08 +02:00 · 2026-04-29 11:49:08 +02:00 · 4aa5318bcb
commit 4aa5318bcb
parent 36be963d4d
5 changed files with 84 additions and 6 deletions
--- a/plugins/llm-security/hooks/scripts/post-session-guard.mjs
+++ b/plugins/llm-security/hooks/scripts/post-session-guard.mjs
@ -643,12 +643,22 @@ function formatDriftWarning(jsd, firstTools, lastTools) {
 }

 // ---------------------------------------------------------------------------
-// CaMeL-inspired data flow tagging (DeepMind CaMeL, v5.0 S6)
+// Output fingerprint matching (inspired by CaMeL, DeepMind 2025; v5.0 S6)
+//
+// NOTE: This is opportunistic byte-matching of truncated output fingerprints,
+// not semantic data-flow tracking. We hash the first 200 bytes of tool output
+// (SHA-256, truncated to 16 hex chars) and check whether that exact tag
+// appears verbatim in the next tool input. Trivially bypassed by:
+//   - Mutating any of the first 200 bytes
+//   - Summarising the output before passing it on
+//   - Re-encoding (base64, JSON-escape, whitespace changes)
+// Inspired by CaMeL but NOT a CaMeL capability-tracking implementation.
 // ---------------------------------------------------------------------------

 /**
- * Compute a short data tag from tool output (first 200 chars, SHA-256 truncated to 16 hex).
- * Used for lightweight data provenance tracking.
+ * Compute a short output fingerprint from tool output (first 200 chars,
+ * SHA-256 truncated to 16 hex). Used for opportunistic byte-matching, not
+ * semantic provenance.
 * @param {string} text - tool output text
 * @returns {string} 16-char hex hash
 */