fix(llm-security): A2 batch — JSDoc arithmetic + co-monotonicity test + CaMeL tone-down
Closes A2 of v7.1.0 critical-review patch (docs/critical-review-2026-04-20.md):
- B4 (severity JSDoc): 4 critical = 93, not 90. Fixed in scanners/lib/severity.mjs:23
and CHANGELOG.md v7.0.0 tier description. The actual computation has always been
93 (70 + log2(5)*10 = 93.22 → round); only the docs were wrong.
- §5.4 co-monotonicity: new sweep test in tests/lib/severity.test.mjs over 15
representative count vectors. Asserts that (verdict, riskBand) agree under the
v7.0.0 contract for every case — catches future drift between riskScore tiers,
verdict cutoffs, and riskBand cutoffs. Includes a B4 anchor test (riskScore
{critical: 4} === 93) so doc/code drift fails loudly.
- B8 (CaMeL claims toned down): post-session-guard.mjs:646 comment block and
CLAUDE.md:184 Defense Philosophy bullet now describe the implementation
honestly — opportunistic byte-matching of truncated output fingerprints
(first 200 bytes, SHA-256/16-hex), not semantic data-flow tracking.
Trivially bypassed by mutation, summarisation, or re-encoding. Inspired by
CaMeL (DeepMind 2025), but not a CaMeL capability-tracking implementation.
Tests: 1495 → 1511 (+16: 15 sweep cases + 1 B4 anchor). All green.
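
The B4 arithmetic and the co-monotonicity idea can be sketched as follows. This is a minimal illustration, not the code in scanners/lib/severity.mjs: `criticalRiskScore` assumes the critical tier is `70 + log2(count + 1) * 10` rounded to the nearest integer (inferred only from the `70 + log2(5)*10` figure quoted above for 4 criticals), and `isCoMonotone`, the rank fields, and the sample data are hypothetical names for one plausible reading of "verdict and riskBand agree".

```javascript
// Assumed critical-tier formula, inferred from the commit message only.
// Defined here for criticalCount >= 1; other tiers are out of scope.
function criticalRiskScore(criticalCount) {
  return Math.round(70 + Math.log2(criticalCount + 1) * 10);
}

// Hypothetical co-monotonicity check: given ordinal ranks for verdict and
// riskBand, no pair of cases may be ordered one way by verdict and the
// opposite way by riskBand.
function isCoMonotone(cases) {
  for (const a of cases) {
    for (const b of cases) {
      const verdictDelta = a.verdictRank - b.verdictRank;
      const bandDelta = a.bandRank - b.bandRank;
      if (verdictDelta * bandDelta < 0) return false;
    }
  }
  return true;
}

// Illustrative data: ranks are made up for the example.
const consistent = [
  { verdictRank: 0, bandRank: 0 },
  { verdictRank: 1, bandRank: 1 },
  { verdictRank: 2, bandRank: 3 },
];
const drifted = [
  { verdictRank: 1, bandRank: 2 },
  { verdictRank: 2, bandRank: 1 }, // stricter verdict, weaker band
];

console.log(criticalRiskScore(4));     // 93 (93.22 rounded)
console.log(isCoMonotone(consistent)); // true
console.log(isCoMonotone(drifted));    // false
```

A pairwise check like this is quadratic in the number of cases, which is negligible for a sweep of 15 count vectors.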
parent 36be963d4d
commit 4aa5318bcb
5 changed files with 84 additions and 6 deletions
@@ -643,12 +643,22 @@ function formatDriftWarning(jsd, firstTools, lastTools) {
 }
 
 // ---------------------------------------------------------------------------
-// CaMeL-inspired data flow tagging (DeepMind CaMeL, v5.0 S6)
+// Output fingerprint matching (inspired by CaMeL, DeepMind 2025; v5.0 S6)
+//
+// NOTE: This is opportunistic byte-matching of truncated output fingerprints,
+// not semantic data-flow tracking. We hash the first 200 bytes of tool output
+// (SHA-256, truncated to 16 hex chars) and check whether that exact tag
+// appears verbatim in the next tool input. Trivially bypassed by:
+// - Mutating any of the first 200 bytes
+// - Summarising the output before passing it on
+// - Re-encoding (base64, JSON-escape, whitespace changes)
+// Inspired by CaMeL but NOT a CaMeL capability-tracking implementation.
 // ---------------------------------------------------------------------------
 
 /**
- * Compute a short data tag from tool output (first 200 chars, SHA-256 truncated to 16 hex).
- * Used for lightweight data provenance tracking.
+ * Compute a short output fingerprint from tool output (first 200 chars,
+ * SHA-256 truncated to 16 hex). Used for opportunistic byte-matching, not
+ * semantic provenance.
  * @param {string} text - tool output text
  * @returns {string} 16-char hex hash
  */
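
To make the limitation described in the comment block concrete, here is a minimal sketch of this kind of fingerprint matching. It is not the code from post-session-guard.mjs: the names `outputFingerprint` and `inputCarriesOutput` are hypothetical, the sketch slices the first 200 characters (following the JSDoc wording), and it assumes the match is done by fingerprinting each candidate input field and comparing tags. It uses CommonJS `require` so it runs as a standalone script.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: SHA-256 over the first 200 characters of the text,
// truncated to the first 16 hex characters.
function outputFingerprint(text) {
  return createHash('sha256')
    .update(text.slice(0, 200))
    .digest('hex')
    .slice(0, 16);
}

// Hypothetical matcher (an assumption about how tags are compared): true
// only if some input field has exactly the same fingerprint as the output.
function inputCarriesOutput(output, inputFields) {
  const tag = outputFingerprint(output);
  return inputFields.some((field) => outputFingerprint(field) === tag);
}

const output = 'API_KEY=abc123 fetched from https://example.invalid/config';

const results = {
  // Passed on byte-for-byte: detected.
  verbatim: inputCarriesOutput(output, [output]),
  // One leading space added: the fingerprint changes, so no match.
  mutated: inputCarriesOutput(output, [' ' + output]),
  // Base64 re-encoding: same information, different bytes, no match.
  reencoded: inputCarriesOutput(output, [
    Buffer.from(output, 'utf8').toString('base64'),
  ]),
};

console.log(results);
```

Only the verbatim case matches, which is why the comment describes the check as opportunistic byte-matching rather than data-flow tracking.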