feat(post-session-guard): E17 — configurable escalation window + 20-call MEDIUM advisory

Critical-review §4 E17 finding: pre-v7.2.0 the delegation-after-input
advisory fired only within a 5-call window. Attackers who deliberately
waited 6+ calls before delegating bypassed detection. Window was also
hardcoded — operators couldn't tune it for their environment.

Two coordinated changes:

1. LLM_SECURITY_ESCALATION_WINDOW env var (primary window override)
   - parseInt(env) || getPolicyValue('trifecta', 'escalation_window', 5)
   - Mirrors the established pattern from
     LLM_SECURITY_TRIFECTA_MODE et al.
   - Setting env=3 narrows; env=8 expands.

2. Secondary 20-call MEDIUM advisory (slow-burn variant)
   - DELEGATION_ESCALATION_WINDOW_MEDIUM = 20 (hardcoded — same value
     for all operators; tunable in a future patch if needed)
   - checkEscalationAfterInput now returns `tier: 'primary'|'secondary'|null`
   - formatEscalationWarning emits a different message for secondary —
     mentions "slow-burn", references env-var, distinct from the
     primary "DeepMind Category 4" framing

Hook reads max(WINDOW_SIZE, secondary+5) entries to cover the wider
window. Existing duplicate-suppression (`escalation_warning` state
entry) covers both tiers. Audit-trail event captures `tier` field.

Tests: +5 cases in tests/hooks/post-session-guard.test.mjs:
- secondary window catches 9-call distance (slow-burn)
- secondary boundary at exactly 20 calls
- primary regression guard (1-call distance)
- env=3 narrows primary (4-call distance becomes secondary)
- env=8 expands primary (7-call distance stays primary)

Updated existing test "does NOT trigger when input_source is >5 calls
ago" — now requires >20 calls (secondary window catches 6-20).

Suite: 1644 → 1672 (+28 from new tests + extended scope). All green.

CLAUDE.md hooks table updated to document both windows and the env var.
This commit is contained in:
Kjell Tore Guttormsen 2026-04-29 14:26:18 +02:00
commit f0a1d4024a
3 changed files with 215 additions and 23 deletions

View file

@ -70,7 +70,7 @@ formula; resolution deferred to Batch B.
| `pre-install-supply-chain.mjs` | PreToolUse | `Bash` | Block compromised packages across ALL ecosystems. Bash evasion normalization before gate matching |
| `pre-write-pathguard.mjs` | PreToolUse | `Write` | Block writes to .env, .ssh/, .aws/, credentials, settings |
| `post-mcp-verify.mjs` | PostToolUse | — (all) | Injection scan on ALL tool output (incl. MEDIUM patterns, HITL traps, sub-agent spawn, NL indirection, cognitive load, hybrid P2SQL/recursive/XSS). HTML content trap detection. Bash-specific: secrets/URLs/size. MCP: description drift detection (MCP05), per-tool volume tracking |
| `post-session-guard.mjs` | PostToolUse | — (all) | Runtime trifecta detection (Rule of Two). Sliding window (20 calls) + 100-call long-horizon. MCP-concentrated trifecta (same server = elevated severity). Sensitive path + exfil detection. Slow-burn trifecta (legs >50 calls apart = MEDIUM). Behavioral drift detection (Jensen-Shannon divergence). CaMeL-inspired data flow tagging (SHA-256 provenance tracking, output→input linking). Mode: `LLM_SECURITY_TRIFECTA_MODE=block\|warn\|off` (default: warn). Cumulative data volume tracking (100KB/500KB/1MB thresholds). Sub-agent delegation tracking (Task/Agent tools): escalation-after-input advisory when delegation occurs within 5 calls of untrusted input (DeepMind Agent Traps kat. 4) |
| `post-session-guard.mjs` | PostToolUse | — (all) | Runtime trifecta detection (Rule of Two). Sliding window (20 calls) + 100-call long-horizon. MCP-concentrated trifecta (same server = elevated severity). Sensitive path + exfil detection. Slow-burn trifecta (legs >50 calls apart = MEDIUM). Behavioral drift detection (Jensen-Shannon divergence). CaMeL-inspired data flow tagging (SHA-256 provenance tracking, output→input linking). Mode: `LLM_SECURITY_TRIFECTA_MODE=block\|warn\|off` (default: warn). Cumulative data volume tracking (100KB/500KB/1MB thresholds). Sub-agent delegation tracking (Task/Agent tools): escalation-after-input advisory when delegation occurs within `LLM_SECURITY_ESCALATION_WINDOW` calls (default 5) of untrusted input (DeepMind Agent Traps kat. 4); secondary 20-call MEDIUM advisory catches slow-burn variants outside the primary window (E17, v7.2.0) |
| `update-check.mjs` | UserPromptSubmit | — | Checks for newer versions (max 1x/24h, cached). Disable: `LLM_SECURITY_UPDATE_CHECK=off` |
> `pre-install-supply-chain.mjs` covers 7 package managers: npm/yarn/pnpm, pip/pip3/uv, brew, docker, go, cargo, gem. Per-ecosystem blocklists, age gate (<72h), npm audit (critical=block, high=warn), PyPI API inspection, Levenshtein typosquat detection, Docker image verification.