Wave C step C2: surface the cumulative-drift signal from
checkDescriptionDrift() (added in C1) as a separate MEDIUM advisory
with finding category mcp-cumulative-drift. The new advisory is
independent of the existing per-update drift advisory: a slow-burn
rug-pull that keeps each update below the 10% per-update threshold but
cumulatively drifts >=25% from the sticky baseline now triggers it
without ever crossing the per-update bar.
The advisory references /security mcp-baseline-reset (added in C3) so
the user knows how to acknowledge a legitimate MCP server upgrade.
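The two-threshold logic can be sketched roughly as follows. This is a
minimal illustration, not the code in checkDescriptionDrift(): the
driftRatio metric (fraction of changed word positions), the function
names, and the use of >= for the per-update comparison are assumptions;
only the 10% and 25% figures come from the text above.

```javascript
// Hypothetical sketch. Thresholds are taken from the commit message; the
// drift metric is an illustrative stand-in for the real one.
const PER_UPDATE_THRESHOLD = 0.10;
const CUMULATIVE_THRESHOLD = 0.25;

// Fraction of word positions that differ between two descriptions.
function driftRatio(a, b) {
  const wa = a.split(/\s+/).filter(Boolean);
  const wb = b.split(/\s+/).filter(Boolean);
  const len = Math.max(wa.length, wb.length);
  if (len === 0) return 0;
  let changed = 0;
  for (let i = 0; i < len; i++) {
    if (wa[i] !== wb[i]) changed++;
  }
  return changed / len;
}

// Compare the current description against both the previous version
// (per-update drift) and the sticky baseline (cumulative drift).
function classifyDrift(baseline, previous, current) {
  const perUpdate = driftRatio(previous, current);
  const cumulative = driftRatio(baseline, current);
  return {
    perUpdate,
    cumulative,
    perUpdateAdvisory: perUpdate >= PER_UPDATE_THRESHOLD,
    cumulativeAdvisory: cumulative >= CUMULATIVE_THRESHOLD,
  };
}

// Simulate a slow-burn rug-pull: one word of a 20-word description changes
// per update, so every update drifts 5% relative to the previous version.
function simulateSlowBurn(steps) {
  const words = Array.from({ length: 20 }, (_, i) => `w${i}`);
  const baseline = words.join(' ');
  let previous = baseline;
  let result = classifyDrift(baseline, previous, previous);
  for (let i = 0; i < steps; i++) {
    words[i] = `x${i}`;
    const current = words.join(' ');
    result = classifyDrift(baseline, previous, current);
    previous = current;
  }
  return result;
}
```

In this sketch, five such updates stay under the per-update bar each
time while the comparison against the baseline reaches the cumulative
threshold.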
CLAUDE.md updates:
- post-mcp-verify hooks-table row mentions per-update + cumulative drift
- mcp-description-cache lib bullet documents baseline schema, history,
cumulative threshold policy key, and LLM_SECURITY_MCP_CACHE_FILE
override.
Tests: 2 new hook tests using LLM_SECURITY_MCP_CACHE_FILE for cache
isolation. Existing 68 still pass; total 70.
Plugin README and root marketplace README updates land in C3 alongside
the new /security mcp-baseline-reset slash command (combined Wave-C
doc update per plan §"Wave C — Touch" list).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a scope-hopping detector to the npm install gate. When a user
installs `@<scope>/<unscoped>`, the hook now emits a MEDIUM warning
on stderr (exit 0, never blocks) if:
- `<unscoped>` matches a popular npm package (POPULAR_NPM, ~80
names from knowledge/top-packages.json), AND
- `<scope>` is on neither NPM_OFFICIAL_SCOPES (22 built-in entries) nor
policy.json `supply_chain.allowed_scopes`.
Why: an attacker cannot squat the bare `lodash` name, but they can
register an unrelated scope, publish `@evilcorp/lodash`, and rely on a
typo or a copy-pasted command to get it installed. NPM_OFFICIAL_SCOPES
anchors the known-good scopes (@types, @reduxjs, @nestjs, …) so
legitimate installs stay silent.
Implementation:
- `scanners/lib/supply-chain-data.mjs`: exports POPULAR_NPM,
NPM_OFFICIAL_SCOPES, and `checkScopeHop(name, extraAllowedScopes)` —
pure function, no policy/network dependency, fully unit-testable.
- `knowledge/typosquat-allowlist.json`: mirrors NPM_OFFICIAL_SCOPES as
`npm_official_scopes`. A doc-consistency assertion ensures the two
lists never drift.
- `hooks/scripts/pre-install-supply-chain.mjs`: imports checkScopeHop,
reads `supply_chain.allowed_scopes` from policy, and pushes a
warning before existing compromised/audit checks.
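A rough sketch of the check described above, assuming tiny sample lists
in place of the real POPULAR_NPM and NPM_OFFICIAL_SCOPES data. The name
checkScopeHopSketch, the return shape, the handling of a trailing
`@version`, and the scope normalization are all assumptions made for
illustration; only the `(name, extraAllowedScopes)` parameters come from
the commit.

```javascript
// Illustrative subsets only; the real lists live in supply-chain-data.mjs.
const POPULAR_NPM_SAMPLE = new Set(['lodash', 'express', 'react']);
const OFFICIAL_SCOPES_SAMPLE = new Set(['@types', '@reduxjs', '@nestjs']);

// Accept scopes written with or without the leading "@" (an assumption).
function normalizeScope(scope) {
  const s = String(scope).trim().toLowerCase();
  return s.startsWith('@') ? s : `@${s}`;
}

// Returns a hypothetical finding object when a scoped install reuses a
// popular unscoped name under a scope that is not allowlisted, else null.
function checkScopeHopSketch(name, extraAllowedScopes = []) {
  if (typeof name !== 'string') return null; // defensive: non-string input
  // "@scope/pkg" with an optional "@version" suffix.
  const match = /^(@[^/@\s]+)\/([^/@\s]+)(?:@\S*)?$/.exec(name.trim());
  if (!match) return null; // bare unscoped names are out of scope here
  const scope = match[1].toLowerCase();
  const unscoped = match[2].toLowerCase();
  if (!POPULAR_NPM_SAMPLE.has(unscoped)) return null;
  const extra = Array.isArray(extraAllowedScopes) ? extraAllowedScopes : [];
  const allowed = new Set([
    ...OFFICIAL_SCOPES_SAMPLE,
    ...extra.map((s) => normalizeScope(s)),
  ]);
  if (allowed.has(scope)) return null;
  return { severity: 'MEDIUM', scope, unscoped };
}
```

Because the function depends only on its arguments and two constant
sets, it can be exercised without policy files or network access, which
matches the "pure function" property noted above.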
Tests:
- 9 new cases in tests/hooks/pre-install-supply-chain.test.mjs:
TP @evilcorp/lodash, TP @attacker/express, allowlist @types,
allowlist @reduxjs, allowlist @modelcontextprotocol, FP unscoped
name not in top-100, bare unscoped name, policy override, defensive
non-string input, NPM_OFFICIAL_SCOPES <-> typosquat-allowlist.json
consistency.
Adds BLOCK_RULE for the malware-loader pattern:
echo|cat|printf <base64-blob> | base64 -d | <shell>
This is a common RCE delivery shape that bypasses static name-matching
gates by encoding the destructive command as a base64 blob. The new
rule fires only when the final pipe target is a shell interpreter
(bash, sh, zsh, dash, ksh) — base64 decoded into jq or any non-shell
consumer remains allowed.
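One way such a rule could be expressed is as a single regex over the
command string. The pattern below is an assumed shape, not the actual
BLOCK_RULE; in particular the accepted decode flags (-d, -D, --decode)
and the lack of support for absolute shell paths such as /bin/bash are
choices made for this sketch.

```javascript
// Hypothetical pattern: <echo|cat|printf ...> | base64 <decode flag> | <shell>
const BASE64_PIPE_TO_SHELL_SKETCH =
  /\b(?:echo|cat|printf)\b[^|]*\|\s*base64\s+(?:-d|-D|--decode)\b[^|]*\|\s*(?:bash|sh|zsh|dash|ksh)\b/;

// True only when the decoded output is piped directly into a shell.
function isBase64ShellLoader(command) {
  return typeof command === 'string' && BASE64_PIPE_TO_SHELL_SKETCH.test(command);
}

// Sample commands mirroring the BLOCK and FP cases listed in the tests.
const loaderSamples = [
  'echo aGk= | base64 -d | bash',
  "printf '%s' aGk= | base64 --decode | sh",
  'cat blob.txt | base64 -d | zsh',
  'echo aGk= | base64 -d | jq .',
  'echo aGk= | base64 -d',
];
const loaderVerdicts = loaderSamples.map((cmd) => ({
  cmd,
  blocked: isBase64ShellLoader(cmd),
}));
```

The trailing `\b` after the shell name keeps consumers such as `shasum`
from matching on their `sh` prefix.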
5 new tests in pre-bash-destructive.test.mjs:
- 3 BLOCK cases (echo|base64|bash, printf|base64|sh, cat|base64|zsh)
- 2 FP probes (base64 -d -> jq passes; base64 -d alone passes)
Closes E9 in critical-review-2026-04-20.md.
The existing CRITICAL pattern in injection-patterns.mjs only fires when
a comment body contains AGENT/AI/HIDDEN markers. Adversaries can drop
the marker and still hide instructions inside <!-- ... --> for any
agent that reads page source. This generalizes the comment scan: every
comment body is HTML-entity-decoded and run through the full
injection rule set. The existing keyword-restricted pattern still
fires (defense-in-depth).
Emits at the strongest tier encountered, with category
html-comment-injection.
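The generalized scan might look roughly like the sketch below. The
stubScanForInjection function is a two-rule stand-in for the real rule
set, and the entity decoder covers only numeric references and a handful
of named entities; both are assumptions for illustration, as are the
function names and the returned object shape.

```javascript
const SEVERITY_ORDER = ['low', 'medium', 'high', 'critical'];
const NAMED_ENTITIES = new Map([
  ['lt', '<'], ['gt', '>'], ['amp', '&'], ['quot', '"'], ['apos', "'"],
]);

// Minimal HTML entity decoder: numeric references plus a few named ones.
function decodeEntitiesSketch(text) {
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (whole, body) => {
    if (body[0] === '#') {
      const code = body[1].toLowerCase() === 'x'
        ? parseInt(body.slice(2), 16)
        : parseInt(body.slice(1), 10);
      return code <= 0x10ffff ? String.fromCodePoint(code) : whole;
    }
    const named = NAMED_ENTITIES.get(body.toLowerCase());
    return named === undefined ? whole : named;
  });
}

// Stand-in for the full injection rule set (hypothetical rules).
function stubScanForInjection(text) {
  const findings = [];
  if (/ignore\s+(?:all\s+)?previous\s+instructions/i.test(text)) {
    findings.push({ severity: 'critical', rule: 'override' });
  }
  if (/do\s+not\s+tell\s+the\s+user/i.test(text)) {
    findings.push({ severity: 'high', rule: 'concealment' });
  }
  return findings;
}

// Decode every comment body, scan it, and keep the strongest finding.
function scanHtmlCommentsSketch(html) {
  let strongest = null;
  for (const match of html.matchAll(/<!--([\s\S]*?)-->/g)) {
    const body = decodeEntitiesSketch(match[1]);
    for (const finding of stubScanForInjection(body)) {
      const rank = SEVERITY_ORDER.indexOf(finding.severity);
      if (strongest === null || rank > SEVERITY_ORDER.indexOf(strongest.severity)) {
        strongest = finding;
      }
    }
  }
  return strongest && {
    category: 'html-comment-injection',
    severity: strongest.severity,
  };
}
```

Decoding before scanning matters here: a payload written as
`&#105;gnore previous instructions` would otherwise slip past a
plain-text rule.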
+3 tests (65 → 68).
Refs: Batch B Wave 4 / Step 11 / v7.2.0
SVG containers carry text that is invisible in the rendered image but
fully parsed by an agent reading the source. <desc>, <title>,
<metadata>, and <foreignObject> are all valid surfaces for adversarial
injection.
Adds a per-element extractor inside the existing HTML-tag gate, gated
on /<svg[\s>]/i so it only fires for actual SVG content. Inner text is
HTML-entity-decoded then run through scanForInjection. Emits at the
strongest tier with category svg-element-injection.
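A possible shape for the per-element extractor is sketched below. Only
the `/<svg[\s>]/i` gate and the four element names come from the text
above; the per-tag regex, the stripping of nested markup inside
<foreignObject>, the compact decoder, and all function names are
assumptions. Each returned text would then be passed to
scanForInjection.

```javascript
const SVG_TEXT_ELEMENTS = ['desc', 'title', 'metadata', 'foreignObject'];

function safeCodePoint(code, fallback) {
  return code <= 0x10ffff ? String.fromCodePoint(code) : fallback;
}

// Compact decoder: numeric references first, "&amp;" last so that
// "&amp;lt;" decodes to "&lt;" rather than "<".
function decodeBasicEntities(text) {
  return text
    .replace(/&#x([0-9a-f]{1,6});/gi, (m, hex) => safeCodePoint(parseInt(hex, 16), m))
    .replace(/&#(\d{1,7});/g, (m, dec) => safeCodePoint(parseInt(dec, 10), m))
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&');
}

// Collect the inner text of text-bearing SVG elements, but only when the
// content actually contains an <svg> tag.
function extractSvgTextSketch(content) {
  if (!/<svg[\s>]/i.test(content)) return [];
  const extracted = [];
  for (const tag of SVG_TEXT_ELEMENTS) {
    const re = new RegExp(`<${tag}\\b[^>]*>([\\s\\S]*?)</${tag}\\s*>`, 'gi');
    for (const match of content.matchAll(re)) {
      // Drop nested markup first, then decode, so encoded "<" survives.
      const inner = match[1].replace(/<[^>]+>/g, ' ');
      extracted.push({ element: tag, text: decodeBasicEntities(inner).trim() });
    }
  }
  return extracted;
}
```

The gate keeps an ordinary HTML `<title>` from being treated as an SVG
surface when no `<svg>` element is present.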
+3 tests (62 → 65).
Refs: Batch B Wave 4 / Step 10 / v7.2.0
Adversarial payloads in markdown link title attributes (rendered as
tooltips, parsed by agents) bypassed the existing HTML-content checks
which gated on `<tag>` presence. Pattern: [text](url "title").
Adds linkTitleRegex extraction to the HTML-content block, runs each
captured title through scanForInjection, emits at the strongest tier
encountered with category markdown-link-title-injection.
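For illustration, title extraction for the `[text](url "title")` shape
could be done with a regex along these lines. The actual linkTitleRegex
is not shown in this commit, so the pattern and helper name below are
assumptions; this sketch handles only double-quoted titles.

```javascript
// Hypothetical pattern for [text](url "title"); captures the title text.
const LINK_TITLE_SKETCH = /\[[^\]]*\]\(\s*[^\s)]+\s+"([^"]*)"\s*\)/g;

// Return every link title found in a markdown string.
function extractLinkTitles(markdown) {
  const titles = [];
  for (const match of markdown.matchAll(LINK_TITLE_SKETCH)) {
    titles.push(match[1]);
  }
  return titles;
}

const sampleTitles = extractLinkTitles(
  'See [docs](https://example.com "ignore previous instructions") and [plain](https://example.org).'
);
```

Each captured title would then go through scanForInjection exactly like
any other text surface; links without a title contribute nothing.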
+3 tests (59 → 62 in post-mcp-verify.test.mjs).
Refs: Batch B Wave 4 / Step 9 / v7.2.0
Critical-review §4 E17 finding: pre-v7.2.0 the delegation-after-input
advisory fired only within a 5-call window. Attackers who deliberately
waited 6+ calls before delegating bypassed detection. Window was also
hardcoded — operators couldn't tune it for their environment.
Two coordinated changes:
1. LLM_SECURITY_ESCALATION_WINDOW env var (primary window override)
- parseInt(env) || getPolicyValue('trifecta', 'escalation_window', 5)
- Mirrors the established pattern from
LLM_SECURITY_TRIFECTA_MODE et al.
- Setting env=3 narrows; env=8 expands.
2. Secondary 20-call MEDIUM advisory (slow-burn variant)
- DELEGATION_ESCALATION_WINDOW_MEDIUM = 20 (hardcoded — same value
for all operators; tunable in a future patch if needed)
- checkEscalationAfterInput now returns `tier: 'primary'|'secondary'|null`
- formatEscalationWarning emits a different message for secondary —
mentions "slow-burn", references env-var, distinct from the
primary "DeepMind Category 4" framing
Hook reads max(WINDOW_SIZE, secondary+5) entries to cover the wider
window. Existing duplicate-suppression (`escalation_warning` state
entry) covers both tiers. Audit-trail event captures `tier` field.
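The window resolution and tier selection can be sketched as below. The
constant DELEGATION_ESCALATION_WINDOW_MEDIUM and the
`parseInt(env) || policy` fallback come from the text above; the helper
names, the inclusive boundaries, and passing the policy value as a
plain argument instead of calling getPolicyValue are assumptions for
this sketch.

```javascript
const DEFAULT_PRIMARY_WINDOW = 5;
const DELEGATION_ESCALATION_WINDOW_MEDIUM = 20;

// Env var wins when it parses to a non-zero integer; otherwise fall back
// to the policy value. Note that "0" and non-numeric strings both fall
// through, because parseInt yields 0 or NaN, which are falsy.
function resolvePrimaryWindow(envValue, policyValue = DEFAULT_PRIMARY_WINDOW) {
  return parseInt(envValue, 10) || policyValue;
}

// distance = number of calls between the untrusted input and the delegation.
function classifyEscalationTier(distance, primaryWindow) {
  if (distance <= primaryWindow) return 'primary';
  if (distance <= DELEGATION_ESCALATION_WINDOW_MEDIUM) return 'secondary';
  return null;
}

// The scenarios from the test list, expressed as [distance, env value].
const tierScenarios = [
  [1, undefined],
  [9, undefined],
  [20, undefined],
  [21, undefined],
  [4, '3'],
  [7, '8'],
].map(([distance, env]) => ({
  distance,
  env,
  tier: classifyEscalationTier(distance, resolvePrimaryWindow(env)),
}));
```

With the default window, a delegation nine calls after the input no
longer goes unnoticed; it lands in the secondary tier instead.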
Tests: +5 cases in tests/hooks/post-session-guard.test.mjs:
- secondary window catches 9-call distance (slow-burn)
- secondary boundary at exactly 20 calls
- primary regression guard (1-call distance)
- env=3 narrows primary (4-call distance becomes secondary)
- env=8 expands primary (7-call distance stays primary)
Updated existing test "does NOT trigger when input_source is >5 calls
ago" — now requires >20 calls (secondary window catches 6-20).
Suite: 1644 → 1672 (+28 from new tests + extended scope). All green.
CLAUDE.md hooks table updated to document both windows and the env var.
Previously, `LLM_SECURITY_TRIFECTA_MODE=block` only exited 2 when the
detected trifecta was MCP-concentrated (all three legs via the same MCP
server) or involved sensitive-path + exfil. Distributed trifectas —
three legs originating from different tools, with a non-sensitive data
path and a non-sensitive exfiltration sink — were detected and warned
but not blocked. This mismatched the documented semantics of block mode
and gave operators a false sense of enforcement.
Change: remove the `(mcpInfo.concentrated || sensitiveExfil)` AND-gate
in the `TRIFECTA_MODE === 'block'` branch so any detected trifecta
blocks in block mode. Audit event `severity` still differentiates
critical (concentrated / sensitive-exfil) from high (distributed); the
blocked stderr message now explicitly names "Distributed trifecta:
three legs from different sources" when the confidence sub-signals
are absent.
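The post-change decision can be summarized in a small sketch. The exit
code 2, the critical/high split, and the "Distributed trifecta" wording
come from the text above; the function name, the input object shape,
and the generic message for the high-confidence case are assumptions.

```javascript
// Hypothetical decision helper mirroring the behavior after this change:
// in block mode every detected trifecta exits 2, and severity alone
// distinguishes high-confidence from distributed cases.
function decideTrifectaOutcome({ detected, mode, concentrated, sensitiveExfil }) {
  if (!detected) return { exitCode: 0, severity: null, message: null };
  const highConfidence = Boolean(concentrated || sensitiveExfil);
  return {
    exitCode: mode === 'block' ? 2 : 0,
    severity: highConfidence ? 'critical' : 'high',
    message: highConfidence
      ? 'Trifecta detected' // placeholder wording, not the real message
      : 'Distributed trifecta: three legs from different sources',
  };
}

// Before this change, the equivalent of the first call below would have
// warned without blocking.
const distributedInBlockMode = decideTrifectaOutcome({
  detected: true,
  mode: 'block',
  concentrated: false,
  sensitiveExfil: false,
});
```

Keeping severity separate from the exit code lets the audit trail still
rank concentrated and sensitive-exfil cases above distributed ones.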
Addresses critical review 2026-04-20 §2 B2 (HIGH) and §9 row 1
("enforces the Rule of Two").
Tests: 1 added (distributed trifecta in block mode now exits 2).
All 1495 tests pass.
The previous ENV regex `/[\\/]\.env\.[a-z]+$/` only matched a single
lowercase segment after `.env`. Multi-segment and mixed-case variants
such as `.env.production.local.backup`, `.env.stage-1.local`, and
`.env.CI.secret` slipped past the hook. Replaced with
`/[\\/]\.env(\.[A-Za-z0-9._-]+)*$/` which matches `.env` plus any
number of dot-separated alphanumeric/dot/hyphen/underscore segments.
`.envrc` (direnv config, no dot separator) is still allowed.
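The difference between the two patterns can be checked directly. Both
regexes below are copied from the text above; the sample paths and the
helper name are illustrative.

```javascript
const OLD_ENV_PATTERN = /[\\/]\.env\.[a-z]+$/;
const NEW_ENV_PATTERN = /[\\/]\.env(\.[A-Za-z0-9._-]+)*$/;

// Run each path through both patterns so the gaps are visible side by side.
function compareEnvPatterns(paths) {
  return paths.map((path) => ({
    path,
    oldMatch: OLD_ENV_PATTERN.test(path),
    newMatch: NEW_ENV_PATTERN.test(path),
  }));
}

const envComparison = compareEnvPatterns([
  '/app/.env.local',
  '/app/.env.production.local.backup',
  '/app/.env.stage-1.local',
  '/app/.env.CI.secret',
  '/app/.envrc',
]);
```

Only the first path matches the old pattern, while the new pattern
matches the first four and still leaves `.envrc` alone.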
Addresses critical review 2026-04-20 §2 B1 (HIGH).
Tests: 7 added (6 new multi-segment BLOCK cases + 1 .envrc ALLOW).
All 1494 tests pass.