feat(post-mcp-verify): E14 part 2 — cumulative-drift MEDIUM advisory [skip-docs]

Wave C step C2: surface the cumulative-drift signal from
checkDescriptionDrift() (added in C1) as a separate MEDIUM advisory
with finding category mcp-cumulative-drift. Independent of the existing
per-update drift advisory — a slow-burn rug-pull that keeps each update
below the 10% per-update threshold but cumulatively drifts >=25% from
the sticky baseline now triggers the new advisory without ever crossing
the per-update bar.
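
The two signals can be sketched as follows. This is a minimal
illustration, not the code in mcp-description-cache.mjs: the names
`driftRatio` and `checkDrift`, the `state` shape, and the option names are
hypothetical. Only the 10% and 25% thresholds and the Levenshtein ratio
come from this change.

```javascript
// Hypothetical sketch; function and field names are illustrative assumptions.

// Classic single-row Levenshtein edit distance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // D[i-1][j]
      row[j] = Math.min(
        above + 1,                               // deletion
        row[j - 1] + 1,                          // insertion
        diag + (a[i - 1] === b[j - 1] ? 0 : 1),  // substitution or match
      );
      diag = above;
    }
  }
  return row[b.length];
}

// Edit distance normalized by the longer string; 0 for two empty strings.
function driftRatio(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 0 : levenshtein(a, b) / longest;
}

// Compares against both the previous description and the sticky baseline.
function checkDrift(state, current, { perUpdate = 0.10, cumulative = 0.25 } = {}) {
  if (state.baseline === undefined) {
    // First sighting seeds both slots and raises nothing.
    state.baseline = current;
    state.previous = current;
    return { perUpdateDrifted: false, cumulativeDrifted: false };
  }
  const result = {
    perUpdateDrifted: driftRatio(current, state.previous) > perUpdate,
    cumulativeDrifted: driftRatio(current, state.baseline) >= cumulative,
  };
  state.previous = current; // previous advances, baseline stays put
  return result;
}

// Slow-burn demo: each step rewrites 5 of 100 characters.
const state = {};
checkDrift(state, 'a'.repeat(100));
for (let k = 1; k <= 5; k++) {
  const next = 'b'.repeat(5 * k) + 'a'.repeat(100 - 5 * k);
  console.log(k, checkDrift(state, next));
}
```

In the demo every step is a 5% change against the previous description, so
the per-update flag stays false, while the fifth step reaches 25% against
the baseline and sets the cumulative flag.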

The advisory references /security mcp-baseline-reset (added in C3) so
the user knows how to acknowledge a legitimate MCP server upgrade.

CLAUDE.md updates:
- post-mcp-verify hooks-table row mentions per-update + cumulative drift
- mcp-description-cache lib bullet documents baseline schema, history,
  cumulative threshold policy key, and LLM_SECURITY_MCP_CACHE_FILE
  override.
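
A rough sketch of how a per-tool cache entry with a sticky baseline could
behave, assuming a hypothetical schema. The field names (`baseline`,
`description`, `lastSeen`, `history`) and the `cache` argument are
illustrative; the source only states that there is a baseline slot per
tool, a 10-event rolling history, a 7-day TTL, that the baseline survives
the TTL purge, and that `clearBaseline(tool?)` exists.

```javascript
// Hypothetical cache-entry shape; field names are assumptions.
const TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7-day TTL
const HISTORY_LIMIT = 10;               // 10-event rolling history

function recordDescription(cache, tool, description, now) {
  const entry = cache[tool] ?? { history: [] };
  entry.baseline ??= description; // seeded once, then sticky
  entry.description = description;
  entry.lastSeen = now;
  entry.history.push({ at: now, description });
  if (entry.history.length > HISTORY_LIMIT) entry.history.shift();
  cache[tool] = entry;
  return entry;
}

function purgeExpired(cache, now) {
  for (const entry of Object.values(cache)) {
    if (entry.lastSeen !== undefined && now - entry.lastSeen > TTL_MS) {
      // Drop stale per-update state but keep the sticky baseline.
      delete entry.description;
      delete entry.lastSeen;
      entry.history = [];
    }
  }
}

// With no tool argument, every baseline is cleared.
function clearBaseline(cache, tool) {
  const tools = tool === undefined ? Object.keys(cache) : [tool];
  for (const name of tools) {
    if (cache[name]) delete cache[name].baseline;
  }
}
```

Keeping the baseline outside the TTL purge is what lets drift accumulate
across more than one 7-day window; clearing it makes the next observed
description the new baseline.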

Tests: 2 new hook tests using LLM_SECURITY_MCP_CACHE_FILE for cache
isolation. Existing 68 still pass; total 70.
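
The isolation relies on the cache path being resolved from the
environment. A minimal sketch of that lookup, assuming a hypothetical
helper name and a home-directory fallback (the real resolution logic in
the lib may differ):

```javascript
// Hypothetical helper; the name and the HOME fallback are assumptions.
function resolveCachePath(env = process.env) {
  const override = env.LLM_SECURITY_MCP_CACHE_FILE;
  if (override) return override; // tests point this at a temp file
  const home = env.HOME ?? env.USERPROFILE ?? '';
  return `${home}/.cache/llm-security/mcp-descriptions.json`;
}
```

A test can then pass a per-test temp file through the environment and
never touch the user's real cache.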

Plugin README and root marketplace README updates land in C3 alongside
the new /security mcp-baseline-reset slash command (combined Wave-C
doc update per plan §"Wave C — Touch" list).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Kjell Tore Guttormsen 2026-04-30 16:40:52 +02:00
commit 427b68eca9
3 changed files with 109 additions and 4 deletions


@ -69,7 +69,7 @@ formula; resolution deferred to Batch B.
| `pre-bash-destructive.mjs` | PreToolUse | `Bash` | Block rm -rf, curl\|sh, fork bombs, eval. Bash evasion normalization (T1-T6 via `bash-normalize.mjs`: empty quotes, ${} expansion, backslash splitting, IFS, ANSI-C hex) — defense-in-depth against T1-T6; Claude Code 2.1.98+ covers this at the harness level |
| `pre-install-supply-chain.mjs` | PreToolUse | `Bash` | Block compromised packages across ALL ecosystems. Bash evasion normalization before gate matching |
| `pre-write-pathguard.mjs` | PreToolUse | `Write` | Block writes to .env, .ssh/, .aws/, credentials, settings |
| `post-mcp-verify.mjs` | PostToolUse | — (all) | Injection scan on ALL tool output (incl. MEDIUM patterns, HITL traps, sub-agent spawn, NL indirection, cognitive load, hybrid P2SQL/recursive/XSS). HTML content trap detection. Bash-specific: secrets/URLs/size. MCP: description drift detection (MCP05), per-tool volume tracking |
| `post-mcp-verify.mjs` | PostToolUse | — (all) | Injection scan on ALL tool output (incl. MEDIUM patterns, HITL traps, sub-agent spawn, NL indirection, cognitive load, hybrid P2SQL/recursive/XSS). HTML content trap detection. Bash-specific: secrets/URLs/size. MCP: per-update description drift (MCP05) AND cumulative drift vs sticky baseline (E14, v7.3.0) — slow-burn rug-pulls that stay under the per-update threshold but diverge >=25% from baseline emit MEDIUM `mcp-cumulative-drift` advisory. Per-tool volume tracking |
| `post-session-guard.mjs` | PostToolUse | — (all) | Runtime trifecta detection (Rule of Two). Sliding window (20 calls) + 100-call long-horizon. MCP-concentrated trifecta (same server = elevated severity). Sensitive path + exfil detection. Slow-burn trifecta (legs >50 calls apart = MEDIUM). Behavioral drift detection (Jensen-Shannon divergence). CaMeL-inspired data flow tagging (SHA-256 provenance tracking, output→input linking). Mode: `LLM_SECURITY_TRIFECTA_MODE=block\|warn\|off` (default: warn). Cumulative data volume tracking (100KB/500KB/1MB thresholds). Sub-agent delegation tracking (Task/Agent tools): escalation-after-input advisory when delegation occurs within `LLM_SECURITY_ESCALATION_WINDOW` calls (default 5) of untrusted input (DeepMind Agent Traps cat. 4); secondary 20-call MEDIUM advisory catches slow-burn variants outside the primary window (E17, v7.2.0) |
| `update-check.mjs` | UserPromptSubmit | — | Checks for newer versions (max 1x/24h, cached). Disable: `LLM_SECURITY_UPDATE_CHECK=off` |
@ -96,7 +96,7 @@ Post-clone: size check (100MB max), cleanup guarantee (temp dir + evidence file
With `--output-file`: full JSON to file, compact aggregate to stdout. `--baseline` diffs against stored baseline. `--save-baseline` saves results for future diffs. Baselines stored in `reports/baselines/<target-hash>.json`.
10 scanners: unicode, entropy, permission, dep-audit, taint, git-forensics, network, memory-poisoning, supply-chain-recheck, toxic-flow.
Lib: `mcp-description-cache.mjs` — caches MCP tool descriptions in `~/.cache/llm-security/mcp-descriptions.json`, detects drift via Levenshtein (>10% = alert), 7-day TTL. Used by `post-mcp-verify.mjs`.
Lib: `mcp-description-cache.mjs` — caches MCP tool descriptions in `~/.cache/llm-security/mcp-descriptions.json`, detects per-update drift via Levenshtein (>10% = alert), 7-day TTL. v7.3.0 (E14) adds a sticky baseline slot per tool plus a 10-event rolling history; cumulative drift = `levenshtein(current, baseline) / max(|current|,|baseline|)`. When ratio ≥ `mcp.cumulative_drift_threshold` (default 0.25), emits `mcp-cumulative-drift` advisory through `post-mcp-verify.mjs`. Baseline survives TTL purge so slow-burn drift is preserved across the 7-day window. `clearBaseline(tool?)` exposed for the `/security mcp-baseline-reset` command. `LLM_SECURITY_MCP_CACHE_FILE` env var overrides the cache path for testing.
Supply-chain-recheck (SCR) re-audits installed dependencies from lockfiles (package-lock.json, yarn.lock, requirements.txt, Pipfile.lock) against blocklists, OSV.dev batch API, and typosquat detection. Offline fallback available. Shared data module: `scanners/lib/supply-chain-data.mjs`.
Memory-poisoning (MEM) detects cognitive state poisoning in CLAUDE.md, memory files, and .claude/rules — injection patterns, shell commands, credential paths, permission expansion, suspicious URLs, encoded payloads.
Toxic-flow (TFA) is a post-processing correlator that runs LAST — detects "lethal trifecta" (untrusted input + sensitive data access + exfiltration sink) by correlating output from prior scanners.


@ -421,6 +421,11 @@ if (isHtmlSource && outputText.length >= MIN_INJECTION_SCAN_LENGTH) {
// =========================================================================
// MCP description drift detection (OWASP MCP05 — Rug Pull)
// Checks if the MCP tool's description has changed since first seen.
// Two signals:
// - per-update drift (>10% Levenshtein vs previous)
// - cumulative drift (>=25% Levenshtein vs sticky baseline) — catches
// slow-burn rug-pulls where each update stays under the per-update
// threshold but cumulatively diverges from the original. v7.3.0 / E14.
// Only relevant for MCP tools that provide a description in tool_input.
// =========================================================================
const isMcpTool = toolName?.startsWith('mcp__');
@ -438,6 +443,19 @@ if (isMcpTool && !isTrustedMcp) {
          ` A changed tool description may indicate the MCP server has been compromised.`
        );
      }
      // Cumulative-drift advisory (mcp-cumulative-drift, MEDIUM). Independent
      // of per-update drift — a slow-burn rug-pull triggers this without
      // ever crossing the per-update threshold.
      if (driftResult.cumulative && driftResult.cumulative.drifted) {
        const baselineDesc = driftResult.cumulative.baseline || '';
        advisories.push(
          `MCP tool cumulative description drift — MEDIUM (mcp-cumulative-drift, OWASP MCP05).\n` +
          ` ${driftResult.cumulative.detail}\n` +
          ` Baseline: "${baselineDesc.slice(0, 120)}${baselineDesc.length > 120 ? '...' : ''}"\n` +
          ` Current: "${description.slice(0, 120)}${description.length > 120 ? '...' : ''}"\n` +
          ` Reset the baseline after a legitimate MCP server upgrade with: /security mcp-baseline-reset`
        );
      }
    } catch { /* drift check is advisory, never block */ }
  }
}


@ -11,8 +11,10 @@
import { describe, it } from 'node:test';
import assert from 'node:assert/strict';
import { resolve } from 'node:path';
import { runHook } from './hook-helper.mjs';
import { resolve, join } from 'node:path';
import { mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { runHook, runHookWithEnv } from './hook-helper.mjs';
const SCRIPT = resolve(import.meta.dirname, '../../hooks/scripts/post-mcp-verify.mjs');
@ -399,6 +401,91 @@ describe('post-mcp-verify — MCP description drift detection', () => {
});
});
// ---------------------------------------------------------------------------
// MCP cumulative description drift (E14 / v7.3.0)
// Five sub-10% updates that cumulatively diverge >=25% from baseline.
// LLM_SECURITY_MCP_CACHE_FILE isolates the cache file so the test does not
// pollute the user's real ~/.cache/llm-security/mcp-descriptions.json.
// ---------------------------------------------------------------------------
describe('post-mcp-verify — MCP cumulative drift advisory (E14)', () => {
  it('emits MEDIUM mcp-cumulative-drift advisory after slow-burn drift', async () => {
    const dir = mkdtempSync(join(tmpdir(), 'mcp-cumdrift-test-'));
    const cacheFile = join(dir, 'mcp-descriptions.json');
    const env = { LLM_SECURITY_MCP_CACHE_FILE: cacheFile };
    const tool = 'mcp__creep__search';
    // Seed the baseline with a long description
    const v0 = 'Search the web for current information about technology and science topics from reliable sources.';
    let result = await runHookWithEnv(SCRIPT, {
      tool_name: tool,
      tool_input: { description: v0 },
      tool_output: 'A clean output line padded with extra characters so the injection scan threshold is met.',
    }, env);
    assert.equal(result.code, 0);
    assert.equal(parseAdvisory(result.stdout), null, 'first call seeds baseline, no advisory');
    // Five small mutations that each stay below 10% per-update drift
    const mutations = [
      'Search the web for current information about technology and science topics from trusted sources.',
      'Search the web for recent information about technology and science topics from trusted sources.',
      'Search the web for recent information about technology and science topics including trusted sources.',
      'Search the web for recent information about technology, science, and engineering topics including trusted sources.',
      'Search the web for recent information about technology, science, engineering, and medicine topics including trusted sources.',
    ];
    let lastResult = null;
    for (const m of mutations) {
      lastResult = await runHookWithEnv(SCRIPT, {
        tool_name: tool,
        tool_input: { description: m },
        tool_output: 'A clean output line padded with extra characters so the injection scan threshold is met.',
      }, env);
      assert.equal(lastResult.code, 0);
    }
    const adv = parseAdvisory(lastResult.stdout);
    assert.ok(adv, 'cumulative drift advisory emitted on the final mutation');
    assert.ok(
      adv.systemMessage.includes('mcp-cumulative-drift'),
      'advisory includes finding category mcp-cumulative-drift',
    );
    assert.ok(adv.systemMessage.includes('MEDIUM'), 'advisory severity is MEDIUM');
    assert.ok(adv.systemMessage.includes('MCP05'), 'advisory references OWASP MCP05');
    assert.ok(
      adv.systemMessage.includes('/security mcp-baseline-reset'),
      'advisory mentions reset command for legitimate upgrades',
    );
    rmSync(dir, { recursive: true, force: true });
  });
  it('no cumulative-drift advisory for stable descriptions across many calls', async () => {
    const dir = mkdtempSync(join(tmpdir(), 'mcp-cumdrift-stable-'));
    const cacheFile = join(dir, 'mcp-descriptions.json');
    const env = { LLM_SECURITY_MCP_CACHE_FILE: cacheFile };
    const tool = 'mcp__stable__t';
    const desc = 'A stable, descriptive tool for searching the public web.';
    for (let i = 0; i < 6; i++) {
      const result = await runHookWithEnv(SCRIPT, {
        tool_name: tool,
        tool_input: { description: desc },
        tool_output: 'Clean output line padded with extra characters so the injection scan threshold is met.',
      }, env);
      assert.equal(result.code, 0);
      const adv = parseAdvisory(result.stdout);
      // Either null (no advisory) or no cumulative-drift mention
      if (adv) {
        assert.ok(
          !adv.systemMessage.includes('mcp-cumulative-drift'),
          'no cumulative-drift advisory for stable description',
        );
      }
    }
    rmSync(dir, { recursive: true, force: true });
  });
});
// ---------------------------------------------------------------------------
// MCP per-tool volume tracking (NEW in v4.3.0)
// ---------------------------------------------------------------------------