# MCP Cumulative Drift (Rug-Pull) Walkthrough

> **WARNING: This is a demonstration fixture, NOT a real attack.**
> No live MCP server is contacted. The script feeds eight synthetic
> tool descriptions through `post-mcp-verify` and shows that the
> v7.3.0 cumulative-drift advisory fires when per-update detection
> would have stayed silent.

## What this demonstrates

**OWASP MCP05: Rug Pull.** A trusted MCP server is updated in a series of tiny edits. Each individual update stays under the 10% per-update Levenshtein threshold, so the original v7.x detection (added before E14) never trips. But after seven small edits the description has drifted >25% from the original baseline: the tool now reads "Search the local data store" instead of "Search the internal knowledge base ... for the agent to read".

`v7.3.0 (E14)` added a sticky **baseline** to the MCP description cache. Every tool call computes:

- `per_update = levenshtein(current, previous) / |previous|` (fires above 10%)
- `cumulative = levenshtein(current, baseline) / max(|current|, |baseline|)` (fires at 25%)

This walkthrough proves the cumulative leg catches the slow burn that the per-update leg misses.

## Drift profile

| Stage | Edit | per-update | cumulative | Advisory |
|-------|------|-----------:|-----------:|----------|
| 0 | baseline | n/a | 0.0% | seeded only |
| 1 | agent → user | 3.3% | 3.3% | none |
| 2 | ranked → scored | 3.3% | 6.6% | none |
| 3 | short → brief | 4.2% | 10.7% | none |
| 4 | documents → files | 5.8% | 16.5% | none |
| 5 | internal → local | 5.2% | 21.5% | none |
| 6 | base → store | 3.5% | 24.8% | none (just under threshold) |
| 7 | knowledge → data | 7.9% | **32.2%** | **mcp-cumulative-drift (MEDIUM)** |

The exact ratios are reproduced by `string-utils.levenshtein()`; see `expected-findings.md` for the testable contract.
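
The two ratios defined under "What this demonstrates" can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not the code in `mcp-description-cache.mjs`: the `levenshtein` and `driftRatios` functions here are hypothetical stand-ins, the comparison operators (`>` for per-update, `>=` for cumulative) are an assumption read off the "fires above 10%" and "fires at 25%" wording, and the toy strings are not the fixture's real descriptions.

```javascript
// Classic two-row dynamic-programming Levenshtein distance.
// (Stand-in for the project's own string-utils helper; assumed equivalent.)
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

const PER_UPDATE_THRESHOLD = 0.10; // "fires above 10%"
const CUMULATIVE_THRESHOLD = 0.25; // "fires at 25%" (>= is an assumption)

// Hypothetical helper: computes both drift legs for one description update.
// Assumes `previous` and `baseline` are non-empty strings.
function driftRatios(baseline, previous, current) {
  const perUpdate = levenshtein(current, previous) / previous.length;
  const cumulative =
    levenshtein(current, baseline) / Math.max(current.length, baseline.length);
  return {
    perUpdate,
    cumulative,
    perUpdateFires: perUpdate > PER_UPDATE_THRESHOLD,
    cumulativeFires: cumulative >= CUMULATIVE_THRESHOLD,
  };
}

// Toy 20-character strings: one character changes per step, five in total.
const result = driftRatios(
  'abcdefghijklmnopqrst', // baseline
  'XXXXefghijklmnopqrst', // previous (4 edits from baseline)
  'XXXXXfghijklmnopqrst', // current  (1 more edit)
);
console.log(result.perUpdateFires, result.cumulativeFires); // false true
```

In the toy run the last step changes only 1 of 20 characters (5%), so the per-update leg stays silent, while the total of 5 of 20 (25%) reaches the cumulative threshold. The table above shows the same pattern at stage 7.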

## How to run

```bash
cd plugins/llm-security
node examples/mcp-rug-pull/run-rug-pull.mjs

# Detailed: show stderr + final cache state
node examples/mcp-rug-pull/run-rug-pull.mjs --verbose
```

Expected: `8 pass, 0 fail`. Stage 7 produces a `SECURITY ADVISORY (post-mcp-verify)` containing `mcp-cumulative-drift` and the literal phrase `Slow-burn rug-pull may evade per-update detection`.

## Hooks / scanners involved

- **`hooks/scripts/post-mcp-verify.mjs`**: the only hook invoked. Calls into `scanners/lib/mcp-description-cache.mjs::checkDescriptionDrift()` for the actual drift math.
- **`scanners/lib/mcp-description-cache.mjs`**: the cache library. Stores `{ description, firstSeen, lastSeen, baseline, history }` per tool. The baseline survives the 7-day TTL purge.

## Cache isolation

`post-mcp-verify` honors the `LLM_SECURITY_MCP_CACHE_FILE` environment variable (added in v7.3.0 specifically for testing and demos). The script:

1. Creates a temporary directory with `mkdtempSync`, under the OS temp dir, using the prefix `llm-security-rugpull-`
2. Points the cache at a file inside that tempdir
3. Spawns each hook invocation with the env var set
4. Removes the entire tempdir in `finally{}` before exit

**Your real `~/.cache/llm-security/mcp-descriptions.json` is never touched.** This is the same pattern used by the unit tests in `tests/lib/mcp-description-cache.test.mjs`.

## Resetting baseline after a legitimate upgrade

Real MCP servers do upgrade their descriptions occasionally, and that is not always an attack. After confirming the upgrade is genuine, run:

```
/security mcp-baseline-reset                       # clear all baselines
/security mcp-baseline-reset --target mcp__foo     # clear one tool
/security mcp-baseline-reset --list                # see current baselines
```

The next call to `checkDescriptionDrift` after a clear re-seeds the baseline from whatever incoming description appears. `description`, `firstSeen`, `lastSeen`, and `history` are preserved for audit.
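
The reset semantics described above (drop only the baseline, keep the audit fields, re-seed on the next observation) can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the behavior of the real cache library: `clearBaseline` and `observeDescription` are hypothetical names, timestamps are assumed to be plain numbers, a cleared baseline is assumed to be represented by removing the key, and history updates are left out because the source does not describe what each history event contains.

```javascript
// Hypothetical: drop only the sticky baseline from a cache entry.
// description, firstSeen, lastSeen and history are kept for audit.
function clearBaseline(entry) {
  const { baseline, ...rest } = entry;
  return rest;
}

// Hypothetical: record an incoming description for one tool.
// Seeds the baseline on first sight, or re-seeds it if it was cleared;
// an existing baseline is left untouched (that is what makes it "sticky").
function observeDescription(entry, incoming, now) {
  if (!entry) {
    return {
      description: incoming,
      firstSeen: now,
      lastSeen: now,
      baseline: incoming,
      history: [],
    };
  }
  return {
    ...entry,
    description: incoming,
    lastSeen: now,
    baseline: entry.baseline ?? incoming,
  };
}

// Usage: seed, drift, reset after a confirmed upgrade, then re-seed.
const seeded = observeDescription(undefined, 'Search the docs', 1000);
const drifted = observeDescription(seeded, 'Search the files', 2000);
console.log(drifted.baseline); // Search the docs

const cleared = clearBaseline(drifted);
const reseeded = observeDescription(cleared, 'Search the files v2', 3000);
console.log(reseeded.baseline, reseeded.firstSeen); // Search the files v2 1000
```

The point of the model is that a reset moves the reference for the cumulative leg forward without discarding the record of when the tool was first seen.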

## OWASP / framework mapping

| Code | Framework | Why |
|------|-----------|-----|
| MCP05 | OWASP MCP Top 10 | Rug-pull / unauthorized tool description change |
| LLM03 | OWASP LLM Top 10 | Supply-chain: compromised MCP server delivers altered behavior |
| ASI04 | OWASP Agentic Top 10 | Untrusted-tool-influence on agent behavior |

## Limitations

- The walkthrough demonstrates only the `mcp-cumulative-drift` MEDIUM advisory. It does not exercise:
  - Per-update advisory firing (above 10% in one step): covered by the older v6.x test suite
  - Cache TTL purge (7 days): would require time mocking
  - History rolling cap (10 events FIFO): emerges naturally over use
- This is a description-only rug-pull. Behavior changes that don't show up in the description (e.g. the server returns different *content* while keeping its description) are detected by other layers (`post-session-guard` data flow tagging, `post-mcp-verify` content scanning of `tool_output`).

## See also

- `docs/security-hardening-guide.md` §6: calibration story for v7.3.0
- `commands/mcp-baseline-reset.md`: when and how to reset
- `tests/lib/mcp-description-cache.test.mjs`: unit-test contract
- `examples/lethal-trifecta-walkthrough/`: adjacent demonstration of another runtime hook
- `expected-findings.md` (in this folder): the testable contract