Companion to 8df5d5c (which only carried the doc updates — the example
directories themselves were left out of staging by mistake). This
commit adds the actual example directories:
- examples/lethal-trifecta-walkthrough/{README.md, run-trifecta.mjs,
expected-findings.md}
- examples/mcp-rug-pull/{README.md, run-rug-pull.mjs,
expected-findings.md}
Plus plugin CLAUDE.md "Examples (runnable demonstrations)" section
with a 4-row table covering malicious-skill-demo, prompt-injection-
showcase, lethal-trifecta-walkthrough, and mcp-rug-pull plus the
state-isolation discipline notes.
Marketplace root README unchanged since plugin's outward coverage
is unchanged ([skip-docs] covers the marketplace-level gate).
125 lines
5.1 KiB
Markdown
# MCP Cumulative Drift (Rug-Pull) Walkthrough

> **WARNING: This is a demonstration fixture, NOT a real attack.**
> No live MCP server is contacted. The script feeds eight synthetic
> tool descriptions through `post-mcp-verify` and shows that the
> v7.3.0 cumulative-drift advisory fires when per-update detection
> would have stayed silent.

## What this demonstrates

**OWASP MCP05: Rug Pull.** A trusted MCP server is updated in a series
of tiny edits. Each individual update stays under the 10% per-update
Levenshtein threshold, so the original v7.x detection (added before
E14) never trips. But after seven small edits the description has
drifted more than 25% from the original baseline: the tool now reads
"Search the local data store" instead of "Search the internal knowledge
base ... for the agent to read".

`v7.3.0 (E14)` added a sticky **baseline** to the MCP description cache.
Every tool call computes:

- `per_update = levenshtein(current, previous) / |previous|` (fires above 10%)
- `cumulative = levenshtein(current, baseline) / max(|current|, |baseline|)` (fires at 25%)

This walkthrough shows that the cumulative leg catches the slow burn
that the per-update leg misses.

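The two ratios above can be sketched as follows. This is a minimal illustration, not the code in `mcp-description-cache.mjs`: the function names, the constants, and the reading of "fires at 25%" as `>=` are assumptions, and the sample strings are made up rather than taken from the fixture. It also assumes `previous` is non-empty.

```javascript
// Classic two-row dynamic-programming Levenshtein distance.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Thresholds as stated in this README (comparison operators are assumed).
const PER_UPDATE_THRESHOLD = 0.10;
const CUMULATIVE_THRESHOLD = 0.25;

// Hypothetical helper: computes both legs for one incoming description.
function driftRatios(current, previous, baseline) {
  const perUpdate = levenshtein(current, previous) / previous.length;
  const cumulative =
    levenshtein(current, baseline) / Math.max(current.length, baseline.length);
  return {
    perUpdate,
    cumulative,
    perUpdateFires: perUpdate > PER_UPDATE_THRESHOLD,
    cumulativeFires: cumulative >= CUMULATIVE_THRESHOLD,
  };
}

// Made-up 20-character strings: five single-character edits, one per update.
const baseline = 'a'.repeat(20);
const previous = 'b'.repeat(4) + 'a'.repeat(16);
const current = 'b'.repeat(5) + 'a'.repeat(15);

const result = driftRatios(current, previous, baseline);
console.log(result.perUpdateFires, result.cumulativeFires); // false true
```

The last edit changes only 5% of the previous text, so the per-update leg stays silent, while the accumulated distance from the baseline reaches 25%.
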
## Drift profile

| Stage | Edit | per-update | cumulative | Advisory |
|-------|------|-----------:|-----------:|----------|
| 0 | baseline | n/a | 0.0% | seeded only |
| 1 | agent → user | 3.3% | 3.3% | none |
| 2 | ranked → scored | 3.3% | 6.6% | none |
| 3 | short → brief | 4.2% | 10.7% | none |
| 4 | documents → files | 5.8% | 16.5% | none |
| 5 | internal → local | 5.2% | 21.5% | none |
| 6 | base → store | 3.5% | 24.8% | none (just under threshold) |
| 7 | knowledge → data | 7.9% | **32.2%** | **mcp-cumulative-drift (MEDIUM)** |

The exact ratios are reproduced by `string-utils.levenshtein()`; see
`expected-findings.md` for the testable contract.

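The drift profile can also be read mechanically. The sketch below copies the percentages from the table above and looks for the first stage at which each leg would fire. The numbers are from this README; the variable names and the `>` / `>=` comparisons are assumptions.

```javascript
// Percentages copied from the drift-profile table (stage 0 is the seeded baseline).
const stages = [
  { stage: 1, perUpdate: 3.3, cumulative: 3.3 },
  { stage: 2, perUpdate: 3.3, cumulative: 6.6 },
  { stage: 3, perUpdate: 4.2, cumulative: 10.7 },
  { stage: 4, perUpdate: 5.8, cumulative: 16.5 },
  { stage: 5, perUpdate: 5.2, cumulative: 21.5 },
  { stage: 6, perUpdate: 3.5, cumulative: 24.8 },
  { stage: 7, perUpdate: 7.9, cumulative: 32.2 },
];

// First stage where a single update exceeds 10% (none in this profile).
const firstPerUpdateHit = stages.find((s) => s.perUpdate > 10);
// First stage where drift from the baseline reaches 25%.
const firstCumulativeHit = stages.find((s) => s.cumulative >= 25);

console.log(firstPerUpdateHit); // undefined
console.log(firstCumulativeHit.stage); // 7
```
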
## How to run

```bash
cd plugins/llm-security
node examples/mcp-rug-pull/run-rug-pull.mjs

# Detailed: show stderr + final cache state
node examples/mcp-rug-pull/run-rug-pull.mjs --verbose
```

Expected: `8 pass, 0 fail`. Stage 7 produces a
`SECURITY ADVISORY (post-mcp-verify)` containing `mcp-cumulative-drift`
and the literal phrase `Slow-burn rug-pull may evade per-update detection`.

## Hooks / scanners involved

- **`hooks/scripts/post-mcp-verify.mjs`**: the only hook invoked.
  Calls into `scanners/lib/mcp-description-cache.mjs::checkDescriptionDrift()`
  for the actual drift math.
- **`scanners/lib/mcp-description-cache.mjs`**: the cache library.
  Stores `{ description, firstSeen, lastSeen, baseline, history }` per
  tool. Baseline survives the 7-day TTL purge.

## Cache isolation

`post-mcp-verify` honors the `LLM_SECURITY_MCP_CACHE_FILE` environment
variable (added in v7.3.0 specifically for testing and demos). The script:

1. Creates a temp directory with `mkdtempSync(join(tmpdir(), 'llm-security-rugpull-'))`
2. Points the cache at a file inside that tempdir
3. Spawns each hook invocation with the env var set
4. Removes the entire tempdir in a `finally` block before exit

**Your real `~/.cache/llm-security/mcp-descriptions.json` is never
touched.** This is the same pattern used by the unit tests under
`tests/lib/mcp-description-cache.test.mjs`.

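The four isolation steps can be sketched as follows. This is not the code of `run-rug-pull.mjs`: the helper name `withIsolatedCache`, the cache file name inside the temp directory, and the stand-in child process (which only echoes the environment variable instead of running the hook) are assumptions for illustration.

```javascript
import { existsSync, mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { spawnSync } from 'node:child_process';

// Hypothetical helper: runs `fn` with a throwaway cache file, then cleans up.
function withIsolatedCache(fn) {
  // Step 1: unique temp directory under the OS temp dir.
  const dir = mkdtempSync(join(tmpdir(), 'llm-security-rugpull-'));
  // Step 2: cache file inside that directory (file name is an assumption).
  const cacheFile = join(dir, 'mcp-descriptions.json');
  const env = { ...process.env, LLM_SECURITY_MCP_CACHE_FILE: cacheFile };
  try {
    return fn({ dir, cacheFile, env });
  } finally {
    // Step 4: remove the whole directory even if `fn` throws.
    rmSync(dir, { recursive: true, force: true });
  }
}

const result = withIsolatedCache(({ dir, cacheFile, env }) => {
  // Step 3: spawn a child with the env var set. A real run would invoke the
  // hook script here; this stand-in just prints what the child sees.
  const child = spawnSync(
    process.execPath,
    ['-e', 'console.log(process.env.LLM_SECURITY_MCP_CACHE_FILE)'],
    { env, encoding: 'utf8' },
  );
  return { dir, cacheFile, seenByChild: child.stdout.trim() };
});

const cleanedUp = !existsSync(result.dir);
console.log(result.seenByChild === result.cacheFile, cleanedUp); // true true
```

Putting the cleanup in `finally` means a failing stage still leaves no temp directory behind.
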
## Resetting baseline after a legitimate upgrade

Real MCP servers do upgrade their descriptions occasionally, and that is
not always an attack. After confirming the upgrade is genuine, run:

```
/security mcp-baseline-reset                    # clear all baselines
/security mcp-baseline-reset --target mcp__foo  # clear one tool
/security mcp-baseline-reset --list             # see current baselines
```

The next call to `checkDescriptionDrift` after a clear will re-seed
the baseline from whatever incoming description appears. `description`,
`firstSeen`, `lastSeen`, and `history` are preserved for audit.

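A rough model of the reset semantics described above, assuming a cache entry is a plain object and the baseline is stored as a string. `clearBaseline` and `observe` are hypothetical names, not functions exported by `mcp-description-cache.mjs`, and the sample values are made up.

```javascript
// Hypothetical entry shaped like { description, firstSeen, lastSeen, baseline, history }.
const entry = {
  description: 'old text',
  firstSeen: 1000,
  lastSeen: 2000,
  baseline: 'original text',
  history: [{ at: 2000, note: 'example event' }],
};

// Drop only the baseline; every other field is kept for audit.
function clearBaseline(e) {
  const { baseline, ...rest } = e;
  return rest;
}

// On the next observation, re-seed the baseline if none is present.
function observe(e, incoming, now) {
  return {
    ...e,
    description: incoming,
    lastSeen: now,
    baseline: e.baseline ?? incoming,
  };
}

const cleared = clearBaseline(entry);
const reseeded = observe(cleared, 'upgraded text', 3000);

console.log('baseline' in cleared); // false
console.log(reseeded.baseline); // upgraded text
console.log(reseeded.firstSeen, reseeded.history.length); // 1000 1
```
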
## OWASP / framework mapping

| Code | Framework | Why |
|------|-----------|-----|
| MCP05 | OWASP MCP Top 10 | Rug-pull / unauthorized tool description change |
| LLM03 | OWASP LLM Top 10 | Supply chain: a compromised MCP server delivers altered behavior |
| ASI04 | OWASP Agentic Top 10 | Untrusted tool influence on agent behavior |

## Limitations

- The walkthrough demonstrates only the `mcp-cumulative-drift` MEDIUM
  advisory. It does not exercise:
  - Per-update advisory firing (above 10% in one step), which is covered
    by the older v6.x test suite
  - Cache TTL purge (7 days), which would require time mocking
  - History rolling cap (10 events FIFO), which emerges naturally over use
- This is a description-only rug-pull. Behavior changes that don't show
  up in the description (e.g. the server returns different *content*
  while keeping its description) are detected by other layers
  (`post-session-guard` data flow tagging, `post-mcp-verify` content
  scanning of `tool_output`).

## See also

- `docs/security-hardening-guide.md` §6: calibration story for v7.3.0
- `commands/mcp-baseline-reset.md`: when and how to reset
- `tests/lib/mcp-description-cache.test.mjs`: unit-test contract
- `examples/lethal-trifecta-walkthrough/`: adjacent demonstration of
  another runtime hook
- `expected-findings.md` (in this folder): the testable contract