# MCP Cumulative Drift (Rug-Pull) Walkthrough

> **WARNING: This is a demonstration fixture, NOT a real attack.**
> No live MCP server is contacted. The script feeds eight synthetic
> tool descriptions through `post-mcp-verify` and shows that the
> v7.3.0 cumulative-drift advisory fires when per-update detection
> would have stayed silent.

## What this demonstrates

**OWASP MCP05 — Rug Pull.** A trusted MCP server is updated in a series
of tiny edits. Each individual update stays under the 10% per-update
Levenshtein threshold, so the original v7.x detection (added before
E14) never trips. But after seven small edits the description has
drifted >25% from the original baseline — the tool now reads "Search
the local data store" instead of "Search the internal knowledge base
... for the agent to read".

`v7.3.0 (E14)` added a sticky **baseline** to the MCP description cache.
Every tool call computes:

- `per_update = levenshtein(current, previous) / |previous|` — fires above 10%
- `cumulative = levenshtein(current, baseline) / max(|current|, |baseline|)` — fires at 25%

This walkthrough proves the cumulative leg catches the slow burn that
the per-update leg misses.
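
The two ratios above can be sketched as follows. This is a minimal illustration, not the code in `string-utils` or `mcp-description-cache.mjs`: the `levenshtein` here is a textbook two-row implementation, `driftRatios` is a hypothetical helper, and the comparison operators (`>` for 10%, `>=` for 25%) are assumptions read off the wording "above" and "at". The synthetic strings are made up for the example and are not the fixture's descriptions.

```javascript
// Textbook two-row dynamic-programming Levenshtein distance (illustrative only).
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

const PER_UPDATE_THRESHOLD = 0.10; // from the README
const CUMULATIVE_THRESHOLD = 0.25; // from the README

// Hypothetical helper; assumes `previous` is a non-empty string.
function driftRatios(baseline, previous, current) {
  const perUpdate = levenshtein(current, previous) / previous.length;
  const cumulative =
    levenshtein(current, baseline) / Math.max(current.length, baseline.length);
  return {
    perUpdate,
    cumulative,
    perUpdateFires: perUpdate > PER_UPDATE_THRESHOLD,    // assumed strict
    cumulativeFires: cumulative >= CUMULATIVE_THRESHOLD, // assumed inclusive
  };
}

// Synthetic slow burn: change one character per update in a 20-character string.
const baseline = 'a'.repeat(20);
let previous = baseline;
let last;
let anyPerUpdateFired = false;
for (let k = 1; k <= 6; k++) {
  const current = 'b'.repeat(k) + 'a'.repeat(20 - k);
  last = driftRatios(baseline, previous, current);
  anyPerUpdateFired = anyPerUpdateFired || last.perUpdateFires;
  previous = current;
}
console.log(anyPerUpdateFired, last.cumulativeFires); // false true
```

Each step moves only 5% away from its predecessor, so the per-update check never fires, while the sixth step sits 30% away from the baseline and trips the cumulative check.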

## Drift profile

| Stage | Edit | per-update | cumulative | Advisory |
|-------|------|-----------:|-----------:|----------|
| 0 | baseline | — | 0.0% | seeded only |
| 1 | agent → user | 3.3% | 3.3% | none |
| 2 | ranked → scored | 3.3% | 6.6% | none |
| 3 | short → brief | 4.2% | 10.7% | none |
| 4 | documents → files | 5.8% | 16.5% | none |
| 5 | internal → local | 5.2% | 21.5% | none |
| 6 | base → store | 3.5% | 24.8% | none (just under threshold) |
| 7 | knowledge → data | 7.9% | **32.2%** | **mcp-cumulative-drift (MEDIUM)** |

The exact ratios are reproduced by `string-utils.levenshtein()` — see
`expected-findings.md` for the testable contract.

## How to run

```bash
cd plugins/llm-security
node examples/mcp-rug-pull/run-rug-pull.mjs

# Detailed: show stderr + final cache state
node examples/mcp-rug-pull/run-rug-pull.mjs --verbose
```

Expected: `8 pass, 0 fail`. Stage 7 produces a `SECURITY ADVISORY
(post-mcp-verify)` containing `mcp-cumulative-drift` and the literal
phrase `Slow-burn rug-pull may evade per-update detection`.

## Hooks / scanners involved

- **`hooks/scripts/post-mcp-verify.mjs`** — the only hook invoked.
  Calls into `scanners/lib/mcp-description-cache.mjs::checkDescriptionDrift()`
  for the actual drift math.
- **`scanners/lib/mcp-description-cache.mjs`** — the cache library.
  Stores `{ description, firstSeen, lastSeen, baseline, history }` per
  tool. Baseline survives the 7-day TTL purge.

## Cache isolation

`post-mcp-verify` honors the `LLM_SECURITY_MCP_CACHE_FILE` environment
variable (added in v7.3.0 specifically for testing and demos). The script:

1. Creates a temp directory with `mkdtempSync(join(tmpdir(), 'llm-security-rugpull-'))`
2. Points the cache at a file inside that tempdir
3. Spawns each hook invocation with the env var set
4. Removes the entire tempdir in a `finally` block before exit

**Your real `~/.cache/llm-security/mcp-descriptions.json` is never
touched.** This is the same pattern used by the unit tests under
`tests/lib/mcp-description-cache.test.mjs`.
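
The isolation steps above follow a common temp-directory pattern, which can be sketched as below. This is not the code in `run-rug-pull.mjs`: `withIsolatedCache` and `spawnWithCache` are hypothetical helpers, the file name inside the tempdir is an assumption, and the child process is a stand-in that only echoes the environment variable instead of running the real hook.

```javascript
import { mkdtempSync, rmSync, existsSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { spawnSync } from 'node:child_process';

// Hypothetical helper: hands `fn` a throwaway cache path and always
// removes the temp directory afterwards, even if `fn` throws.
function withIsolatedCache(fn) {
  const dir = mkdtempSync(join(tmpdir(), 'llm-security-rugpull-'));
  const cacheFile = join(dir, 'mcp-descriptions.json'); // assumed file name
  try {
    return fn(cacheFile, dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

// Stand-in for one hook invocation: the child just prints the env var it
// received, which shows the cache path is passed through the environment.
function spawnWithCache(cacheFile) {
  const child = spawnSync(
    process.execPath,
    ['-e', 'process.stdout.write(process.env.LLM_SECURITY_MCP_CACHE_FILE)'],
    {
      env: { ...process.env, LLM_SECURITY_MCP_CACHE_FILE: cacheFile },
      encoding: 'utf8',
    },
  );
  return child.stdout;
}

let tempDir;
const seen = withIsolatedCache((cacheFile, dir) => {
  tempDir = dir;
  return spawnWithCache(cacheFile);
});
console.log('child saw cache path:', seen);
console.log('tempdir still exists:', existsSync(tempDir));
```

Setting the variable only on the child's `env` keeps the parent process environment unchanged, so nothing else run from the same shell is redirected to the throwaway cache.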

## Resetting baseline after a legitimate upgrade

Real MCP servers do upgrade their descriptions occasionally — that's
not always an attack. After confirming the upgrade is genuine, run:

```
/security mcp-baseline-reset                    # clear all baselines
/security mcp-baseline-reset --target mcp__foo  # clear one tool
/security mcp-baseline-reset --list             # see current baselines
```

After a clear, the next call to `checkDescriptionDrift` re-seeds the
baseline from the incoming description. `description`, `firstSeen`,
`lastSeen`, and `history` are preserved for audit.
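
The clear-then-re-seed behaviour can be pictured with a small sketch over the entry shape listed under "Hooks / scanners involved". Everything here is an assumption for illustration: `clearBaselines` and `reseedIfMissing` are hypothetical names, the baseline is modelled as a plain string, a cleared baseline is modelled as `null`, and `--target` is treated as an exact tool-name match. The real command and cache library may differ on all of these.

```javascript
// Hypothetical: clear the baseline for one tool, or for all tools when
// `target` is undefined. Every other field is left untouched.
function clearBaselines(cache, target) {
  const out = {};
  for (const [tool, entry] of Object.entries(cache)) {
    const shouldClear = target === undefined || tool === target;
    out[tool] = shouldClear ? { ...entry, baseline: null } : entry;
  }
  return out;
}

// Hypothetical: on the next drift check, a missing baseline is re-seeded
// from the incoming description.
function reseedIfMissing(entry, incomingDescription) {
  return entry.baseline == null
    ? { ...entry, baseline: incomingDescription }
    : entry;
}

// Made-up cache contents for the example.
const cache = {
  mcp__foo: { description: 'new text', firstSeen: 1, lastSeen: 2, baseline: 'old text', history: ['e1'] },
  mcp__bar: { description: 'other', firstSeen: 1, lastSeen: 2, baseline: 'other', history: [] },
};

const cleared = clearBaselines(cache, 'mcp__foo');
const reseeded = reseedIfMissing(cleared.mcp__foo, 'new text');
console.log(cleared.mcp__foo.baseline, cleared.mcp__bar.baseline, reseeded.baseline);
// null other new text
```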

## OWASP / framework mapping

| Code | Framework | Why |
|------|-----------|-----|
| MCP05 | OWASP MCP Top 10 | Rug-pull / unauthorized tool description change |
| LLM03 | OWASP LLM Top 10 | Supply-chain — compromised MCP server delivers altered behavior |
| ASI04 | OWASP Agentic Top 10 | Untrusted-tool-influence on agent behavior |

## Limitations

- The walkthrough demonstrates only the `mcp-cumulative-drift` MEDIUM
  advisory. It does not exercise:
  - Per-update advisory firing (above 10% in one step) — covered by the
    older v6.x test suite
  - Cache TTL purge (7 days) — would require time mocking
  - History rolling cap (10 events FIFO) — emerges naturally over use
- This is a description-only rug-pull. Behavior changes that don't show
  up in the description (e.g. the server returns different *content*
  while keeping its description) are detected by other layers
  (`post-session-guard` data flow tagging, `post-mcp-verify` content
  scanning of `tool_output`).

## See also

- `docs/security-hardening-guide.md` §6 — calibration story for v7.3.0
- `commands/mcp-baseline-reset.md` — when and how to reset
- `tests/lib/mcp-description-cache.test.mjs` — unit-test contract
- `examples/lethal-trifecta-walkthrough/` — adjacent demonstration of
  another runtime hook
- `expected-findings.md` (in this folder) — the testable contract