Kjell Tore Guttormsen 583a78c6cc feat(llm-security): add lethal-trifecta + mcp-rug-pull example contents [skip-docs] Companion to `8df5d5c`, which carried only the doc updates; the example directories themselves were left out of staging by mistake. This commit adds the actual example directories: - examples/lethal-trifecta-walkthrough/{README.md, run-trifecta.mjs, expected-findings.md} - examples/mcp-rug-pull/{README.md, run-rug-pull.mjs, expected-findings.md} Plus a plugin CLAUDE.md "Examples (runnable demonstrations)" section with a 4-row table covering malicious-skill-demo, prompt-injection-showcase, lethal-trifecta-walkthrough, and mcp-rug-pull, plus the state-isolation discipline notes. Marketplace root README unchanged since the plugin's outward coverage is unchanged ([skip-docs] covers the marketplace-level gate).		2026-05-05 14:45:39 +02:00
expected-findings.md	feat(llm-security): add lethal-trifecta + mcp-rug-pull example contents [skip-docs]	2026-05-05 14:45:39 +02:00
README.md	feat(llm-security): add lethal-trifecta + mcp-rug-pull example contents [skip-docs]	2026-05-05 14:45:39 +02:00
run-rug-pull.mjs	feat(llm-security): add lethal-trifecta + mcp-rug-pull example contents [skip-docs]	2026-05-05 14:45:39 +02:00

README.md

MCP Cumulative Drift (Rug-Pull) Walkthrough

WARNING: This is a demonstration fixture, NOT a real attack. No live MCP server is contacted. The script feeds eight synthetic tool descriptions through post-mcp-verify and shows that the v7.3.0 cumulative-drift advisory fires when per-update detection would have stayed silent.

What this demonstrates

OWASP MCP05 — Rug Pull. A trusted MCP server is updated in a series of tiny edits. Each individual update stays under the 10% per-update Levenshtein threshold, so the original v7.x detection (added before E14) never trips. But after seven small edits the description has drifted >25% from the original baseline — the tool now reads "Search the local data store" instead of "Search the internal knowledge base ... for the agent to read".

v7.3.0 (E14) added a sticky baseline to the MCP description cache. Every tool call computes:

per_update = levenshtein(current, previous) / |previous| — fires above 10%
cumulative = levenshtein(current, baseline) / max(|current|, |baseline|) — fires at 25%

This walkthrough proves the cumulative leg catches the slow burn that the per-update leg misses.
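The two ratios above can be sketched in a few lines of JavaScript. This is a minimal illustration and not the plugin's code: `levenshtein` here is a textbook implementation standing in for `string-utils.levenshtein()`, `driftRatios` is a hypothetical helper name, and the 40-character strings are synthetic rather than the fixture's tool descriptions.

```javascript
// Textbook two-row dynamic-programming Levenshtein distance
// (a stand-in for the plugin's string-utils.levenshtein()).
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

const PER_UPDATE_THRESHOLD = 0.10; // per-update leg fires above 10%
const CUMULATIVE_THRESHOLD = 0.25; // cumulative leg fires at 25%

// Hypothetical helper: computes both ratios exactly as the formulas state them.
function driftRatios(baseline, previous, current) {
  const perUpdate =
    previous.length === 0 ? 0 : levenshtein(current, previous) / previous.length;
  const longest = Math.max(current.length, baseline.length);
  const cumulative = longest === 0 ? 0 : levenshtein(current, baseline) / longest;
  return {
    perUpdate,
    cumulative,
    perUpdateFires: perUpdate > PER_UPDATE_THRESHOLD,
    cumulativeFires: cumulative >= CUMULATIVE_THRESHOLD,
  };
}

// Synthetic slow burn: each stage changes 2 of 40 characters (5% per update).
const baseline = 'a'.repeat(40);
let previous = baseline;
for (let stage = 1; stage <= 5; stage++) {
  const current = 'b'.repeat(2 * stage) + 'a'.repeat(40 - 2 * stage);
  const r = driftRatios(baseline, previous, current);
  console.log(stage, r.perUpdate.toFixed(3), r.cumulative.toFixed(3), r.cumulativeFires);
  previous = current;
}
```

In this synthetic run every stage stays at 5% per update, so the per-update leg never fires, while the cumulative ratio climbs by 5 points per stage and reaches the 25% threshold at stage 5.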

Drift profile

Stage	Edit	per-update	cumulative	Advisory
0	baseline	—	0.0%	seeded only
1	agent → user	3.3%	3.3%	none
2	ranked → scored	3.3%	6.6%	none
3	short → brief	4.2%	10.7%	none
4	documents → files	5.8%	16.5%	none
5	internal → local	5.2%	21.5%	none
6	base → store	3.5%	24.8%	none (just under threshold)
7	knowledge → data	7.9%	32.2%	mcp-cumulative-drift (MEDIUM)

The exact ratios are reproduced by string-utils.levenshtein() — see expected-findings.md for the testable contract.

How to run

cd plugins/llm-security
node examples/mcp-rug-pull/run-rug-pull.mjs

# Detailed: show stderr + final cache state
node examples/mcp-rug-pull/run-rug-pull.mjs --verbose

Expected: 8 pass, 0 fail. Stage 7 produces a SECURITY ADVISORY (post-mcp-verify) containing mcp-cumulative-drift and the literal phrase Slow-burn rug-pull may evade per-update detection.

Hooks / scanners involved

hooks/scripts/post-mcp-verify.mjs — the only hook invoked. Calls into scanners/lib/mcp-description-cache.mjs::checkDescriptionDrift() for the actual drift math.
scanners/lib/mcp-description-cache.mjs — the cache library. Stores { description, firstSeen, lastSeen, baseline, history } per tool. Baseline survives the 7-day TTL purge.

Cache isolation

post-mcp-verify honors the LLM_SECURITY_MCP_CACHE_FILE environment variable (added in v7.3.0 specifically for testing and demos). The script:

Creates a temporary directory with mkdtempSync, using the prefix 'llm-security-rugpull-' under the OS temp directory
Points the cache at a file inside that tempdir
Spawns each hook invocation with the env var set
Removes the entire tempdir in a finally block before exit

Your real ~/.cache/llm-security/mcp-descriptions.json is never touched. This is the same pattern used by the unit tests under tests/lib/mcp-description-cache.test.mjs.
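The isolation pattern can be sketched as follows. This is an illustration under stated assumptions, not the contents of run-rug-pull.mjs: `runIsolated` is a hypothetical name, the cache file name inside the tempdir is assumed, and a trivial child process stands in for the real hook, since the hook's stdin payload is not shown in this README.

```javascript
// Dynamic imports keep this sketch runnable as either an ES module or a CommonJS script.
async function runIsolated() {
  const { mkdtempSync, rmSync } = await import('node:fs');
  const { tmpdir } = await import('node:os');
  const { join } = await import('node:path');
  const { spawnSync } = await import('node:child_process');

  // 1. Fresh tempdir with the prefix named in the README.
  const dir = mkdtempSync(join(tmpdir(), 'llm-security-rugpull-'));
  // 2. Cache file inside it (file name is an assumption for this sketch).
  const cacheFile = join(dir, 'mcp-descriptions.json');
  try {
    // 3. Spawn with the env var set. The child below is a stand-in for the
    //    real hook: it only echoes the cache path it was handed.
    const child = spawnSync(
      process.execPath,
      ['-e', 'process.stdout.write(process.env.LLM_SECURITY_MCP_CACHE_FILE || "")'],
      { env: { ...process.env, LLM_SECURITY_MCP_CACHE_FILE: cacheFile }, encoding: 'utf8' },
    );
    return { dir, cacheFile, seenByChild: child.stdout, status: child.status };
  } finally {
    // 4. Remove the whole tempdir, even if the spawn above threw.
    rmSync(dir, { recursive: true, force: true });
  }
}

runIsolated().then((r) => console.log(r.seenByChild === r.cacheFile));
```

Putting the cleanup in `finally` means a failing stage cannot leave a stray cache file behind, and because the path is passed only through the child's environment, the parent process and the real cache location are never involved.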

Resetting baseline after a legitimate upgrade

Real MCP servers do upgrade their descriptions occasionally — that's not always an attack. After confirming the upgrade is genuine, run:

/security mcp-baseline-reset                    # clear all baselines
/security mcp-baseline-reset --target mcp__foo  # clear one tool
/security mcp-baseline-reset --list             # see current baselines

The next call to checkDescriptionDrift after a reset re-seeds the baseline from the incoming description. The description, firstSeen, lastSeen, and history fields are preserved for audit.
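The reset-then-re-seed behavior can be pictured with a small in-memory model. Everything here is an assumption made for illustration: `resetBaseline` and `seedIfMissing` are hypothetical names, the cache is modeled as a plain object keyed by tool name, and a cleared baseline is represented as `null`. The real command and `checkDescriptionDrift` may store and clear baselines differently.

```javascript
// Hypothetical model of a baseline reset: clears only the baseline field.
// With no target, every tool's baseline is cleared; with a target, just that tool.
function resetBaseline(cache, target) {
  const names = target ? [target] : Object.keys(cache);
  for (const name of names) {
    if (cache[name]) cache[name].baseline = null; // other fields stay for audit
  }
  return cache;
}

// Hypothetical re-seed step, standing in for what the next drift check does
// when it finds no baseline: adopt the incoming description as the new one.
function seedIfMissing(entry, incoming) {
  if (entry.baseline == null) entry.baseline = incoming;
  return entry;
}

// Entry shape follows the README: { description, firstSeen, lastSeen, baseline, history }.
const cache = {
  mcp__foo: {
    description: 'Search the local data store',
    firstSeen: 1,
    lastSeen: 2,
    baseline: 'Search the internal knowledge base',
    history: [{ at: 2, cumulative: 0.32 }],
  },
  mcp__bar: { description: 'List files', firstSeen: 1, lastSeen: 1, baseline: 'List files', history: [] },
};

resetBaseline(cache, 'mcp__foo');
seedIfMissing(cache.mcp__foo, cache.mcp__foo.description);
console.log(cache.mcp__foo.baseline, '|', cache.mcp__bar.baseline);
```

After the targeted reset, only `mcp__foo` picks up a new baseline. Its history entry and the untouched `mcp__bar` baseline remain, which is the audit-preserving behavior the paragraph above describes.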

OWASP / framework mapping

Code	Framework	Why
MCP05	OWASP MCP Top 10	Rug-pull / unauthorized tool description change
LLM03	OWASP LLM Top 10	Supply-chain — compromised MCP server delivers altered behavior
ASI04	OWASP Agentic Top 10	Untrusted-tool-influence on agent behavior

Limitations

The walkthrough demonstrates only the mcp-cumulative-drift MEDIUM advisory. It does not exercise:
- Per-update advisory firing (above 10% in one step) — covered by the older v6.x test suite
- Cache TTL purge (7 days) — would require time mocking
- History rolling cap (10 events FIFO) — emerges naturally over use
This is a description-only rug-pull. Behavior changes that don't show up in the description (e.g. the server returns different content while keeping its description) are detected by other layers (post-session-guard data flow tagging, post-mcp-verify content scanning of tool_output).