Kjell Tore Guttormsen 583a78c6cc feat(llm-security): add lethal-trifecta + mcp-rug-pull example contents [skip-docs]

Companion to 8df5d5c (which only carried the doc updates — the example
directories themselves were left out of staging by mistake). This
commit adds the actual example directories:

- examples/lethal-trifecta-walkthrough/{README.md, run-trifecta.mjs,
  expected-findings.md}
- examples/mcp-rug-pull/{README.md, run-rug-pull.mjs,
  expected-findings.md}

Plus plugin CLAUDE.md "Examples (runnable demonstrations)" section
with a 4-row table covering malicious-skill-demo, prompt-injection-
showcase, lethal-trifecta-walkthrough, and mcp-rug-pull plus the
state-isolation discipline notes.

Marketplace root README unchanged since plugin's outward coverage
is unchanged ([skip-docs] covers the marketplace-level gate).

2026-05-05 14:45:39 +02:00

2.2 KiB


Expected Findings — MCP Cumulative Drift Walkthrough

This is the testable contract. run-rug-pull.mjs exits 0 only when every row matches.

Per-stage contract

Stage	per-update advisory	cumulative advisory	OWASP
0	no	no (baseline seeded)	—
1	no	no	—
2	no	no	—
3	no	no	—
4	no	no	—
5	no	no	—
6	no	no (cum=24.8%, just under 25%)	—
7	no	YES	MCP05, LLM03

The hook output is JSON of the form {systemMessage: "..."}. The message contains the text "SECURITY ADVISORY (post-mcp-verify): Potential data leakage detected." followed by an enumerated advisory. At stage 7, the mcp-cumulative-drift advisory includes:

The literal phrase MCP tool cumulative description drift — MEDIUM
The OWASP tag (mcp-cumulative-drift, OWASP MCP05)
The phrase Slow-burn rug-pull may evade per-update detection
A baseline preview matching stage 0's text
A current preview matching stage 7's text
A pointer to /security mcp-baseline-reset
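
The advisory checklist above can be sketched as a substring check over the parsed hook output. This is an illustration, not the code in run-rug-pull.mjs: the function name missingAdvisoryPhrases is hypothetical, and the preview arguments are whatever prefix of the stage 0 and stage 7 texts the caller expects to see, since the preview length is not stated here.

```javascript
// Phrases quoted from the contract above. The helper itself is a
// hypothetical sketch, not the runner's actual assertion code.
const REQUIRED_PHRASES = [
  "SECURITY ADVISORY (post-mcp-verify)",
  // \u2014 is the dash character printed in the contract's literal phrase.
  "MCP tool cumulative description drift \u2014 MEDIUM",
  "mcp-cumulative-drift, OWASP MCP05",
  "Slow-burn rug-pull may evade per-update detection",
  "/security mcp-baseline-reset",
];

// Returns the expected phrases that are absent from the hook's systemMessage.
// An empty array means the stage 7 advisory satisfies the checklist.
function missingAdvisoryPhrases(hookStdout, baselinePreview, currentPreview) {
  const { systemMessage } = JSON.parse(hookStdout);
  if (typeof systemMessage !== "string") {
    throw new TypeError("hook output has no systemMessage string");
  }
  const expected = [...REQUIRED_PHRASES, baselinePreview, currentPreview];
  return expected.filter((phrase) => !systemMessage.includes(phrase));
}
```

Returning the list of missing phrases, rather than a boolean, makes a failing run say which part of the contract was not met.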

Drift math (verifiable)

These ratios are produced by scanners/lib/string-utils.mjs::levenshtein():

Stage	Levenshtein vs prev	Levenshtein vs baseline	per_update	cumulative
1	4	4	3.3%	3.3%
2	4	8	3.3%	6.6%
3	5	13	4.2%	10.7%
4	7	20	5.8%	16.5%
5	6	26	5.2%	21.5%
6	4	30	3.5%	24.8%
7	9	39	7.9%	32.2%

per_update threshold = 0.10 → never tripped (peak 7.9% at stage 7). cumulative threshold = 0.25 → tripped at stage 7 (32.2%).
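
To make the two thresholds concrete, here is a minimal sketch of how a per-update and a cumulative drift ratio could be derived from an edit distance. It is not the code in scanners/lib/string-utils.mjs. The levenshtein() below is a textbook implementation, and driftRatio() and classifyUpdate() are hypothetical helpers. Two points are assumptions: that the ratio divides by the longer string's length, and that a ratio exactly equal to a threshold counts as tripped (the table fits either reading).

```javascript
// Textbook two-row dynamic-programming Levenshtein distance.
// Sketch only; the project's own version lives in string-utils.mjs.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// ASSUMPTION: drift = distance / length of the longer string.
function driftRatio(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 0 : levenshtein(a, b) / longest;
}

const PER_UPDATE_THRESHOLD = 0.1; // from the contract above
const CUMULATIVE_THRESHOLD = 0.25; // from the contract above

// Hypothetical helper: compares the new description against both the
// previous description and the immutable baseline.
// ASSUMPTION: ">=" is used; the source does not say whether equality trips.
function classifyUpdate(baseline, previous, current) {
  const perUpdate = driftRatio(previous, current);
  const cumulative = driftRatio(baseline, current);
  return {
    perUpdate,
    cumulative,
    perUpdateAdvisory: perUpdate >= PER_UPDATE_THRESHOLD,
    cumulativeAdvisory: cumulative >= CUMULATIVE_THRESHOLD,
  };
}
```

Comparing against the baseline as well as the previous description is what catches the slow-burn case: each step stays under 10%, while the accumulated distance from stage 0 keeps growing until it crosses 25%.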

Cache state at end (verbose mode)

mcp__knowledge__search entry should contain:

baseline.description = stage 0 text (immutable since stage 0)
description = stage 7 text (last seen)
history.length = 7 (one entry per stage 1-7)
firstSeen and lastSeen set to run-time timestamps in milliseconds
No clearBaseline() was called, so baseline is still present
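
The end-state expectations above can be sketched as a small validator over the cache entry. The field names (baseline.description, description, history, firstSeen, lastSeen) come from the list; the function name checkCacheEntry and the choice to report problems as strings are illustrative assumptions, not the demo's actual verbose-mode output.

```javascript
// Hypothetical validator for the mcp__knowledge__search cache entry.
// Returns a list of human-readable problems; empty means the entry matches.
function checkCacheEntry(entry, stage0Text, stage7Text) {
  const problems = [];

  // The baseline must still exist (no clearBaseline() call) and be unchanged.
  if (!entry.baseline) {
    problems.push("baseline is missing");
  } else if (entry.baseline.description !== stage0Text) {
    problems.push("baseline.description is not the stage 0 text");
  }

  // The live description is the last one seen.
  if (entry.description !== stage7Text) {
    problems.push("description is not the stage 7 text");
  }

  // One history entry per update, stages 1 through 7.
  if (!Array.isArray(entry.history) || entry.history.length !== 7) {
    problems.push("history.length is not 7");
  }

  // Timestamps are expected to be numeric milliseconds.
  for (const key of ["firstSeen", "lastSeen"]) {
    if (!Number.isFinite(entry[key])) {
      problems.push(`${key} is not a finite number`);
    }
  }
  return problems;
}
```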

Side effects

The cache file is written to an mkdtemp directory provided via an environment variable
The cache directory is removed by a finally {} block on exit
No MCP audit-trail event (audit trail not configured for this demo)
No interaction with ~/.cache/llm-security/
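
The isolation pattern in the list above (a temporary cache directory handed over through an environment variable and removed in a finally block) might look roughly like the sketch below. The variable name EXAMPLE_MCP_CACHE_DIR, the file name mcp-cache.json, and the helper withIsolatedCache are placeholders, because the names the demo actually uses are not given here. The wrapper is also synchronous for brevity; a real runner that spawns hooks would probably be async.

```javascript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// HYPOTHETICAL name: the source does not say which env var the hook reads.
const CACHE_DIR_ENV = "EXAMPLE_MCP_CACHE_DIR";

// Runs `run(dir)` with a fresh temp directory exposed via the env var,
// then removes the directory even if `run` throws.
function withIsolatedCache(run) {
  const dir = mkdtempSync(join(tmpdir(), "mcp-rug-pull-"));
  const previous = process.env[CACHE_DIR_ENV];
  process.env[CACHE_DIR_ENV] = dir;
  try {
    return run(dir);
  } finally {
    // Restore the caller's environment before cleaning up.
    if (previous === undefined) delete process.env[CACHE_DIR_ENV];
    else process.env[CACHE_DIR_ENV] = previous;
    rmSync(dir, { recursive: true, force: true });
  }
}

// Usage: the cache file exists only inside the callback's lifetime.
const existedDuringRun = withIsolatedCache((dir) => {
  writeFileSync(join(dir, "mcp-cache.json"), "{}"); // hypothetical file name
  return existsSync(dir);
});
```

Because every write goes under the temporary directory, nothing under ~/.cache/llm-security/ is read or modified, which is the state-isolation property the list describes.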
